Genetic algorithm (GA) to select the optimal n values of a vector

Question

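I have to choose 10 elements of a vector to maximise a function. Since the vector is quite long, there are too many possibilities (about 1000 choose 10) to evaluate them all. So I started looking into the GA package to use a genetic algorithm.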
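For scale, the size of that search space can be checked in base R. This is only a quick illustration: the population size and iteration count below are hypothetical settings chosen for comparison, not a recommendation.

```r
# Number of ways to pick 10 elements out of 1000, ignoring order
n_subsets <- choose(1000, 10)
n_subsets  # on the order of 1e23

# Upper bound on fitness evaluations for a GA run with the
# hypothetical settings popSize = 100 and maxiter = 10000
ga_evaluations <- 100 * 10000

# Fraction of the search space such a run could visit at most
ga_evaluations / n_subsets
```

Exhaustive enumeration is clearly out of reach, and a heuristic search only ever looks at a vanishing fraction of the candidates.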

I came up with this MWE:

library(GA)

values <- 1:1000


# Fitness function which I want to maximise
f <- function(x){
  # Choose values
  y <- values[x]
  
  # From the first 10 sum up the odd values. 
  y <- ifelse(y %% 2 != 0, y, 0) 
  y <- y[1:10]
  return(sum(y))
}

# Maximum value of f for this example
y <- ifelse(values %% 2 != 0, values, 0) 
sum(sort(y, decreasing = TRUE)[1:10])
# [1] 9900 

# Genetic algorithm
GA <- ga(type = "permutation", fitness = f, lower = rep(1, 10), upper = rep(1000, 10), maxiter = 100)
summary(GA)

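The results are a bit underwhelming. From summary(GA), I get the impression that the algorithm always permutes all 1000 values (the solution runs from x1 to x1000), which makes the optimisation inefficient. How can I tell the algorithm that it should only use 10 values (so that the solution is x1 .. x10)?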

Answer 1

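You should read https://www.jstatsoft.org/article/view/v053i04. You don't have a permutation problem but a selection problem, so you should use a binary-type genetic algorithm. Because you want to select exactly 10 elements (10 ones and 990 zeros), you should probably write your own genetic operators: this is a constraint that the default operators will almost never satisfy, even if the fitness function returns -Inf whenever more than 10 ones are selected. One approach: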

Population (k is the number of elements you want to select):

myInit <- function(k){
  
  function(GA){
    m <- matrix(0, ncol = GA@nBits, nrow = GA@popSize)
    
    for(i in seq_len(GA@popSize))
      m[i, sample(GA@nBits, k)] <- 1 
  
    m
  }
}
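A quick sanity check for any population matrix built this way is to confirm that every row contains exactly k ones. The helper below, has_k_ones, is a hypothetical name introduced only for this sketch, and the small matrix is a stand-in built with base R rather than a real GA population.

```r
# Hypothetical helper: TRUE if m is a 0/1 matrix with exactly k ones per row
has_k_ones <- function(m, k) {
  all(m %in% c(0, 1)) && all(rowSums(m) == k)
}

# Stand-in population: 5 individuals, 20 bits, 3 ones each
set.seed(42)
m <- matrix(0, nrow = 5, ncol = 20)
for (i in seq_len(nrow(m)))
  m[i, sample(ncol(m), 3)] <- 1

has_k_ones(m, 3)  # TRUE
```

The same check can be applied to the children returned by custom crossover and mutation operators while debugging them.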

Crossover

myCrossover <- function(GA, parents){
  
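  # Turn the two parent rows into index vectors of their selected positions
  # (a 2 x k matrix, assuming both parents have exactly k ones)
  parents <- t(apply(GA@population[parents, ], 1, function(x) which(x == 1)))

  # Indices that the other parent has and this parent lacks
  parents_diff <- vector("list", 2)
  parents_diff[[1]] <- setdiff(parents[2,], parents[1,])
  parents_diff[[2]] <- setdiff(parents[1,], parents[2,])
  
  children_ind <- vector("list", 2)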
  for(i in 1:2){
    k <- length(parents_diff[[i]])
    change_k <- sample(k, sample(ceiling(k/2), 1))
    children_ind[[i]] <- if(length(change_k) > 0){
      c(parents[i, -change_k], parents_diff[[i]][change_k])
    } else {
      parents[i,]
    }
  }
  
  children <- matrix(0, nrow = 2, ncol = GA@nBits)
  for(i in 1:2)
    children[i, children_ind[[i]]] <- 1
  
  list(children = children, fitness = c(NA, NA))
}
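The idea behind this crossover is easier to see on plain index vectors. The sketch below is a simplified variant, not the operator above: swap_indices is a hypothetical helper that exchanges a random number of non-shared indices and assumes both parents select the same number of elements.

```r
# Hypothetical helper: build one child by swapping non-shared indices
swap_indices <- function(own, other) {
  only_own   <- setdiff(own, other)   # indices only this parent has
  only_other <- setdiff(other, own)   # indices only the other parent has
  if (length(only_other) == 0) return(sort(own))  # identical parents

  # Swap between 1 and all of the non-shared indices
  n_swap <- sample.int(length(only_other), 1)
  drop <- only_own[sample.int(length(only_own), n_swap)]
  take <- only_other[sample.int(length(only_other), n_swap)]
  sort(c(setdiff(own, drop), take))
}

set.seed(123)
p1 <- c(1, 2, 3, 4)
p2 <- c(3, 4, 5, 6)
child <- swap_indices(p1, p2)
child
length(child) == length(p1)  # TRUE: the number of selected indices is preserved
```

Because every dropped index is replaced by one the parent did not already have, the child always selects the same number of elements as its parents, and indices shared by both parents are kept.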

Mutation

myMutation <- function(GA, parent){
  
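  # Indices currently selected in the parent
  ind <- which(GA@population[parent,] == 1) 
  # Swap 1 to 3 selected indices for currently unselected ones,
  # so the number of ones stays the same
  n_change <- sample(3, 1)
  ind[sample(length(ind), n_change)] <- sample(setdiff(seq_len(GA@nBits), ind), n_change)
  parent <- integer(GA@nBits)
  parent[ind] <- 1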
  
  parent
}

Fitness (your function adapted for a binary GA):

f <- function(x, values){
  
  ind <- which(x == 1)
  y <- values[ind]
  y <- ifelse(y %% 2 != 0, y, 0) 
  y <- y[1:10]
  return(sum(y))
}
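Before running the GA, it can help to confirm that a binary fitness of this kind reaches the known maximum on the known best selection. The sketch below uses fitness_binary, a compact restatement written for this check (equivalent to the function above only when x contains exactly 10 ones), together with the values vector from the question.

```r
values <- 1:1000

# Compact restatement of the binary fitness (assumes exactly 10 ones in x)
fitness_binary <- function(x, values) {
  y <- values[x == 1]
  sum(y[y %% 2 != 0])  # sum of the odd selected values
}

# Indicator vector selecting the 10 largest odd values
best <- integer(length(values))
odd_idx <- which(values %% 2 != 0)
best[tail(odd_idx, 10)] <- 1L

fitness_binary(best, values)  # 9900, the maximum stated in the question
```

Any solution the GA returns can then be compared against this value to judge how close the run came to the optimum.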

GA:

library(GA)

GA <- ga(
  type = "binary", 
  fitness = f, 
  values = values,
  nBits = length(values),
  population = myInit(10),
  crossover = myCrossover,
  mutation = myMutation,
  run = 300,
  pmutation = 0.3,
  maxiter = 10000,
  popSize = 100
)

Selected values

values[which(GA@solution[1,] == 1)]
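As a rough illustration of why custom operators are worth the effort here, the sketch below applies a generic single-point crossover to random parents with exactly k ones and counts how often the child still has k ones. The crossover is a textbook version written by hand in base R, not the GA package's own code, and all names in it are hypothetical.

```r
set.seed(1)
n_bits <- 1000
k <- 10

# Random individual with exactly k ones
random_individual <- function() {
  x <- integer(n_bits)
  x[sample.int(n_bits, k)] <- 1L
  x
}

# Plain single-point crossover: head of p1 joined to the tail of p2
single_point_child <- function(p1, p2) {
  cut <- sample.int(n_bits - 1, 1)
  c(p1[1:cut], p2[(cut + 1):n_bits])
}

# Share of children that still have exactly k ones
keeps_k <- replicate(1000, {
  child <- single_point_child(random_individual(), random_individual())
  sum(child) == k
})
mean(keeps_k)
```

Many of the children end up with more or fewer than k ones, so a constraint-unaware operator spends much of its effort producing infeasible candidates.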

r

genetic-algorithm