R 中自定义函数的 运行 遗传算法错误
Error in in running genetic algorithm for a custom function in R
目标
我想估算智能驾驶员跟车模型 (IDM) 的 'best' 参数。 'best' 指的是那些在观察速度和预测速度之间产生最小均方根误差的参数。下面显示了一个可重现的示例,我成功地使用网格搜索找到了最佳参数,但我在 运行 宁遗传算法中没有成功。
IDM 功能
R 中的以下 IDM 函数接受 6 个参数并输出 3 列的数据帧,加速度 a_i
、速度 v_i
和距离 g_x_i
:
calculate_IDM <- function(A_i,
v_0,
g_0,
g_t_i,
b_i,
small_delta){
####################
## Allocate Vectors
####################
# acceleration rate
a_i <- rep(NA_real_, time_length)
# speed
v_i <- rep(NA_real_, time_length)
# position
x_i <- rep(NA_real_, time_length)
# spacing
g_x_i <- rep(NA_real_, time_length)
# speed difference
delta_v_i <- rep(NA_real_, time_length)
# desired spacing
g_star_i <- rep(NA_real_, time_length)
##########################################
## Initial values for Following vehicle
##########################################
# speed
v_i[1] <- v_i_first
# position
x_i[1] <- x_i_first
# spacing
g_x_i[1] <- xn1_first - x_i_first
# speed difference
delta_v_i[1] <- v_i_first - vn1_first
# desired spacing
g_star_i[1] <- g_0 + max(0, (v_i[1] * g_t_i) + ((v_i[1] * delta_v_i[1]) / (2 * sqrt((A_i * b_i)))))
# acceleration rate
a_i[1] <- A_i * (1 - ((v_i[1] / v_0)^small_delta) - ((g_star_i[1] / g_x_i[1])^2))
# a_i[1] <- ifelse(is.nan(a_i[1]), A_i, a_i[1])
# speed
v_i[2] <- v_i[1] + (a_i[1] * time_frame)
### if the speed is negative, make it zero
v_i[2] <- ifelse(v_i[2] < 0, 0, v_i[2])
# position
x_i[2] <- x_i[1] + (v_i[1] * time_frame) + (0.5 * a_i[1] * (time_frame)^2)
# spacing
g_x_i[2] <- xn1_complete[2] - x_i[2]
# speed difference
delta_v_i[2] <- v_i[2] - vn1_complete[2]
####################
## IDM Calculations
####################
for (t in 2:(time_length-1)) {
# desired spacing
g_star_i[t] <- g_0 + max(0, (v_i[t] * g_t_i) + ((v_i[t] * delta_v_i[t]) / (2 * sqrt((A_i * b_i)))))
# acceleration rate
a_i[t] <- A_i * (1 - ((v_i[t] / v_0)^small_delta) - ((g_star_i[t] / g_x_i[t])^2))
# a_i[t] <- ifelse(is.nan(a_i[t]), A_i, a_i[t])
# speed
v_i[t+1] <- v_i[t] + (a_i[t] * time_frame)
### if the speed is negative, make it zero
v_i[t+1] <- ifelse(v_i[t+1] < 0, 0, v_i[t+1])
# position
x_i[t+1] <- x_i[t] + (v_i[t] * time_frame) + (0.5 * a_i[t] * (time_frame)^2)
# spacing
g_x_i[t+1] <- xn1_complete[t+1] - x_i[t+1]
# speed difference
delta_v_i[t+1] <- v_i[t+1] - vn1_complete[t+1]
}
data.frame(a_i, v_i, g_x_i)
}
运行一组参数的函数
要运行上述函数,需要前车速度和时间向量:
# Time
last_time <- 300 ## s
time_frame <- 0.1 ## s
Time <- seq(from = 0, to = last_time, by = time_frame)
time_length <- length(Time)
v_i_first <- 0
x_i_first <- 5
## Lead vehicle
vn1_first <- 0 ## first speed m/s
xn1_first <- 100 ## position of lead vehicle front center m
bn1_complete <- c(rep(x = 4, length.out = time_length-2951),
rep(x = 0, length.out = time_length-50)) ## acceleration rate
#############################################
### Complete speed trajectory of Lead vehicle
#############################################
vn1_complete <- rep(NA_real_, time_length) ### an empty vector
xn1_complete <- rep(NA_real_, time_length) ### an empty vector
vn1_complete[1] <- vn1_first
xn1_complete[1] <- xn1_first
for (t in 2:time_length) {
### Lead vehicle calculations
vn1_complete[t] <- vn1_complete[t-1] + (bn1_complete[t-1] * time_frame)
xn1_complete[t] <- xn1_complete[t-1] + (vn1_complete[t-1] * time_frame) + (0.5 * bn1_complete[t-1] * (time_frame)^2)
}
现在,我可以应用函数了:
## one example
dff <- calculate_IDM(4, 30, 6.5, 1, 4, 2)
head(dff)
a_i v_i g_x_i
1 3.981274 0.0000000 95.00000
2 3.978206 0.3981274 95.00009
3 3.973594 0.7959480 95.00039
4 3.967446 1.1933075 95.00093
5 3.959771 1.5900521 95.00176
6 3.950581 1.9860292 95.00296
使用网格搜索找到最佳参数:
观察到的速度和参数列表如下:
library(tidyverse)
obs_data <- tibble(
obs_v_i = dff$v_i
)
# Parameters list
parameters_grid <- list(
A_i = c(2, 4),
v_0 = c(27, 30),
g_0 = c(6.5, 7),
g_t_i = c(1, 3),
b_i = c(4, 5),
small_delta = c(2, 3)
) %>%
expand.grid() %>%
as_tibble()
适应度函数和2个例子如下:
# Fitness function
fitness_func <- function(obs_data,
A_i,
v_0,
g_0,
g_t_i,
b_i,
small_delta) {
df <- cbind(obs_data, calculate_IDM(A_i,
v_0,
g_0,
g_t_i,
b_i,
small_delta))
sqrt(sum((df$obs_v_i - df$v_i)^2)/nrow(df))
}
> fitness_func(obs_data, 4, 30, 6.5, 1, 4, 2)
[1] 0
> fitness_func(obs_data, 2, 27, 7, 3, 5, 3)
[1] 1.406937
现在我可以使用 dplyr
中的 rowwise()
函数进行网格搜索:
parameters_grid %>%
rowwise() %>%
mutate(RMSE = fitness_func(obs_data,
A_i,
v_0,
g_0,
g_t_i,
b_i,
small_delta))
# A tibble: 64 x 7
# Rowwise:
A_i v_0 g_0 g_t_i b_i small_delta RMSE
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2 27 6.5 1 4 2 1.68
2 4 27 6.5 1 4 2 0.213
3 2 30 6.5 1 4 2 1.65
4 4 30 6.5 1 4 2 0
5 2 27 7 1 4 2 1.68
6 4 27 7 1 4 2 0.218
7 2 30 7 1 4 2 1.65
8 4 30 7 1 4 2 0.00794
9 2 27 6.5 3 4 2 1.57
10 4 27 6.5 3 4 2 0.814
# ... with 54 more rows
遗传算法错误
你可以想象,如果参数列表越大,计算时间就会显着增加。所以,我想运行遗传算法。使用示例 here,我尝试使用 GA
库来估计参数但出现错误:
library(GA)
GA <- ga(type = "real-valued",
fitness = -fitness_func(obs_data,
A_i,
v_0,
g_0,
g_t_i,
b_i,
small_delta),
lower = c(2, 27, 6.5, 1, 4, 2), upper = c(4, 30, 7, 3, 5, 3),
popSize = 5, maxiter = 10, run = 10)
Error in calculate_IDM(A_i, v_0, g_0, g_t_i, b_i, small_delta) :
object 'g_0' not found
请让我知道我做错了什么。
?ga
中记录的 fitness
是
the fitness function, any allowable R function which takes as input an individual string representing a potential solution, and returns a numerical value describing its “fitness”.
所以,我们可以将它包装成一个带有两个参数的函数,然后使用 fitness_func
个参数作为 x[1]
、x[2]
、...、x[6]
将与 lower
和 upper
绑定值的长度相同。这里,我们也可以单独传data
library(GA)
GA <- ga(type = "real-valued",
fitness = function(dat, x) {-fitness_func(dat,
x[1],
x[2],
x[3],
x[4],
x[5],
x[6])},
dat = obs_data,
lower = c(2, 27, 6.5, 1, 4, 2), upper = c(4, 30, 7, 3, 5, 3),
popSize = 5, maxiter = 1000, run = 100)
#GA | iter = 1 | Mean = -0.5668704 | Best = -0.3523867
#GA | iter = 2 | Mean = -0.3762976 | Best = -0.3523867
#GA | iter = 3 | Mean = -0.3529940 | Best = -0.3523867
#GA | iter = 4 | Mean = -0.3523867 | Best = -0.3523867
#GA | iter = 5 | Mean = -0.3523867 | Best = -0.3523867
#GA | iter = 6 | Mean = -0.3523867 | Best = -0.3523867
#GA | iter = 7 | Mean = -0.3523867 | Best = -0.3523867
#GA | iter = 8 | Mean = -0.3640060 | Best = -0.3523867
#...
#GA | iter = 519 | Mean = -0.08506463 | Best = -0.08505393
#GA | iter = 520 | Mean = -0.08505440 | Best = -0.08505393
#GA | iter = 521 | Mean = -0.14507196 | Best = -0.08505393
#GA | iter = 522 | Mean = -0.08505393 | Best = -0.08505393
#GA | iter = 523 | Mean = -0.08505393 | Best = -0.08505393
#GA | iter = 524 | Mean = -0.11238973 | Best = -0.08505393
#GA | iter = 525 | Mean = -0.31888465 | Best = -0.08505393
#GA | iter = 526 | Mean = -0.09641056 | Best = -0.08505393
虽然最后有一个警告
目标
我想估算智能驾驶员跟车模型 (IDM) 的 'best' 参数。 'best' 指的是那些在观察速度和预测速度之间产生最小均方根误差的参数。下面显示了一个可重现的示例,我成功地使用网格搜索找到了最佳参数,但我在 运行 宁遗传算法中没有成功。
IDM 功能
R 中的以下 IDM 函数接受 6 个参数并输出 3 列的数据帧,加速度 a_i
、速度 v_i
和距离 g_x_i
:
calculate_IDM <- function(A_i,
v_0,
g_0,
g_t_i,
b_i,
small_delta){
####################
## Allocate Vectors
####################
# acceleration rate
a_i <- rep(NA_real_, time_length)
# speed
v_i <- rep(NA_real_, time_length)
# position
x_i <- rep(NA_real_, time_length)
# spacing
g_x_i <- rep(NA_real_, time_length)
# speed difference
delta_v_i <- rep(NA_real_, time_length)
# desired spacing
g_star_i <- rep(NA_real_, time_length)
##########################################
## Initial values for Following vehicle
##########################################
# speed
v_i[1] <- v_i_first
# position
x_i[1] <- x_i_first
# spacing
g_x_i[1] <- xn1_first - x_i_first
# speed difference
delta_v_i[1] <- v_i_first - vn1_first
# desired spacing
g_star_i[1] <- g_0 + max(0, (v_i[1] * g_t_i) + ((v_i[1] * delta_v_i[1]) / (2 * sqrt((A_i * b_i)))))
# acceleration rate
a_i[1] <- A_i * (1 - ((v_i[1] / v_0)^small_delta) - ((g_star_i[1] / g_x_i[1])^2))
# a_i[1] <- ifelse(is.nan(a_i[1]), A_i, a_i[1])
# speed
v_i[2] <- v_i[1] + (a_i[1] * time_frame)
### if the speed is negative, make it zero
v_i[2] <- ifelse(v_i[2] < 0, 0, v_i[2])
# position
x_i[2] <- x_i[1] + (v_i[1] * time_frame) + (0.5 * a_i[1] * (time_frame)^2)
# spacing
g_x_i[2] <- xn1_complete[2] - x_i[2]
# speed difference
delta_v_i[2] <- v_i[2] - vn1_complete[2]
####################
## IDM Calculations
####################
for (t in 2:(time_length-1)) {
# desired spacing
g_star_i[t] <- g_0 + max(0, (v_i[t] * g_t_i) + ((v_i[t] * delta_v_i[t]) / (2 * sqrt((A_i * b_i)))))
# acceleration rate
a_i[t] <- A_i * (1 - ((v_i[t] / v_0)^small_delta) - ((g_star_i[t] / g_x_i[t])^2))
# a_i[t] <- ifelse(is.nan(a_i[t]), A_i, a_i[t])
# speed
v_i[t+1] <- v_i[t] + (a_i[t] * time_frame)
### if the speed is negative, make it zero
v_i[t+1] <- ifelse(v_i[t+1] < 0, 0, v_i[t+1])
# position
x_i[t+1] <- x_i[t] + (v_i[t] * time_frame) + (0.5 * a_i[t] * (time_frame)^2)
# spacing
g_x_i[t+1] <- xn1_complete[t+1] - x_i[t+1]
# speed difference
delta_v_i[t+1] <- v_i[t+1] - vn1_complete[t+1]
}
data.frame(a_i, v_i, g_x_i)
}
运行一组参数的函数
要运行上述函数,需要前车速度和时间向量:
# Time
last_time <- 300 ## s
time_frame <- 0.1 ## s
Time <- seq(from = 0, to = last_time, by = time_frame)
time_length <- length(Time)
v_i_first <- 0
x_i_first <- 5
## Lead vehicle
vn1_first <- 0 ## first speed m/s
xn1_first <- 100 ## position of lead vehicle front center m
bn1_complete <- c(rep(x = 4, length.out = time_length-2951),
rep(x = 0, length.out = time_length-50)) ## acceleration rate
#############################################
### Complete speed trajectory of Lead vehicle
#############################################
vn1_complete <- rep(NA_real_, time_length) ### an empty vector
xn1_complete <- rep(NA_real_, time_length) ### an empty vector
vn1_complete[1] <- vn1_first
xn1_complete[1] <- xn1_first
for (t in 2:time_length) {
### Lead vehicle calculations
vn1_complete[t] <- vn1_complete[t-1] + (bn1_complete[t-1] * time_frame)
xn1_complete[t] <- xn1_complete[t-1] + (vn1_complete[t-1] * time_frame) + (0.5 * bn1_complete[t-1] * (time_frame)^2)
}
现在,我可以应用函数了:
## one example
dff <- calculate_IDM(4, 30, 6.5, 1, 4, 2)
head(dff)
a_i v_i g_x_i
1 3.981274 0.0000000 95.00000
2 3.978206 0.3981274 95.00009
3 3.973594 0.7959480 95.00039
4 3.967446 1.1933075 95.00093
5 3.959771 1.5900521 95.00176
6 3.950581 1.9860292 95.00296
使用网格搜索找到最佳参数:
观察到的速度和参数列表如下:
library(tidyverse)
obs_data <- tibble(
obs_v_i = dff$v_i
)
# Parameters list
parameters_grid <- list(
A_i = c(2, 4),
v_0 = c(27, 30),
g_0 = c(6.5, 7),
g_t_i = c(1, 3),
b_i = c(4, 5),
small_delta = c(2, 3)
) %>%
expand.grid() %>%
as_tibble()
适应度函数和2个例子如下:
# Fitness function
fitness_func <- function(obs_data,
A_i,
v_0,
g_0,
g_t_i,
b_i,
small_delta) {
df <- cbind(obs_data, calculate_IDM(A_i,
v_0,
g_0,
g_t_i,
b_i,
small_delta))
sqrt(sum((df$obs_v_i - df$v_i)^2)/nrow(df))
}
> fitness_func(obs_data, 4, 30, 6.5, 1, 4, 2)
[1] 0
> fitness_func(obs_data, 2, 27, 7, 3, 5, 3)
[1] 1.406937
现在我可以使用 dplyr
中的 rowwise()
函数进行网格搜索:
parameters_grid %>%
rowwise() %>%
mutate(RMSE = fitness_func(obs_data,
A_i,
v_0,
g_0,
g_t_i,
b_i,
small_delta))
# A tibble: 64 x 7
# Rowwise:
A_i v_0 g_0 g_t_i b_i small_delta RMSE
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2 27 6.5 1 4 2 1.68
2 4 27 6.5 1 4 2 0.213
3 2 30 6.5 1 4 2 1.65
4 4 30 6.5 1 4 2 0
5 2 27 7 1 4 2 1.68
6 4 27 7 1 4 2 0.218
7 2 30 7 1 4 2 1.65
8 4 30 7 1 4 2 0.00794
9 2 27 6.5 3 4 2 1.57
10 4 27 6.5 3 4 2 0.814
# ... with 54 more rows
遗传算法错误
你可以想象,如果参数列表越大,计算时间就会显着增加。所以,我想运行遗传算法。使用示例 here,我尝试使用 GA
库来估计参数但出现错误:
library(GA)
GA <- ga(type = "real-valued",
fitness = -fitness_func(obs_data,
A_i,
v_0,
g_0,
g_t_i,
b_i,
small_delta),
lower = c(2, 27, 6.5, 1, 4, 2), upper = c(4, 30, 7, 3, 5, 3),
popSize = 5, maxiter = 10, run = 10)
Error in calculate_IDM(A_i, v_0, g_0, g_t_i, b_i, small_delta) :
object 'g_0' not found
请让我知道我做错了什么。
?ga
中记录的 fitness
是
the fitness function, any allowable R function which takes as input an individual string representing a potential solution, and returns a numerical value describing its “fitness”.
所以,我们可以将它包装成一个带有两个参数的函数,然后使用 fitness_func
个参数作为 x[1]
、x[2]
、...、x[6]
将与 lower
和 upper
绑定值的长度相同。这里,我们也可以单独传data
library(GA)
GA <- ga(type = "real-valued",
fitness = function(dat, x) {-fitness_func(dat,
x[1],
x[2],
x[3],
x[4],
x[5],
x[6])},
dat = obs_data,
lower = c(2, 27, 6.5, 1, 4, 2), upper = c(4, 30, 7, 3, 5, 3),
popSize = 5, maxiter = 1000, run = 100)
#GA | iter = 1 | Mean = -0.5668704 | Best = -0.3523867
#GA | iter = 2 | Mean = -0.3762976 | Best = -0.3523867
#GA | iter = 3 | Mean = -0.3529940 | Best = -0.3523867
#GA | iter = 4 | Mean = -0.3523867 | Best = -0.3523867
#GA | iter = 5 | Mean = -0.3523867 | Best = -0.3523867
#GA | iter = 6 | Mean = -0.3523867 | Best = -0.3523867
#GA | iter = 7 | Mean = -0.3523867 | Best = -0.3523867
#GA | iter = 8 | Mean = -0.3640060 | Best = -0.3523867
#...
#GA | iter = 519 | Mean = -0.08506463 | Best = -0.08505393
#GA | iter = 520 | Mean = -0.08505440 | Best = -0.08505393
#GA | iter = 521 | Mean = -0.14507196 | Best = -0.08505393
#GA | iter = 522 | Mean = -0.08505393 | Best = -0.08505393
#GA | iter = 523 | Mean = -0.08505393 | Best = -0.08505393
#GA | iter = 524 | Mean = -0.11238973 | Best = -0.08505393
#GA | iter = 525 | Mean = -0.31888465 | Best = -0.08505393
#GA | iter = 526 | Mean = -0.09641056 | Best = -0.08505393
虽然最后有一个警告