在遗传算法的适应度函数中使用嵌套 for 循环使其速度太慢
Using nested for loops in Fitness Function in Genetic Algroithims makes it too slow
我正在尝试通过 "GA" 包来使用遗传算法,但在制作适应度函数时遇到了问题,我正在使用 GA 来模拟我的数据并为我的模型中的常数获取最合适的值。
我的数据来自对车速和其他参数的观察,假设我有一辆车,它开了 2 趟车,我想为它制作一个模型。
每次旅行都有多个列(速度,与对面汽车的速度增量,以及两辆车之间的距离),所以我必须取每次旅行的第一行并将其传递给适应度函数中的方程式,然后方程式将生成速度、增量速度和范围的新结果,然后我必须使用新值并生成其他值,然后将模拟距离与我在数据中的旧范围进行比较,这是观察到的,并得到最低的差异由 GA。
首先:这是我的数据。
https://drive.google.com/open?id=1923Jl6pDnQa_tGAluANUfIWCcyf85YVq
其次:这是我的适应度函数和 GA
Fitness_Function <- function(data, M_Acc, D_Speed, Beta, Com_Acc, Gap_J, D_Time){
Trips_IDs <- sort(unique(data$FileName))
# Trip=1;ROW=1
Calibrated_DF <- data.frame()
for (Trip in 1:2) {
Trip_Data <- data%>%filter(FileName==Trips_IDs[Trip])
attach(Trip_Data, warn.conflicts=F)
for (ROW in 1:(nrow(Trip_Data)-1)) {
if (ROW==1) {
speed <- Filling_Speed[1]
Delta_V <- Filling_DeltaVelocity[1]
Dist <- Filling_Range[1]
# M_Acc = 0.8418 ;D_Speed =29.2 ;Beta = 3.52
# Com_Acc = 0.8150 ;Gap_J = 1.554 ;D_Time = 0.878
Distance <- speed*D_Time - (speed*Delta_V)/(2*sqrt(M_Acc*Com_Acc))
if (Distance < 0 ) {
Distance <- 0
}
D_Gap <- Gap_J + Distance
Acceleration <- M_Acc*(1-(speed/D_Speed)^Beta-(D_Gap/Dist)^2)
}else{
speed <- speed_C
Delta_V <- Delta_V_C
Dist <- Dist_c
Distance <- speed*D_Time - (speed*Delta_V)/(2*sqrt(M_Acc*Com_Acc))
if (is.na(Distance)) {
}
Distance = 0
if (Distance < 0 ) {
Distance <- 0
}
D_Gap <- Gap_J + Distance
Acceleration <- M_Acc*(1-(speed/D_Speed)^Beta-(D_Gap/Dist)^2)
}
Lead_Veh_Speed_F <- Filling_Speed[ROW+1]+Filling_DeltaVelocity[ROW+1]
speed_C <- speed + Acceleration*0.1
Delta_V_C <- Lead_Veh_Speed_F-speed_C
Dist_c <- Dist+(Delta_V_C+Delta_V)/2*0.1
Calibrated_DF <- rbind(Calibrated_DF,c(Dist_c,ROW+1,Trips_IDs[Trip],Trip_Data$Filling_Range[ROW+1]))
}
detach(Trip_Data)
}
colnames(Calibrated_DF) <- c("C_Distance","row","Trip","Actual_Distance")
Calibrated_DF$Dif <- (Calibrated_DF$C_Distance-Calibrated_DF$Actual_Distance)^2
RMSPE <- sqrt(sum(Calibrated_DF$Dif)/sum(Calibrated_DF$Actual_Distance^2))
return(RMSPE)
# return(Calibrated_DF)
}
GA_Test <- ga(type='real-valued', lower=c( 0.1 , 1 , 1 , 0.1 , 0.1 , 0.1 ),
upper=c( 5 , 40 , 40 , 5 , 10 , 5 ), popSize=300, maxiter=300,run = 100,
keepBest=T, names = c("M_Acc", "D_Speed", "Beta", "Com_Acc", "Gap_J", "D_Time"),
fitness = function(b) -Fitness_Function(data, b[1],b[2], b[3],b[4],b[5],b[6]))
我的问题是:代码非常大,即使进行一次迭代也非常慢,我尝试使用 dplyr 而不是使用 for 循环,但使用 dplyr 无法做到这一点,因为我必须计算距离,然后计算加速度,然后计算速度,然后为其他行再次计算它们,但我无法用 dplyr 来做到这一点。
我会 post 我在这里使用 Dplyr 的测试代码,但它不完整,因为我无法完成它。
所以请帮忙。
data <- data%>%group_by(Driver,FileName)%>%
mutate(Distance_Term = ifelse(row_number()==1,Speed_C*D_Time - (Speed_C*Delta_V_C)/(2*sqrt(M_Acc*Com_Acc)),0))
data <- data%>%mutate(Distance_Term = ifelse(Distance_Term < 0 , 0, Distance_Term))%>%
mutate(D_Gap = Gap_J + Distance_Term,Acceleration_C = M_Acc*(1-(Speed_C/D_Speed)^Beta-(D_Gap/Distance)^2))
注意:行程ID中的FileName栏也是我的PC资质很好,所以问题不在我的PC
我在 purrr
中用 accumulate2
函数更改了 for 循环,因此它更快更有效,我从这个问题
得到了这个答案
Objective_Function <- function(data, M_Acc, D_Speed, Beta, Com_Acc, Gap_J, D_Time){
myfun <- function(list, lcs,lcs2){
ds <- lcs - list[[1]]
Distance <- list[[1]]*D_Time - (list[[1]] * ds) / (2*sqrt(M_Acc*Com_Acc))
if (Distance < 0|is.na(Distance)) {Distance <- 0}
gap <- Gap_J + Distance
acc <- M_Acc * (1 - (list[[1]] / D_Speed)^Beta - (gap / list[[2]])^2)
fcs_new <- list[[1]] + acc * 0.1
ds_new <- lcs2- fcs_new
di_new <- list[[2]]+(ds_new+ds)/2*0.1
return(list(Speed = fcs_new,Distance = di_new))
}
Generated_Data <- data %>%group_by(Driver,FileName)%>%
mutate(Speed_Distance_Calibrated = accumulate2( .init = list(Filling_Speed[1],
Filling_Range[1]),.x = Lead_Veh_Speed_F,.y = Lead_Veh_Speed_F2, myfun)[-1])%>%ungroup()
Generated_Data <- Generated_Data %>% group_by(Driver,FileName)%>%
mutate(Speed_Distance_Calibrated = append(list(list(Speed = Filling_Speed[1],Distance = Filling_Range[1])),Speed_Distance_Calibrated[-length(Speed_Distance_Calibrated)]))%>%ungroup()
Dif <- map_df(Generated_Data$Speed_Distance_Calibrated, `[`, 2)
Generated_Data <- Generated_Data %>% mutate(Dif_sq = (Dif$Distance - Generated_Data$Filling_Range)^2)
RMSPE <- sqrt(sum(Generated_Data$Dif_sq)/sum(Generated_Data$Filling_Range^2))
return(RMSPE)
}
GA_Test <- ga(type='real-valued', lower=c( 0.1 , 1 , 1 , 0.1 , 0.1 , 0.1 ),
upper=c( 5 , 40 , 40 , 5 , 10 , 5 ), popSize=300, maxiter=300,run = 100,
keepBest=T, names = c("M_Acc", "D_Speed", "Beta", "Com_Acc", "Gap_J", "D_Time"),
fitness = function(b) -Objective_Function(data, b[1],b[2], b[3],b[4],b[5],b[6]),parallel = TRUE)
Summary <- summary(GA_Test)
我正在尝试通过 "GA" 包来使用遗传算法,但在制作适应度函数时遇到了问题,我正在使用 GA 来模拟我的数据并为我的模型中的常数获取最合适的值。
我的数据来自对车速和其他参数的观察,假设我有一辆车,它开了 2 趟车,我想为它制作一个模型。 每次旅行都有多个列(速度,与对面汽车的速度增量,以及两辆车之间的距离),所以我必须取每次旅行的第一行并将其传递给适应度函数中的方程式,然后方程式将生成速度、增量速度和范围的新结果,然后我必须使用新值并生成其他值,然后将模拟距离与我在数据中的旧范围进行比较,这是观察到的,并得到最低的差异由 GA。
首先:这是我的数据。 https://drive.google.com/open?id=1923Jl6pDnQa_tGAluANUfIWCcyf85YVq
其次:这是我的适应度函数和 GA
Fitness_Function <- function(data, M_Acc, D_Speed, Beta, Com_Acc, Gap_J, D_Time){
Trips_IDs <- sort(unique(data$FileName))
# Trip=1;ROW=1
Calibrated_DF <- data.frame()
for (Trip in 1:2) {
Trip_Data <- data%>%filter(FileName==Trips_IDs[Trip])
attach(Trip_Data, warn.conflicts=F)
for (ROW in 1:(nrow(Trip_Data)-1)) {
if (ROW==1) {
speed <- Filling_Speed[1]
Delta_V <- Filling_DeltaVelocity[1]
Dist <- Filling_Range[1]
# M_Acc = 0.8418 ;D_Speed =29.2 ;Beta = 3.52
# Com_Acc = 0.8150 ;Gap_J = 1.554 ;D_Time = 0.878
Distance <- speed*D_Time - (speed*Delta_V)/(2*sqrt(M_Acc*Com_Acc))
if (Distance < 0 ) {
Distance <- 0
}
D_Gap <- Gap_J + Distance
Acceleration <- M_Acc*(1-(speed/D_Speed)^Beta-(D_Gap/Dist)^2)
}else{
speed <- speed_C
Delta_V <- Delta_V_C
Dist <- Dist_c
Distance <- speed*D_Time - (speed*Delta_V)/(2*sqrt(M_Acc*Com_Acc))
if (is.na(Distance)) {
}
Distance = 0
if (Distance < 0 ) {
Distance <- 0
}
D_Gap <- Gap_J + Distance
Acceleration <- M_Acc*(1-(speed/D_Speed)^Beta-(D_Gap/Dist)^2)
}
Lead_Veh_Speed_F <- Filling_Speed[ROW+1]+Filling_DeltaVelocity[ROW+1]
speed_C <- speed + Acceleration*0.1
Delta_V_C <- Lead_Veh_Speed_F-speed_C
Dist_c <- Dist+(Delta_V_C+Delta_V)/2*0.1
Calibrated_DF <- rbind(Calibrated_DF,c(Dist_c,ROW+1,Trips_IDs[Trip],Trip_Data$Filling_Range[ROW+1]))
}
detach(Trip_Data)
}
colnames(Calibrated_DF) <- c("C_Distance","row","Trip","Actual_Distance")
Calibrated_DF$Dif <- (Calibrated_DF$C_Distance-Calibrated_DF$Actual_Distance)^2
RMSPE <- sqrt(sum(Calibrated_DF$Dif)/sum(Calibrated_DF$Actual_Distance^2))
return(RMSPE)
# return(Calibrated_DF)
}
GA_Test <- ga(type='real-valued', lower=c( 0.1 , 1 , 1 , 0.1 , 0.1 , 0.1 ),
upper=c( 5 , 40 , 40 , 5 , 10 , 5 ), popSize=300, maxiter=300,run = 100,
keepBest=T, names = c("M_Acc", "D_Speed", "Beta", "Com_Acc", "Gap_J", "D_Time"),
fitness = function(b) -Fitness_Function(data, b[1],b[2], b[3],b[4],b[5],b[6]))
我的问题是:代码非常大,即使进行一次迭代也非常慢,我尝试使用 dplyr 而不是使用 for 循环,但使用 dplyr 无法做到这一点,因为我必须计算距离,然后计算加速度,然后计算速度,然后为其他行再次计算它们,但我无法用 dplyr 来做到这一点。 我会 post 我在这里使用 Dplyr 的测试代码,但它不完整,因为我无法完成它。
所以请帮忙。
data <- data%>%group_by(Driver,FileName)%>%
mutate(Distance_Term = ifelse(row_number()==1,Speed_C*D_Time - (Speed_C*Delta_V_C)/(2*sqrt(M_Acc*Com_Acc)),0))
data <- data%>%mutate(Distance_Term = ifelse(Distance_Term < 0 , 0, Distance_Term))%>%
mutate(D_Gap = Gap_J + Distance_Term,Acceleration_C = M_Acc*(1-(Speed_C/D_Speed)^Beta-(D_Gap/Distance)^2))
注意:行程ID中的FileName栏也是我的PC资质很好,所以问题不在我的PC
我在 purrr
中用 accumulate2
函数更改了 for 循环,因此它更快更有效,我从这个问题
Objective_Function <- function(data, M_Acc, D_Speed, Beta, Com_Acc, Gap_J, D_Time){
myfun <- function(list, lcs,lcs2){
ds <- lcs - list[[1]]
Distance <- list[[1]]*D_Time - (list[[1]] * ds) / (2*sqrt(M_Acc*Com_Acc))
if (Distance < 0|is.na(Distance)) {Distance <- 0}
gap <- Gap_J + Distance
acc <- M_Acc * (1 - (list[[1]] / D_Speed)^Beta - (gap / list[[2]])^2)
fcs_new <- list[[1]] + acc * 0.1
ds_new <- lcs2- fcs_new
di_new <- list[[2]]+(ds_new+ds)/2*0.1
return(list(Speed = fcs_new,Distance = di_new))
}
Generated_Data <- data %>%group_by(Driver,FileName)%>%
mutate(Speed_Distance_Calibrated = accumulate2( .init = list(Filling_Speed[1],
Filling_Range[1]),.x = Lead_Veh_Speed_F,.y = Lead_Veh_Speed_F2, myfun)[-1])%>%ungroup()
Generated_Data <- Generated_Data %>% group_by(Driver,FileName)%>%
mutate(Speed_Distance_Calibrated = append(list(list(Speed = Filling_Speed[1],Distance = Filling_Range[1])),Speed_Distance_Calibrated[-length(Speed_Distance_Calibrated)]))%>%ungroup()
Dif <- map_df(Generated_Data$Speed_Distance_Calibrated, `[`, 2)
Generated_Data <- Generated_Data %>% mutate(Dif_sq = (Dif$Distance - Generated_Data$Filling_Range)^2)
RMSPE <- sqrt(sum(Generated_Data$Dif_sq)/sum(Generated_Data$Filling_Range^2))
return(RMSPE)
}
GA_Test <- ga(type='real-valued', lower=c( 0.1 , 1 , 1 , 0.1 , 0.1 , 0.1 ),
upper=c( 5 , 40 , 40 , 5 , 10 , 5 ), popSize=300, maxiter=300,run = 100,
keepBest=T, names = c("M_Acc", "D_Speed", "Beta", "Com_Acc", "Gap_J", "D_Time"),
fitness = function(b) -Objective_Function(data, b[1],b[2], b[3],b[4],b[5],b[6]),parallel = TRUE)
Summary <- summary(GA_Test)