Implementing additional constraint variables in integer programming using lpSolve
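I am trying to implement an lpSolve solution to optimize a hypothetical daily fantasy baseball problem. I am having trouble applying the last constraint:
- Position - exactly 3 outfielders (OF), 2 pitchers (P), and 1 of everything else
- Cost - total cost less than 200
- Team - the maximum number of players from any one team is 6
- Team - the minimum number of teams on the roster is 3
For example, suppose you have a data frame of 1000 players with points, cost, position, and team, and you are trying to maximize average points: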
library(tidyverse)
library(lpSolve)
set.seed(123)
df <- data_frame(avg_points = sample(5:45,1000, replace = T),
cost = sample(3:45,1000, replace = T),
position = sample(c("P","C","1B","2B","3B","SS","OF"),1000, replace = T),
team = sample(LETTERS,1000, replace = T)) %>% mutate(id = row_number())
head(df)
# A tibble: 6 x 5
# avg_points cost position team id
# <int> <int> <chr> <chr> <int>
#1 17 13 2B Y 1
#2 39 45 1B P 2
#3 29 33 1B C 3
#4 38 31 2B V 4
#5 17 13 P A 5
#6 10 6 SS V 6
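I have implemented the first 3 constraints with the following code, but I can't figure out how to implement the minimum number of teams on the roster. I think I need to add additional variables to the model, but I'm not sure how to do that.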
#set the objective function (what we want to maximize)
obj <- df$avg_points
# set the constraint rows.
con <- rbind(t(model.matrix(~ position + 0,df)), cost = df$cost, t(model.matrix(~ team + 0, df)) )
#set the constraint values
rhs <- c(1,1,1,1,3,2,1, # 1. #exactly 3 outfielders 2 pitchers and 1 of everything else
200, # 2. at a cost less than 200
rep(6,26) # 3. max number from any team is 6
)
#set the direction of the constraints
dir <- c("=","=","=","=","=","=","=","<=",rep("<=",26))
result <- lp("max",obj,con,dir,rhs,all.bin = TRUE)
If it helps, I am trying to replicate this paper (with minor tweaks), which has corresponding Julia code here.
This might be a solution to your problem.
This is the data I used (the same as yours):
library(tidyverse)
library(lpSolve)
N <- 1000
set.seed(123)
df <- tibble(avg_points = sample(5:45,N, replace = T),
cost = sample(3:45,N, replace = T),
position = sample(c("P","C","1B","2B","3B","SS","OF"),N, replace = T),
team = sample(LETTERS,N, replace = T)) %>%
mutate(id = row_number())
You want to find x1...xn that maximize the objective function below:
x1 * average_points1 + x2 * average_points2 + ... + xn * average_pointsn
Given how lpSolve works, you need to express every LHS as a sum over x1...xn, each multiplied by the corresponding entry of a vector you provide.
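To make the "coefficient row times variables" idea concrete, here is a minimal sketch on made-up data (three hypothetical items, not the player data). Each constraint is one row of coefficients, and lpSolve compares that row's dot product with the decision vector against the right-hand side.

```r
library(lpSolve)

# Hypothetical toy data: three items, each with a value and a weight.
value  <- c(5, 4, 3)
weight <- c(2, 3, 1)

# One constraint row: weight[1]*x1 + weight[2]*x2 + weight[3]*x3 <= 3
con <- matrix(weight, nrow = 1)
dir <- "<="
rhs <- 3

toy <- lp("max", value, con, dir, rhs, all.bin = TRUE)

toy$solution  # 0/1 flag per item
toy$objval    # total value of the picked items
```

Every constraint in the roster model follows the same pattern, only with more columns (one per player, plus the extra variables).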
Since your current variables cannot express the number of teams, you can introduce new ones (I'll call them y1..yn_teams and z1..zn_teams):
# number of teams:
n_teams = length(unique(df$team))
Your new objective function (the ys and zs do not affect the overall objective, because their coefficients are set to 0):
obj <- c(df$avg_points, rep(0, 2 * n_teams))
The first 3 constraints are the same, but with zero coefficients added for y and z:
c1 <- t(model.matrix(~ position + 0,df))
c1 <- cbind(c1,
matrix(0, ncol = 2 * n_teams, nrow = nrow(c1)))
c2 = df$cost
c2 <- c(c2, rep(0, 2 * n_teams))
c3 = t(model.matrix(~ team + 0, df))
c3 <- cbind(c3, matrix(0, ncol = 2 * n_teams, nrow = nrow(c3)))
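Because you want at least 3 teams, you first use y to count the number of players per team.
This constraint does the counting: sum the x variables of all players on a team, then subtract that team's y variable. The result should equal 0 (diag() creates the identity matrix; z is not involved at this point):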
# should be x1...xn - y1...n = 0
c4_1 <- cbind(t(model.matrix(~team + 0, df)), # x
-diag(n_teams), # y
matrix(0, ncol = n_teams, nrow = n_teams) # z
) # == 0
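As a small illustration of what the team block of this constraint computes, the sketch below uses a hypothetical four-player data frame. The transposed model matrix has one row per team and one column per player, so multiplying it by a selection vector yields the per-team counts that y is forced to equal.

```r
# Hypothetical toy roster: four players on three teams.
toy <- data.frame(team = c("A", "B", "A", "C"))

# Rows = teams, columns = players; a 1 marks "player j belongs to team i".
team_mat <- t(model.matrix(~ team + 0, toy))

# A candidate selection: players 1, 3 and 4 are picked.
x <- c(1, 0, 1, 1)

# Each row's dot product with x is that team's player count.
y <- as.vector(team_mat %*% x)
names(y) <- rownames(team_mat)
y
```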
Since each y is now the number of players on a team, you can make sure z is binary with this constraint:
c4_2 <- cbind(t(model.matrix(~ team + 0, df)), # x1+...+xn ==
-diag(n_teams), # - (y1+...+yn )
diag(n_teams) # z binary
) # <= 1
This is the constraint that ensures at least 3 teams are chosen:
c4_3 <- c(rep(0, nrow(df) + n_teams), # x and y
rep(1, n_teams) # z >= 3
)
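You need to make sure that z_i is 1 whenever y_i is greater than 0.
You can use the big-M method to create a constraint, which is:
y_i <= M * z_i
Or, in a more lpSolve-friendly version:
y_i - M * z_i <= 0
In this case you can use 6 as the value of M, because it is the largest value any y can take: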
c4_4 <- cbind(matrix(0, nrow = n_teams, ncol = nrow(df)),
diag(n_teams),
-diag(n_teams) * 6)
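One caveat, offered as a sketch rather than as part of the answer: the row y_i - 6 * z_i <= 0 forces z_i to 1 when team i is used, but on its own it does not force z_i to 0 when team i is unused, so sum(z) >= 3 could be met by switching on z for empty teams. Adding the reverse link z_i - y_i <= 0 closes that gap. The toy data and the names `link`, `loose` and `strict` below are hypothetical.

```r
library(lpSolve)

# Hypothetical toy data: 4 players on teams A, A, B, C.
points   <- c(10, 9, 1, 1)
team_mat <- rbind(A = c(1, 1, 0, 0),
                  B = c(0, 0, 1, 0),
                  C = c(0, 0, 0, 1))
n <- 4  # players
k <- 3  # teams
zero <- function(nr, nc) matrix(0, nrow = nr, ncol = nc)

# Column layout: x (4 players), then y (3 counts), then z (3 flags).
obj <- c(points, rep(0, 2 * k))
base <- rbind(
  c(rep(1, n), rep(0, 2 * k)),              # pick exactly 2 players
  cbind(diag(n), zero(n, 2 * k)),           # x <= 1
  cbind(team_mat, -diag(k), zero(k, k)),    # team count - y == 0
  cbind(zero(k, n + k), diag(k)),           # z <= 1
  c(rep(0, n + k), rep(1, k)),              # sum(z) >= 2 teams
  cbind(zero(k, n), diag(k), -2 * diag(k))  # big-M with M = 2: y - 2 z <= 0
)
base_dir <- c("==", rep("<=", n), rep("==", k), rep("<=", k), ">=", rep("<=", k))
base_rhs <- c(2, rep(1, n), rep(0, k), rep(1, k), 2, rep(0, k))

# Extra link (an assumption, not in the answer): z - y <= 0
link <- cbind(zero(k, n), -diag(k), diag(k))

loose  <- lp("max", obj, base, base_dir, base_rhs, all.int = TRUE)
strict <- lp("max", obj, rbind(base, link),
             c(base_dir, rep("<=", k)), c(base_rhs, rep(0, k)),
             all.int = TRUE)

loose$objval   # both players may come from team A
strict$objval  # players must come from two different teams
```

On this toy problem the first model is free to take both players from team A (objective 19), while the second has to spread across two teams (objective 11). In the full model the analogous block would be cbind(matrix(0, n_teams, nrow(df)), -diag(n_teams), diag(n_teams)) with direction "<=" and right-hand side 0.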
Add this constraint to make sure all x are binary:
#all x binary
c5 <- cbind(diag(nrow(df)), # x
matrix(0, ncol = 2 * n_teams, nrow = nrow(df)) # y + z
)
Create the new constraint matrix:
con <- rbind(c1,
c2,
c3,
c4_1,
c4_2,
c4_3,
c4_4,
c5)
#set the constraint values
rhs <- c(1,1,1,1,3,2,1, # 1. #exactly 3 outfielders 2 pitchers and 1 of everything else
200, # 2. at a cost less than 200
rep(6, n_teams), # 3. max number from any team is 6
rep(0, n_teams), # c4_1
rep(1, n_teams), # c4_2
3, # c4_3,
rep(0, n_teams), #c4_4
rep(1, nrow(df))# c5 binary
)
#set the direction of the constraints
dir <- c(rep("==", 7), # c1
"<=", # c2
rep("<=", n_teams), # c3
rep('==', n_teams), # c4_1
rep('<=', n_teams), # c4_2
'>=', # c4_3
rep('<=', n_teams), # c4_4
rep('<=', nrow(df)) # c5
)
The problem is almost the same, but I use all.int instead of all.bin so that the y variables can count the players on each team:
result <- lp("max",obj,con,dir,rhs,all.int = TRUE)
Success: the objective function is 450
roster <- df[result$solution[1:nrow(df)] == 1, ]
roster
# A tibble: 10 x 5
avg_points cost position team id
<int> <int> <chr> <chr> <int>
1 45 19 C I 24
2 45 5 P X 126
3 45 25 OF N 139
4 45 22 3B J 193
5 45 24 2B B 327
6 45 25 OF P 340
7 45 23 P Q 356
8 45 13 OF N 400
9 45 13 SS L 401
10 45 45 1B G 614
If you change the data to
N <- 1000
set.seed(123)
df <- tibble(avg_points = sample(5:45,N, replace = T),
cost = sample(3:45,N, replace = T),
position = sample(c("P","C","1B","2B","3B","SS","OF"),N, replace = T),
team = sample(c("A", "B"),N, replace = T)) %>%
mutate(id = row_number())
the problem is now infeasible, because the data contains fewer than 3 teams.
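A quick way to detect this case in code is to inspect the status field of the object returned by lp(). The sketch below uses a deliberately contradictory one-variable problem (hypothetical, not the roster model) to show the pattern.

```r
library(lpSolve)

# Hypothetical one-variable problem that cannot be satisfied: x >= 2 and x <= 1.
con <- matrix(c(1, 1), nrow = 2)
res <- lp("max", 1, con, c(">=", "<="), c(2, 1))

# lpSolve documents status 0 as success and 2 as "no feasible solution".
if (res$status == 0) {
  message("optimal objective: ", res$objval)
} else {
  message("no solution found, status = ", res$status)
}
```

The same check on the roster model's result object would tell an infeasible run apart from a successful one before the solution vector is used.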
Back on the original data, you can check that the z variables flag the same teams that appear on the roster:
sort(unique(df$team))[result$solution[1027:1052]==1]
[1] "B" "E" "I" "J" "N" "P" "Q" "X"
sort(unique(roster$team))
[1] "B" "E" "I" "J" "N" "P" "Q" "X"
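The hard-coded range 1027:1052 follows from the column layout x, then y, then z. A small sketch of how those positions could be computed instead (the names `x_idx`, `y_idx` and `z_idx` are hypothetical, and the two sizes are copied from the example data):

```r
n_players <- 1000  # nrow(df) in the example
n_teams   <- 26    # length(unique(df$team)) in the example

# Positions of each variable group inside result$solution.
x_idx <- seq_len(n_players)
y_idx <- n_players + seq_len(n_teams)
z_idx <- n_players + n_teams + seq_len(n_teams)

range(z_idx)
```

With these, the check above could be written as sort(unique(df$team))[result$solution[z_idx] == 1], which keeps working if the number of players or teams changes.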