Receiving Empty DataFrame for Solved PuLP Optimization Problem

Question

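This is my first foray into PuLP (and Python in general), running an optimization problem for a fantasy football game.

My code below runs successfully, but it outputs an empty DataFrame.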

import pandas as pd
import pulp

print('--- (1/4) Defining the problem ---')

# Read csv
raw_data = pd.read_csv('./csv/fantasypros.csv')

# create new columns that has binary numbers if player == a specific position
raw_data["QB"] = (raw_data["Pos. Parent"] == "QB").astype(float)
raw_data["RB"] = (raw_data["Pos. Parent"] == "RB").astype(float)
raw_data["WR"] = (raw_data["Pos. Parent"] == "WR").astype(float)
raw_data["TE"] = (raw_data["Pos. Parent"] == "TE").astype(float)
raw_data["K"] = (raw_data["Pos. Parent"] == "K").astype(float)
raw_data["DST"] = (raw_data["Pos. Parent"] == "DEF").astype(float)
raw_data["DK"] = (raw_data["Pos. Parent"] == "DK").astype(float)
raw_data["salary"] = raw_data["Point Cost"].astype(float)

model = pulp.LpProblem("NFTdraft", pulp.LpMaximize)

total_points = {}
cost = {}
qb = {}
rb = {}
wr = {}
te = {}
k = {}
dst = {}
dk = {}
num_players = {}

vars = []

# i = row index, player = player attributes
for i, player in raw_data.iterrows():
    var_name = 'x' + str(i)  # Create variable name
    decision_var = pulp.LpVariable(var_name, cat='Binary')  # Initialize Variables

    vars.append(decision_var)

    total_points[decision_var] = player["FPTS"]  # Create FPTS Dictionary
    cost[decision_var] = player["salary"]  # Create Cost Dictionary

    # Create Dictionary for Player Types
    qb[decision_var] = player["QB"]
    rb[decision_var] = player["RB"]
    wr[decision_var] = player["WR"]
    te[decision_var] = player["TE"]
    k[decision_var] = player["K"]
    dst[decision_var] = player["DST"]
    dk[decision_var] = player["DK"]
    num_players[decision_var] = 1.0

objective_function = pulp.LpAffineExpression(total_points)
model += objective_function

total_cost = pulp.LpAffineExpression(cost)
model += (total_cost <= 135)

print('--- (2/4) Defining the constraints ---')
QB_constraint = pulp.LpAffineExpression(qb)
RB_constraint = pulp.LpAffineExpression(rb)
WR_constraint = pulp.LpAffineExpression(wr)
TE_constraint = pulp.LpAffineExpression(te)
K_constraint = pulp.LpAffineExpression(k)
DST_constraint = pulp.LpAffineExpression(dst)
DK_constraint = pulp.LpAffineExpression(dk)
total_players = pulp.LpAffineExpression(num_players)

model += (QB_constraint >= 1)
model += (QB_constraint <= 2)
model += (RB_constraint <= 8)
model += (WR_constraint <= 8)
model += (TE_constraint <= 8)
model += (K_constraint <= 1)
model += (DST_constraint <= 1)
model += (DK_constraint <= 2)
model += (total_players == 10)

print('--- (3/4) Solving the problem ---')
model.solve()

print('--- (4/4) Formatting the results ---')
raw_data["is_drafted"] = 0.0

for var in model.variables():
    raw_data.iloc[int(var.name[1:]), 10] = var.varValue

my_team = raw_data[raw_data["is_drafted"] == 1.0]
my_team = my_team[["Asset Name", "Player", "Pos. Parent", "Rarity", "Point Cost", "FPTS"]]

print(my_team)
print("Total used amount of salary cap: {}".format(my_team["Point Cost"].sum()))
print("Projected points: {}".format(my_team["FPTS"].sum().round(1)))
print('--- Completed ---')

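The expected result is that the model suggests the ten-player lineup with the highest projected points under the given constraints.

I'm not sure if this helps, but below is the output in the Python console when I solve the problem and try to format the results.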

At line 2 NAME          MODEL
At line 3 ROWS
At line 15 COLUMNS
At line 35896 RHS
At line 35907 BOUNDS
At line 38668 ENDATA
Problem MODEL has 10 rows, 2760 columns and 8324 elements
Coin0008I MODEL read with 0 errors
Continuous objective value is 193.829 - 0.01 seconds
Cgl0003I 2 fixed, 0 tightened bounds, 0 strengthened rows, 0 substitutions
Cgl0003I 2 fixed, 6 tightened bounds, 0 strengthened rows, 0 substitutions
Cgl0003I 0 fixed, 1 tightened bounds, 0 strengthened rows, 0 substitutions
Cgl0004I processed model has 7 rows, 266 columns (266 integer (58 of which binary)) and 773 elements
Cutoff increment increased from 1e-05 to 0.000999
Cbc0012I Integer solution of -192.1 found by DiveCoefficient after 0 iterations and 0 nodes (0.05 seconds)
Cbc0038I Full problem 7 rows 266 columns, reduced to 2 rows 3 columns
Cbc0012I Integer solution of -192.574 found by DiveCoefficient after 10 iterations and 0 nodes (0.08 seconds)
Cbc0031I 2 added rows had average density of 7.5
Cbc0013I At root node, 2 cuts changed objective from -193.82941 to -192.574 in 4 passes
Cbc0014I Cut generator 0 (Probing) - 0 row cuts average 0.0 elements, 3 column cuts (3 active)  in 0.003 seconds - new frequency is 1
Cbc0014I Cut generator 1 (Gomory) - 6 row cuts average 9.0 elements, 0 column cuts (0 active)  in 0.003 seconds - new frequency is 1
Cbc0014I Cut generator 2 (Knapsack) - 0 row cuts average 0.0 elements, 0 column cuts (0 active)  in 0.002 seconds - new frequency is -100
Cbc0014I Cut generator 3 (Clique) - 0 row cuts average 0.0 elements, 0 column cuts (0 active)  in 0.000 seconds - new frequency is -100
Cbc0014I Cut generator 4 (MixedIntegerRounding2) - 1 row cuts average 7.0 elements, 0 column cuts (0 active)  in 0.000 seconds - new frequency is 1
Cbc0014I Cut generator 5 (FlowCover) - 0 row cuts average 0.0 elements, 0 column cuts (0 active)  in 0.001 seconds - new frequency is -100
Cbc0014I Cut generator 6 (TwoMirCuts) - 5 row cuts average 8.0 elements, 0 column cuts (0 active)  in 0.000 seconds - new frequency is -100
Cbc0001I Search completed - best objective -192.574, took 10 iterations and 0 nodes (0.08 seconds)
Cbc0035I Maximum depth 0, 60 variables fixed on reduced cost
Cuts at root node changed objective from -193.829 to -192.574
Probing was tried 4 times and created 3 cuts of which 0 were active after adding rounds of cuts (0.003 seconds)
Gomory was tried 4 times and created 6 cuts of which 0 were active after adding rounds of cuts (0.003 seconds)
Knapsack was tried 4 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.002 seconds)
Clique was tried 4 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
MixedIntegerRounding2 was tried 4 times and created 1 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
FlowCover was tried 4 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.001 seconds)
TwoMirCuts was tried 4 times and created 5 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Result - Optimal solution found
Objective value:                192.57400000
Enumerated nodes:               0
Total iterations:               10
Time (CPU seconds):             0.08
Time (Wallclock seconds):       0.12
Option for printingOptions changed from normal to all
Total time (CPU seconds):       0.10   (Wallclock seconds):       0.14
--- (4/4) Formatting the results ---
Empty DataFrame
Columns: [Asset Name, Player, Pos. Parent, Rarity, Point Cost, FPTS]
Index: []
Total used amount of salary cap: 0
Projected points: 0.0
--- Completed ---

Thanks in advance for any advice on how to get my optimal 10-player lineup to populate the DataFrame.

Edit - As requested by @chitown88, here is a link to the CSV.

Answer 1

First, vars is a built-in function; don't use it as a variable name.

Second, you can simplify the one-hot encoding of the binary position columns by simply using pandas' .get_dummies().
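
As a small illustration of what get_dummies produces, here is a sketch on a made-up four-row frame (the players and their positions are invented; only the "Pos. Parent" column name and the DEF label come from the question). Note that each indicator column is named after the value found in the data, so a defense stored as DEF yields a column called DEF. The rename to DST is only there because the question's loop reads player["DST"].

```python
import pandas as pd

# Hypothetical stand-in for the "Pos. Parent" column of the CSV.
players = pd.DataFrame({
    "Player": ["A", "B", "C", "D"],
    "Pos. Parent": ["QB", "RB", "DEF", "RB"],
})

# One indicator column per distinct value, named after that value.
# astype(float) gives 0.0/1.0 whether pandas returns bool or integer dummies.
encoded = pd.get_dummies(players["Pos. Parent"]).astype(float)

# The question's code refers to the defense column as "DST", so rename to match.
encoded = encoded.rename(columns={"DEF": "DST"})

# Attach the indicator columns to the original rows (aligned on the index).
players = players.join(encoded)
print(players)
```

A position that never appears in the data gets no column at all, so any later lookup of that column name would need to be guarded.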

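Finally, the 1.0 values are not being assigned to your "is_drafted" column. Try .loc instead of .iloc. I would also use the column name rather than the index position for "is_drafted", but that's just my preference.

Give this a try. I commented the changes I made. Without your actual data, I can't really test it. So if it doesn't work, you may need to share your csv file so I can debug: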
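
To see why a positional write can miss its target, here is a minimal sketch on a made-up three-row frame (the values are invented; the column names are borrowed from the question). As far as the posted code shows, "is_drafted" is created after the six CSV columns the script later selects plus the eight columns added before it, so it cannot sit at position 10, and .iloc[..., 10] lands in some other column.

```python
import pandas as pd

df = pd.DataFrame({"Player": ["A", "B", "C"], "FPTS": [10.0, 12.5, 8.0]})
df["is_drafted"] = 0.0  # appended last, so it is at position 2 here

# Positional write: column position 1 is "FPTS", not "is_drafted".
by_position = df.copy()
by_position.iloc[0, 1] = 1.0

# Label-based write: the column is addressed by name, wherever it sits.
by_label = df.copy()
by_label.loc[0, "is_drafted"] = 1.0

# If a position is really needed, look it up instead of hard-coding it.
pos = df.columns.get_loc("is_drafted")

print(by_position)
print(by_label)
print(pos)
```

In the first copy the flag column stays all zeros while a projection value is silently overwritten, which matches the symptom of an empty filtered DataFrame.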

import pandas as pd
import pulp

print('--- (1/4) Defining the problem ---')

# Read csv
raw_data = pd.read_csv('./csv/fantasypros.csv')

# create one indicator column per position
encoded = pd.get_dummies(raw_data['Pos. Parent']).astype(float)  # <-- one-hot encoding
encoded = encoded.rename(columns={'DEF': 'DST'})  # <-- the CSV labels defenses 'DEF'; the loop below reads 'DST'
raw_data = raw_data.join(encoded)  # <-- joining it to the raw_data table

raw_data["salary"] = raw_data["Point Cost"].astype(float)

model = pulp.LpProblem("NFTdraft", pulp.LpMaximize)

total_points = {}
cost = {}
qb = {}
rb = {}
wr = {}
te = {}
k = {}
dst = {}
dk = {}
num_players = {}

vars_list = []

# i = row index, player = player attributes
for i, player in raw_data.iterrows():
    var_name = 'x' + str(i)  # Create variable name
    decision_var = pulp.LpVariable(var_name, cat='Binary')  # Initialize Variables

    total_points[decision_var] = player["FPTS"]  # Create FPTS Dictionary
    cost[decision_var] = player["salary"]  # Create Cost Dictionary

    # Create Dictionary for Player Types
    qb[decision_var] = player["QB"]
    rb[decision_var] = player["RB"]
    wr[decision_var] = player["WR"]
    te[decision_var] = player["TE"]
    k[decision_var] = player["K"]
    dst[decision_var] = player["DST"]
    dk[decision_var] = player["DK"]
    num_players[decision_var] = 1.0

objective_function = pulp.LpAffineExpression(total_points)
model += objective_function

total_cost = pulp.LpAffineExpression(cost)
model += (total_cost <= 135)

print('--- (2/4) Defining the constraints ---')
QB_constraint = pulp.LpAffineExpression(qb)
RB_constraint = pulp.LpAffineExpression(rb)
WR_constraint = pulp.LpAffineExpression(wr)
TE_constraint = pulp.LpAffineExpression(te)
K_constraint = pulp.LpAffineExpression(k)
DST_constraint = pulp.LpAffineExpression(dst)
DK_constraint = pulp.LpAffineExpression(dk)
total_players = pulp.LpAffineExpression(num_players)

model += (QB_constraint >= 1)
model += (QB_constraint <= 2)
model += (RB_constraint <= 8)
model += (WR_constraint <= 8)
model += (TE_constraint <= 8)
model += (K_constraint <= 1)
model += (DST_constraint <= 1)
model += (DK_constraint <= 2)
model += (total_players == 10)

print('--- (3/4) Solving the problem ---')
model.solve()

print('--- (4/4) Formatting the results ---')
raw_data["is_drafted"] = 0.0

for var in model.variables():
    raw_data.loc[int(var.name[1:]), 'is_drafted'] = var.varValue     # <--- CHANGED HERE
    
my_team = raw_data[raw_data["is_drafted"] == 1.0]
my_team = my_team[["Asset Name", "Player", "Pos. Parent", "Rarity", "Point Cost", "FPTS"]]

print(my_team)
print("Total used amount of salary cap: {}".format(my_team["Point Cost"].sum()))
print("Projected points: {}".format(my_team["FPTS"].sum().round(1)))
print('--- Completed ---')
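
As a possible variation, the write-back step can avoid parsing variable names altogether by keeping the variables in a dict keyed by the DataFrame index. The sketch below is not the asker's model: the three-player pool, the cap of 8 and the roster size of 2 are invented for illustration, and it assumes the CBC solver bundled with PuLP is available.

```python
import pandas as pd
import pulp

# Hypothetical three-player pool, not the asker's data.
pool = pd.DataFrame({
    "Player": ["A", "B", "C"],
    "FPTS": [10.0, 12.0, 7.0],
    "Point Cost": [5.0, 6.0, 2.0],
})

rows = pool.index.tolist()
points = dict(zip(rows, pool["FPTS"].tolist()))      # plain Python floats
cost = dict(zip(rows, pool["Point Cost"].tolist()))

model = pulp.LpProblem("toy_draft", pulp.LpMaximize)

# One binary variable per row, keyed by the row's index label.
pick = pulp.LpVariable.dicts("x", rows, cat="Binary")

model += pulp.lpSum(pick[i] * points[i] for i in rows)      # objective
model += pulp.lpSum(pick[i] * cost[i] for i in rows) <= 8   # invented cap
model += pulp.lpSum(pick.values()) == 2                     # invented roster size

model.solve(pulp.PULP_CBC_CMD(msg=False))
status = pulp.LpStatus[model.status]

# Write results back by index label; no need to slice var.name.
pool["is_drafted"] = [pick[i].varValue for i in rows]

# A threshold tolerates solver values such as 0.999999 instead of exactly 1.0.
my_team = pool[pool["is_drafted"] > 0.5]

print(status)
print(my_team[["Player", "Point Cost", "FPTS"]])
```

Checking the status before reading varValue also helps distinguish an empty result caused by the write-back from one caused by an infeasible model.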

python

optimization

dataframe

pulp