当穷举搜索花费的时间太长时，我如何找到最佳组合？

Question

免责声明：我是一个完全的业余爱好者，没有接受过正规培训，只是自学了一点 C# 和 Python

有 47 个插槽。每一项必须填写一项。这些插槽分为 8 组。项目被归类到相同的 8 组。
每个插槽只能由同一组的项目填充。
同一物品可用于填充多个槽位。

每个项目都包含一个名称、一个组和 9 个统计数据。

item ("name", "group", "stat1", "stat2", "stat3", ..., "stat9")

exampleItem (exampleName, groupA, 3, 8, 19, ..., 431)

每个插槽由一个 ID 和一个组组成。

slot1 (1, groupC)

让每个插槽都装满一个物品（遵循上述规则）。

然后将每个不同的统计数据相加。

stat1Total=item1(stat1)+item2(stat1)+item3(stat1)+...+item47(stat1)

目标是：
-让每个插槽都装满相应组的项目。（没有空位）
-让stat1Total、stat2Total和stat3Total达到一定数量（stat1Goal、stat2Goal、stat3Goal）
- 让 stat1Total、stat2Total 和 stat3Total 尽可能少地超过这个数量
-最大化所有其他statTotal，每个权重

输出满足上述目标的最佳项目组合。 (如果有2个或更多的组合同样是最好的，则输出所有组合。

我试图只搜索所有可能的组合以获得最佳输出，但总共有 2.16x10^16 种可能的组合，所以我认为这不是可行。
现在我不知道如何解决这个问题，而且我对整数规划的了解越多，我就越困惑。

为了说明问题，我将给出一个包含 3 个插槽、2 个组和 5 个项目的简化示例：

SlotA、SlotB 和 SlotC 必须分别由 5 个项目之一填充。
Slot A 和 SlotB 属于组 Group1，SlotC 属于 Group2。
这些项目被分类到相同的组中。所以我们有属于 Group1 的 Item1、Item2 和 Item3 以及属于 Group2 的 Item4 和 Item5。

也就是说SlotA和SlotB只能被Item1,Item2,Item3填充。你可以想象插槽是不同形状的孔，而物品的形状适合这些孔。你不能把一个星形的钉子塞进一个方孔里，你也不能把 Item5 塞进 SlotB，因为它放不下。

物品本身有不同的统计数据。对于这个例子，我们假设每个项目只有它所属的组和 2 个统计数据：StatDark、StatLight。这些可以取 0 到 10 之间的整数值。

Item1(Group1, StatDark=3, StatLight=5)
Item2(Group1, StatDark=7, StatLight=1)
Item3(Group1, StatDark=2, StatLight=5)
Item4(Group2, StatDark=1, StatLight=6)
Item5(Group2, StatDark=2, StatLight=5)

此示例的目标是：
- 在遵守分组规则的同时用一个项目填充每个插槽
- 使所有选定项目的所有 StatDark 总和等于或大于 9
- 将 StatDark 降至 9 以上（每个高于 9 的 StatDark 都是无用和浪费的）
- 最大化所有选中项目的所有 StatLight 总和

在这种情况下，最佳解决方案是：
插槽 A 和插槽 B：项目 2 和项目 3
插槽 C：项目 4

这里是一个 link 到完整的 table 这个例子：https://imgur.com/MH80G3O

我希望这个例子能让问题更容易理解。

Answer 1

数学模型

我会引入一个二进制变量：

x[i,j] = 1 if item i is assigned to slot j
         0 otherwise

这是我们的第一个问题：有很多 (i,j) 组合是不允许的。智能模型会尽量不生成禁止的组合。所以我们需要实现一个稀疏变量而不是一个完全分配的变量。我们只想在 allowed(i,j) 为真时分配和使用 x[i,j]。

此外，引入一个连续变量xstat[s]来计算每个属性的总值。

一旦我们有了这个，我们就可以写下约束：

  sum( i | allowed(i,j),  x[i,j] ) = 1  for all slots j          (exactly one item in each slot)
  xstat[s] = sum( (i,j) | allowed(i,j),  x[i,j] * stat[i,s])     (calculate total stats)  
  xstat['StatDark'] >= 9

objective 是两个 objective 的加权和：

  minimize xstat['StatDark'] - 9
  maximize xstat['StatLight']

所以我们这样做：

  maximize -w1*(xstat['StatDark'] - 9) +  w2*xstat['StatLight']

对于 user-provided 权重 w1 和 w2。

这两个问题让问题变得有点复杂。此外，我们需要对数据做一些工作，使其适合table用于优化模型。

Python代码

import pandas as pd
import pulp as lp
from io import StringIO

#----------------------------------------
# raw data
#----------------------------------------

itemData = pd.read_table(StringIO("""
Item   Group  StatDark StatLight
Item1  Group1    3        5
Item2  Group1    7        1
Item3  Group1    2        5
Item4  Group2    1        6
Item5  Group2    2        5
"""),sep="\s+")

slotData = pd.read_table(StringIO("""
Slot   Group
SlotA  Group1
SlotB  Group1
SlotC  Group2
"""),sep='\s+')

# minimum total of Dark
minDark = 9

# stats
stat = ['StatDark','StatLight']

# objective weights
# we have two objectives and there are trde offs between them.
# here we use a simple weighted sum approach. These are the weights.
w = {'StatDark':0.3, 'StatLight':0.7}

#----------------------------------------
# data prep
#----------------------------------------


# join on Group
# we need to know which (Item,Slot) combinations are allowed.
merged = pd.merge(itemData[['Item','Group']],slotData,on='Group').set_index(['Item','Slot'])
 
items = itemData['Item']
slots = slotData['Slot']


# we will use the convention:
#  i : items
#  j : slots
#  s : stats

# stat values
# easy lookup of statv[(i,s)]
itemData = itemData.set_index('Item')
statv = {(i,s):itemData.loc[i,s] for i in items for s in stat}

#----------------------------------------
# MIP model
#----------------------------------------

# x[(i,j)] = 1 if item i is assigned to slot j
#            0 otherwise
# only use combinations (i,j) that are allowed
x = lp.LpVariable.dicts('x', [(i,j) for i,j in merged.index], cat='Binary') 

# xstat[stat] = total accumulated values
xstat = lp.LpVariable.dicts('xstat', [s for s in stat], cat='Continuous', lowBound = 0)

prob = lp.LpProblem("assignItemsToSlots", lp.LpMaximize)

# objective: weighted sum
#----------
# 1. minimize statDark exceeding minDark
# 2. maximize statLight
prob +=  -w['StatDark']*(xstat['StatDark'] - minDark) + w['StatLight']*xstat['StatLight']

# constraints
# -----------

# each slot much have exactly one item
# (but the same item can go to different slots)
for j in slots:
  prob += lp.lpSum([x[(i,j)] for i,jj in merged.index if jj==j]) == 1

#  minimum total value for dark 
prob += xstat['StatDark'] >= minDark 

# calculate stat totals
for s in stat:
  prob += xstat[s] == lp.lpSum([x[(i,j)]*statv[i,s] for i,j in merged.index])


#----------------------------------------
# solve problem
#----------------------------------------

prob.solve()
# to show the log use 
# solve(pulp.PULP_CBC_CMD(msg=1))

print("Status:", lp.LpStatus[prob.status])
print("Objective:",lp.value(prob.objective))

#----------------------------------------
# solution
#----------------------------------------

assigned = []
for i,j in merged.index:
  if lp.value(x[(i,j)]) > 0.5:
    assigned += [[i, j]]
assigned = pd.DataFrame(assigned, columns=['item','slot'])
print(assigned)

讨论

table merged 看起来像：

Item   Slot   Group
-----------------------
Item1   SlotA   Group1
        SlotB   Group1
Item2   SlotA   Group1
        SlotB   Group1
Item3   SlotA   Group1
        SlotB   Group1
Item4   SlotC   Group2
Item5   SlotC   Group2

Item,Slot 列为我们提供了允许的组合。

字典 statv 提供了一个方便的数据结构来访问统计贡献：

{('Item1', 'StatDark'): 3,
 ('Item1', 'StatLight'): 5,
 ('Item2', 'StatDark'): 7,
 ('Item2', 'StatLight'): 1,
 ('Item3', 'StatDark'): 2,
 ('Item3', 'StatLight'): 5,
 ('Item4', 'StatDark'): 1,
 ('Item4', 'StatLight'): 6,
 ('Item5', 'StatDark'): 2,
 ('Item5', 'StatLight'): 5}

生成的 MIP 模型如下所示：

assignItemsToSlots:
MAXIMIZE
-0.3*xstat_StatDark + 0.7*xstat_StatLight + 2.6999999999999997
SUBJECT TO
_C1: x_('Item1',_'SlotA') + x_('Item2',_'SlotA') + x_('Item3',_'SlotA') = 1

_C2: x_('Item1',_'SlotB') + x_('Item2',_'SlotB') + x_('Item3',_'SlotB') = 1

_C3: x_('Item4',_'SlotC') + x_('Item5',_'SlotC') = 1

_C4: xstat_StatDark >= 9

_C5: - 3 x_('Item1',_'SlotA') - 3 x_('Item1',_'SlotB')
 - 7 x_('Item2',_'SlotA') - 7 x_('Item2',_'SlotB') - 2 x_('Item3',_'SlotA')
 - 2 x_('Item3',_'SlotB') - x_('Item4',_'SlotC') - 2 x_('Item5',_'SlotC')
 + xstat_StatDark = 0

_C6: - 5 x_('Item1',_'SlotA') - 5 x_('Item1',_'SlotB') - x_('Item2',_'SlotA')
 - x_('Item2',_'SlotB') - 5 x_('Item3',_'SlotA') - 5 x_('Item3',_'SlotB')
 - 6 x_('Item4',_'SlotC') - 5 x_('Item5',_'SlotC') + xstat_StatLight = 0

VARIABLES
0 <= x_('Item1',_'SlotA') <= 1 Integer
0 <= x_('Item1',_'SlotB') <= 1 Integer
0 <= x_('Item2',_'SlotA') <= 1 Integer
0 <= x_('Item2',_'SlotB') <= 1 Integer
0 <= x_('Item3',_'SlotA') <= 1 Integer
0 <= x_('Item3',_'SlotB') <= 1 Integer
0 <= x_('Item4',_'SlotC') <= 1 Integer
0 <= x_('Item5',_'SlotC') <= 1 Integer
xstat_StatDark Continuous
xstat_StatLight Continuous

解决方案

解决方案如下：

Status: Optimal
Objective: 8.099999999999998
    item   slot
0  Item2  SlotB
1  Item3  SlotA
2  Item4  SlotC

当穷举搜索花费的时间太长时，我如何找到最佳组合？

How do I find an optimal combination when an exhaustive search would take way too long?

python

c#

algorithm

mathematical-optimization

linear-programming

数学模型

Python代码

讨论

解决方案