涉及矩阵运算和约束的优化帮助
Optimization help involving matrix operations and constraints
我在这方面还远远不够,所以我希望有人能给我指出正确的方向。我认为这是一个优化问题,但我对 scipy.optimize 以及它如何与纸浆相适应感到困惑。此外,矩阵数学让我感到困惑。因此,这个问题真的让我放慢了速度而不去问。
问题陈述: 我有一个客户数据集。对于每个客户,我可以选择 3 个选项,或者选择 none。所以4个选项。此外,对于每个客户,我都有一个数字分数,表示每个选择的 "good" 程度。您可以将此值想象为 the probability of the Choice to create a future sale
.
# fake data for the internet
data = {'customerid':[101,102,103,104,105,106,107,108,109,110],
'prob_CHOICEA':[0.00317,0.00629,0.00242,0.00253,0.00421,0.00414,0.00739,0.00549,0.00658,0.00852],
'prob_CHOICEB':[0.061,0.087,0.055,0.027,0.022,0.094,0.099,0.072,0.018,0.052],
'prob_CHOICEC':[0.024,0.013,0.091,0.047,0.071,0.077,0.067,0.046,0.077,0.044]
}
# Creates pandas DataFrame
df = pd.DataFrame(data)
df = df.reset_index(drop=True).set_index(['customerid'])
+------------+--------------+--------------+--------------+
| customerid | prob_CHOICEA | prob_CHOICEB | prob_CHOICEC |
+------------+--------------+--------------+--------------+
| 101 | 0.00317 | 0.061 | 0.024 |
| 102 | 0.00629 | 0.087 | 0.013 |
| 103 | 0.00242 | 0.055 | 0.091 |
| 104 | 0.00253 | 0.027 | 0.047 |
| 105 | 0.00421 | 0.022 | 0.071 |
| 106 | 0.00414 | 0.094 | 0.077 |
| 107 | 0.00739 | 0.099 | 0.067 |
| 108 | 0.00549 | 0.072 | 0.046 |
| 109 | 0.00658 | 0.018 | 0.077 |
| 110 | 0.00852 | 0.052 | 0.044 |
+------------+--------------+--------------+--------------+
我首先将这些元素组合成每个客户的单个数组:
# combine all values into 1 array
list_to_combine = ['prob_CHOICEA', 'prob_CHOICEB','prob_CHOICEC']
df['probs_A_B_C']= df[list_to_combine].values.tolist()
df.drop(list_to_combine, axis=1, inplace=True)
+------------+-------------------------+
| customerid | probs_A_B_C |
+------------+-------------------------+
| 101 | [0.00317, 0.061, 0.024] |
| 102 | [0.00629, 0.087, 0.013] |
| 103 | [0.00242, 0.055, 0.091] |
| 104 | [0.00253, 0.027, 0.047] |
| 105 | [0.00421, 0.022, 0.071] |
| 106 | [0.00414, 0.094, 0.077] |
| 107 | [0.00739, 0.099, 0.067] |
| 108 | [0.00549, 0.072, 0.046] |
| 109 | [0.00658, 0.018, 0.077] |
| 110 | [0.00852, 0.052, 0.044] |
+------------+-------------------------+
对于每个客户,我只有 4 个选择:
choices = [
[0,0,0],
[1,0,0],
[0,1,0],
[0,0,1]
]
对于每个客户,我想为每个客户选择最好的选择。乍一看这很简单——只需选择最大的数字即可。然而,一旦我开始添加约束,它就开始让我大吃一惊。
例如,如果我想为每个客户选择最佳选择,但限制条件是所选选择的总和 = 5
+------------+-------------------------+-------------+
| customerid | probs_A_B_C | best_choice |
+------------+-------------------------+-------------+
| 101 | [0.00317, 0.061, 0.024] | [0,0,0] |
| 102 | [0.00629, 0.087, 0.013] | [0,1,0] |
| 103 | [0.00242, 0.055, 0.091] | [0,0,1] |
| 104 | [0.00253, 0.027, 0.047] | [0,0,0] |
| 105 | [0.00421, 0.022, 0.071] | [0,0,0] |
| 106 | [0.00414, 0.094, 0.077] | [0,1,0] |
| 107 | [0.00739, 0.099, 0.067] | [0,1,0] |
| 108 | [0.00549, 0.072, 0.046] | [0,0,0] |
| 109 | [0.00658, 0.018, 0.077] | [0,0,1] |
| 110 | [0.00852, 0.052, 0.044] | [0,0,0] |
+------------+-------------------------+-------------+
我什至没有弄清楚如何做到这一点,我只是为了说明目的手动观察它。
理想情况下,我想同时添加多个约束:
- 总和 best_choice = N
- CHOICEA的总和(best_choice的第一个元素)>= M
- CHOICEB总和(best_choice的第二个元素)<=10
知道从哪里开始吗?
您可以使用scipy.optimize.linprog
来解决这个线性优化问题。它需要将边界条件设置为矩阵乘积,如文档中所述。有两种类型的边界条件,A @ x <= b
形式的不等式和 A @ x == b
形式的不等式。该问题可以建模如下:
- 生成的向量
x
的长度为 N*C
,其中 N
是客户数量,C
是选项数量;它代表线性布局中每个自定义的选择:[c1_A, c1_B, c1_C, c2_A, c2_B, c2_C, ..., cN_A, cN_B, cN_C]
.
- 由于每个客户最多可以做出一个选择,因此我们对每个客户都有一个不等式,它将所有相应的选择相加,即一个矩阵,其中行代表客户,列代表所有选择。如果选择对应于客户,则矩阵有条目
1
,否则为零(插图见下图)。
- 选项A至少要选M次;因为我们只有
A @ x <= b
形式的不等式,所以我们可以反转值并使用 A
中对应于选项 A 和 b
中的 -M
的条目 -1
。
- 选项B必须选择不超过10次;这可以通过使用
1
和正 10
的条目(因为它已经是 <=
)的形式来建模,类似于之前的约束。
- 所有选项的总和必须是
N
。这可以通过等式约束来建模,其中矩阵对 x
中的所有选择求和,结果必须等于 N
.
这是对上述约束的说明:
# Max. one choice per customer.
# A = # b =
[[1, 1, 1, 0, 0, 0, ..., 0, 0, 0], [1,
[0, 0, 0, 1, 1, 1, ..., 0, 0, 0], 1,
... ...
[0, 0, 0, 0, 0, 0, ..., 1, 1, 1]] 1]
# Min. M choices for option A.
# A = # b =
[[-1, 0, 0, -1, 0, 0, ..., -1, 0, 0]] [[-M]]
# Max. 10 choices for option B.
# A = # b =
[[0, 1, 0, 0, 1, 0, ..., 0, 1, 0]] [[10]]
# Total number of choices equals N.
# A = # b =
[[1, 1, 1, 1, 1, 1, ..., 1, 1, 1]] [[N]]
下面是一些设置约束和运行优化的示例代码:
import numpy as np
import pandas as pd
from scipy.optimize import linprog
data = {'customerid':[101,102,103,104,105,106,107,108,109,110],
'prob_CHOICEA':[0.00317,0.00629,0.00242,0.00253,0.00421,0.00414,0.00739,0.00549,0.00658,0.00852],
'prob_CHOICEB':[0.061,0.087,0.055,0.027,0.022,0.094,0.099,0.072,0.018,0.052],
'prob_CHOICEC':[0.024,0.013,0.091,0.047,0.071,0.077,0.067,0.046,0.077,0.044]
}
# Creates pandas DataFrame
df = pd.DataFrame(data)
df = df.reset_index(drop=True).set_index(['customerid'])
print(df, end='\n\n')
nc = df.shape[1] # number of options
data = df.to_numpy().ravel()
# Max. choices per customer is 1.
A_ub_1 = np.zeros((len(df), len(data)))
for i in range(len(A_ub_1)):
A_ub_1[i, nc*i:nc*(i+1)] = 1
b_ub_1 = np.ones(len(df))
# Min. choices for option A is 3.
A_ub_2 = np.zeros((1, len(data)))
A_ub_2[0, ::nc] = -1 # invert, since this defines an upper boundary
b_ub_2 = np.array([-3])
# Max. choices for option B is 2.
A_ub_3 = np.zeros((1, len(data)))
A_ub_3[0, 1::nc] = 1
b_ub_3 = np.array([2])
# Total sum of choices is 7.
A_eq = np.ones((1, len(data)))
b_eq = np.array([7])
result = linprog(
-1 * data, # linprog aims to minimize the value
A_eq=A_eq, b_eq=b_eq,
A_ub=np.concatenate((A_ub_1, A_ub_2, A_ub_3), axis=0),
b_ub=np.concatenate((b_ub_1, b_ub_2, b_ub_3), axis=0),
bounds=(0, 1)
)
print(result, end='\n\n')
choices = (result.x.reshape(-1, 3) > 1e-6).astype(int)
print('Choices:', choices, sep='\n')
它产生以下结果:
prob_CHOICEA prob_CHOICEB prob_CHOICEC
customerid
101 0.00317 0.061 0.024
102 0.00629 0.087 0.013
103 0.00242 0.055 0.091
104 0.00253 0.027 0.047
105 0.00421 0.022 0.071
106 0.00414 0.094 0.077
107 0.00739 0.099 0.067
108 0.00549 0.072 0.046
109 0.00658 0.018 0.077
110 0.00852 0.052 0.044
con: array([-1.30002675e-11])
fun: -0.3812999999903971
message: 'Optimization terminated successfully.'
nit: 7
slack: array([1.00000000e+00, 7.99305067e-11, 1.47325485e-11, 1.00000000e+00,
1.00000000e+00, 2.49527066e-11, 2.42738052e-11, 5.84235438e-10,
4.23596713e-11, 5.77714543e-11, 8.80984175e-12, 1.46305190e-11])
status: 0
success: True
x: array([2.89971936e-10, 1.32732722e-11, 6.97732845e-12, 1.00000000e+00,
3.28055311e-10, 5.72702383e-12, 1.80418885e-11, 4.61391860e-12,
1.00000000e+00, 2.01674011e-10, 4.58311340e-12, 1.29599793e-11,
2.95298295e-10, 4.34109315e-12, 1.21776975e-11, 3.39951283e-11,
1.00000000e+00, 2.55262044e-10, 4.94703751e-11, 1.00000000e+00,
1.57932544e-11, 9.99999999e-01, 2.21487598e-11, 1.33679145e-11,
2.30514296e-10, 3.91129933e-12, 1.00000000e+00, 1.00000000e+00,
8.19015577e-12, 1.07293976e-11])
Choices:
[[0 0 0]
[1 0 0]
[0 0 1]
[0 0 0]
[0 0 0]
[0 1 0]
[0 1 0]
[1 0 0]
[0 0 1]
[1 0 0]]
这个问题可以用线性规划 (LP) 来解决,但最困难的部分是不知道你应该使用 LP,它是将你的问题转化为 LP-optimization
问题,我会告诉你如何做到这一点。在继续之前,为了简化目的,我将更改您提供的示例数据(因为生成的变量数量巨大),因此,假设我们有以下输入数据:
+------------+---------------------+
| customerid | prob A | prob B |
+------------+---------------------+
| 101 | 0.00317 | 0.061 |
| 102 | 0.00629 | 0.087 |
+------------+---------------------+
假设输入问题的规模为N,其中N代表选择数:
+---------------------+
| prob A | prob B |
+---------------------+
| 0.00317 | 0.061 |
| 0.00629 | 0.087 |
+------------+--------+
由于我们有4种不同的选择,N=4
(其中一些相互排斥并不重要,这样的特征将被约束映射)。在LP中我们有以下事情要处理:
- 一个objective函数C(维度
1x[at least N]
,它是一个行数组),
- 约束矩阵
A
(维度取决于您要添加多少限制,您的限制也可能比变量更多)和
- 右 边(我们称之为
b
,它的维度是 [number of rows in A]x1
,它是一个 列数组)。
因此,LP 最大化问题将具有以下格式:
Max Cx
subject to:
Ax <= b
x >= 0
请注意,从现在开始,我们将创建 LP 变量来表示我们拥有的输入数据,因此假设 xi
和 input data
之间的映射如下:
+-------------------------------+
| LP variable | Customer | prob |
+-------------------------------+
| x0 | 101 | A |
| x1 | 101 | B |
| x2 | 102 | A |
| x3 | 102 | B |
+-------------------------------+
让我们从并行填充我们的约束矩阵 A
和 RHS b
开始,我们应该创建一个由 two NxN
identity matrices 的列串联形成的矩阵]:
A
+-----------------------------------------------------+
| x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | b
+-----------------------------------------------------+ +-----+
row 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | | 1 |
row 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | | 1 |
row 2 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | | 1 |
row 3 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | | 1 |
+-----------------------------------------------------+ +-----+
我们还需要确保每个客户最多选择一个变量(我们输入数据的行),因此我们还为每个客户创建一个额外的变量,在本例中,x8
和 x9
,并在我们将在 A 上创建的相应新 2
行上将它们设置为 1
。此外,新行在映射到每个客户的变量中也必须有 1(简单地看一下所需客户中存在变量的位置)。所以我们将以下 2
行添加到矩阵 A
和列数组 b
:
A
+------------------------------------------------------------------+
| x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | b
+------------------------------------------------------------------+ +-----+
row 4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | | 1 |
row 5 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | | 1 |
+------------------------------------------------------------------+ +-----+
现在 A 变成:
A
+------------------------------------------------------------------+
| x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | b
+------------------------------------------------------------------+ +-----+
row 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | | 1 |
row 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | | 1 |
row 2 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | | 1 |
row 3 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | | 1 |
row 4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | | 1 |
row 5 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | | 1 |
+------------------------------------------------------------------+ +-----+
假设我们还想添加一个约束,保证总共最多选择2
个概率,那么我们将第6行第x10
列添加到A中,设置为1个变量从 x0 到 x3 以及 x10
:
A
+------------------------------------------------------------------------+
| x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 | b
+------------------------------------------------------------------------+ +-----+
row 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | | 1 |
row 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | | 1 |
row 2 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | | 1 |
row 3 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | | 1 |
row 4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | | 1 |
row 5 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | | 1 |
row 6 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | | 2 |
+------------------------------------------------------------------------+ +-----+
请注意,在这个简单的示例中,将选项数量限制为最多 2 个不会影响最终结果。
最后我们构建 objective 函数:
C
+-----------------------------------------------------------------------------------+
| x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
+-----------------------------------------------------------------------------------+
| 0.00317 | 0.061 | 0.00629 | 0.087 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+-----------------------------------------------------------------------------------+
创建但没有映射到客户输入数据的变量称为松弛变量,它们的目的是正确构建 LP 问题背后的数学。
既然您知道如何将您的问题建模为 LP 优化问题,您必须选择一种方法来解决问题。我推荐单纯形算法,您可以在 scipy.
找到它
在 运行 您的首选求解器之后,您必须解释输出结果。结果应该是包含每个 xi 值的单行数组。我给出的上述示例的输出为:
+------------------------------------------------------------------+
| x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
+------------------------------------------------------------------+
| 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+------------------------------------------------------------------+
上面的结果意味着你应该选择由变量x1和x3表示的元素,因为它们被设置为1,i。例如,客户 101 选择了概率 B,客户 102 也选择了概率 B。
Post 脚本:
- 使用
scipy.optimze.linprog
库完成工作后,如果使用上述建模,请确保使用参数 "Aeq" 而不是 "Aub" 作为约束;
- 我没有深入研究这个特定问题背后的数学来证明这一点,但是,由于可以从这个问题构建的约束的性质,似乎整数 LP 永远不是必须的;
- objective 函数 C 的系数可以取任何实数,包括负数和 0;和
- 我推荐了 scipy 的 LP 工具,因为我以前使用过它并且效果很好,尽管如此,还有其他免费的很棒的实现可用,比如
glpk
,它可能提供更高级的功能解决您的问题的任何进一步需求的工具。
我在这方面还远远不够,所以我希望有人能给我指出正确的方向。我认为这是一个优化问题,但我对 scipy.optimize 以及它如何与纸浆相适应感到困惑。此外,矩阵数学让我感到困惑。因此,这个问题真的让我放慢了速度而不去问。
问题陈述: 我有一个客户数据集。对于每个客户,我可以选择 3 个选项,或者选择 none。所以4个选项。此外,对于每个客户,我都有一个数字分数,表示每个选择的 "good" 程度。您可以将此值想象为 the probability of the Choice to create a future sale
.
# fake data for the internet
data = {'customerid':[101,102,103,104,105,106,107,108,109,110],
'prob_CHOICEA':[0.00317,0.00629,0.00242,0.00253,0.00421,0.00414,0.00739,0.00549,0.00658,0.00852],
'prob_CHOICEB':[0.061,0.087,0.055,0.027,0.022,0.094,0.099,0.072,0.018,0.052],
'prob_CHOICEC':[0.024,0.013,0.091,0.047,0.071,0.077,0.067,0.046,0.077,0.044]
}
# Creates pandas DataFrame
df = pd.DataFrame(data)
df = df.reset_index(drop=True).set_index(['customerid'])
+------------+--------------+--------------+--------------+
| customerid | prob_CHOICEA | prob_CHOICEB | prob_CHOICEC |
+------------+--------------+--------------+--------------+
| 101 | 0.00317 | 0.061 | 0.024 |
| 102 | 0.00629 | 0.087 | 0.013 |
| 103 | 0.00242 | 0.055 | 0.091 |
| 104 | 0.00253 | 0.027 | 0.047 |
| 105 | 0.00421 | 0.022 | 0.071 |
| 106 | 0.00414 | 0.094 | 0.077 |
| 107 | 0.00739 | 0.099 | 0.067 |
| 108 | 0.00549 | 0.072 | 0.046 |
| 109 | 0.00658 | 0.018 | 0.077 |
| 110 | 0.00852 | 0.052 | 0.044 |
+------------+--------------+--------------+--------------+
我首先将这些元素组合成每个客户的单个数组:
# combine all values into 1 array
list_to_combine = ['prob_CHOICEA', 'prob_CHOICEB','prob_CHOICEC']
df['probs_A_B_C']= df[list_to_combine].values.tolist()
df.drop(list_to_combine, axis=1, inplace=True)
+------------+-------------------------+
| customerid | probs_A_B_C |
+------------+-------------------------+
| 101 | [0.00317, 0.061, 0.024] |
| 102 | [0.00629, 0.087, 0.013] |
| 103 | [0.00242, 0.055, 0.091] |
| 104 | [0.00253, 0.027, 0.047] |
| 105 | [0.00421, 0.022, 0.071] |
| 106 | [0.00414, 0.094, 0.077] |
| 107 | [0.00739, 0.099, 0.067] |
| 108 | [0.00549, 0.072, 0.046] |
| 109 | [0.00658, 0.018, 0.077] |
| 110 | [0.00852, 0.052, 0.044] |
+------------+-------------------------+
对于每个客户,我只有 4 个选择:
choices = [
[0,0,0],
[1,0,0],
[0,1,0],
[0,0,1]
]
对于每个客户,我想为每个客户选择最好的选择。乍一看这很简单——只需选择最大的数字即可。然而,一旦我开始添加约束,它就开始让我大吃一惊。
例如,如果我想为每个客户选择最佳选择,但限制条件是所选选择的总和 = 5
+------------+-------------------------+-------------+
| customerid | probs_A_B_C | best_choice |
+------------+-------------------------+-------------+
| 101 | [0.00317, 0.061, 0.024] | [0,0,0] |
| 102 | [0.00629, 0.087, 0.013] | [0,1,0] |
| 103 | [0.00242, 0.055, 0.091] | [0,0,1] |
| 104 | [0.00253, 0.027, 0.047] | [0,0,0] |
| 105 | [0.00421, 0.022, 0.071] | [0,0,0] |
| 106 | [0.00414, 0.094, 0.077] | [0,1,0] |
| 107 | [0.00739, 0.099, 0.067] | [0,1,0] |
| 108 | [0.00549, 0.072, 0.046] | [0,0,0] |
| 109 | [0.00658, 0.018, 0.077] | [0,0,1] |
| 110 | [0.00852, 0.052, 0.044] | [0,0,0] |
+------------+-------------------------+-------------+
我什至没有弄清楚如何做到这一点,我只是为了说明目的手动观察它。
理想情况下,我想同时添加多个约束:
- 总和 best_choice = N
- CHOICEA的总和(best_choice的第一个元素)>= M
- CHOICEB总和(best_choice的第二个元素)<=10
知道从哪里开始吗?
您可以使用scipy.optimize.linprog
来解决这个线性优化问题。它需要将边界条件设置为矩阵乘积,如文档中所述。有两种类型的边界条件,A @ x <= b
形式的不等式和 A @ x == b
形式的不等式。该问题可以建模如下:
- 生成的向量
x
的长度为N*C
,其中N
是客户数量,C
是选项数量;它代表线性布局中每个自定义的选择:[c1_A, c1_B, c1_C, c2_A, c2_B, c2_C, ..., cN_A, cN_B, cN_C]
. - 由于每个客户最多可以做出一个选择,因此我们对每个客户都有一个不等式,它将所有相应的选择相加,即一个矩阵,其中行代表客户,列代表所有选择。如果选择对应于客户,则矩阵有条目
1
,否则为零(插图见下图)。 - 选项A至少要选M次;因为我们只有
A @ x <= b
形式的不等式,所以我们可以反转值并使用A
中对应于选项 A 和b
中的-M
的条目-1
。 - 选项B必须选择不超过10次;这可以通过使用
1
和正10
的条目(因为它已经是<=
)的形式来建模,类似于之前的约束。 - 所有选项的总和必须是
N
。这可以通过等式约束来建模,其中矩阵对x
中的所有选择求和,结果必须等于N
.
这是对上述约束的说明:
# Max. one choice per customer.
# A = # b =
[[1, 1, 1, 0, 0, 0, ..., 0, 0, 0], [1,
[0, 0, 0, 1, 1, 1, ..., 0, 0, 0], 1,
... ...
[0, 0, 0, 0, 0, 0, ..., 1, 1, 1]] 1]
# Min. M choices for option A.
# A = # b =
[[-1, 0, 0, -1, 0, 0, ..., -1, 0, 0]] [[-M]]
# Max. 10 choices for option B.
# A = # b =
[[0, 1, 0, 0, 1, 0, ..., 0, 1, 0]] [[10]]
# Total number of choices equals N.
# A = # b =
[[1, 1, 1, 1, 1, 1, ..., 1, 1, 1]] [[N]]
下面是一些设置约束和运行优化的示例代码:
import numpy as np
import pandas as pd
from scipy.optimize import linprog
data = {'customerid':[101,102,103,104,105,106,107,108,109,110],
'prob_CHOICEA':[0.00317,0.00629,0.00242,0.00253,0.00421,0.00414,0.00739,0.00549,0.00658,0.00852],
'prob_CHOICEB':[0.061,0.087,0.055,0.027,0.022,0.094,0.099,0.072,0.018,0.052],
'prob_CHOICEC':[0.024,0.013,0.091,0.047,0.071,0.077,0.067,0.046,0.077,0.044]
}
# Creates pandas DataFrame
df = pd.DataFrame(data)
df = df.reset_index(drop=True).set_index(['customerid'])
print(df, end='\n\n')
nc = df.shape[1] # number of options
data = df.to_numpy().ravel()
# Max. choices per customer is 1.
A_ub_1 = np.zeros((len(df), len(data)))
for i in range(len(A_ub_1)):
A_ub_1[i, nc*i:nc*(i+1)] = 1
b_ub_1 = np.ones(len(df))
# Min. choices for option A is 3.
A_ub_2 = np.zeros((1, len(data)))
A_ub_2[0, ::nc] = -1 # invert, since this defines an upper boundary
b_ub_2 = np.array([-3])
# Max. choices for option B is 2.
A_ub_3 = np.zeros((1, len(data)))
A_ub_3[0, 1::nc] = 1
b_ub_3 = np.array([2])
# Total sum of choices is 7.
A_eq = np.ones((1, len(data)))
b_eq = np.array([7])
result = linprog(
-1 * data, # linprog aims to minimize the value
A_eq=A_eq, b_eq=b_eq,
A_ub=np.concatenate((A_ub_1, A_ub_2, A_ub_3), axis=0),
b_ub=np.concatenate((b_ub_1, b_ub_2, b_ub_3), axis=0),
bounds=(0, 1)
)
print(result, end='\n\n')
choices = (result.x.reshape(-1, 3) > 1e-6).astype(int)
print('Choices:', choices, sep='\n')
它产生以下结果:
prob_CHOICEA prob_CHOICEB prob_CHOICEC
customerid
101 0.00317 0.061 0.024
102 0.00629 0.087 0.013
103 0.00242 0.055 0.091
104 0.00253 0.027 0.047
105 0.00421 0.022 0.071
106 0.00414 0.094 0.077
107 0.00739 0.099 0.067
108 0.00549 0.072 0.046
109 0.00658 0.018 0.077
110 0.00852 0.052 0.044
con: array([-1.30002675e-11])
fun: -0.3812999999903971
message: 'Optimization terminated successfully.'
nit: 7
slack: array([1.00000000e+00, 7.99305067e-11, 1.47325485e-11, 1.00000000e+00,
1.00000000e+00, 2.49527066e-11, 2.42738052e-11, 5.84235438e-10,
4.23596713e-11, 5.77714543e-11, 8.80984175e-12, 1.46305190e-11])
status: 0
success: True
x: array([2.89971936e-10, 1.32732722e-11, 6.97732845e-12, 1.00000000e+00,
3.28055311e-10, 5.72702383e-12, 1.80418885e-11, 4.61391860e-12,
1.00000000e+00, 2.01674011e-10, 4.58311340e-12, 1.29599793e-11,
2.95298295e-10, 4.34109315e-12, 1.21776975e-11, 3.39951283e-11,
1.00000000e+00, 2.55262044e-10, 4.94703751e-11, 1.00000000e+00,
1.57932544e-11, 9.99999999e-01, 2.21487598e-11, 1.33679145e-11,
2.30514296e-10, 3.91129933e-12, 1.00000000e+00, 1.00000000e+00,
8.19015577e-12, 1.07293976e-11])
Choices:
[[0 0 0]
[1 0 0]
[0 0 1]
[0 0 0]
[0 0 0]
[0 1 0]
[0 1 0]
[1 0 0]
[0 0 1]
[1 0 0]]
这个问题可以用线性规划 (LP) 来解决,但最困难的部分是不知道你应该使用 LP,它是将你的问题转化为 LP-optimization
问题,我会告诉你如何做到这一点。在继续之前,为了简化目的,我将更改您提供的示例数据(因为生成的变量数量巨大),因此,假设我们有以下输入数据:
+------------+---------------------+
| customerid | prob A | prob B |
+------------+---------------------+
| 101 | 0.00317 | 0.061 |
| 102 | 0.00629 | 0.087 |
+------------+---------------------+
假设输入问题的规模为N,其中N代表选择数:
+---------------------+
| prob A | prob B |
+---------------------+
| 0.00317 | 0.061 |
| 0.00629 | 0.087 |
+------------+--------+
由于我们有4种不同的选择,N=4
(其中一些相互排斥并不重要,这样的特征将被约束映射)。在LP中我们有以下事情要处理:
- 一个objective函数C(维度
1x[at least N]
,它是一个行数组), - 约束矩阵
A
(维度取决于您要添加多少限制,您的限制也可能比变量更多)和 - 右 边(我们称之为
b
,它的维度是[number of rows in A]x1
,它是一个 列数组)。
因此,LP 最大化问题将具有以下格式:
Max Cx
subject to:
Ax <= b
x >= 0
请注意,从现在开始,我们将创建 LP 变量来表示我们拥有的输入数据,因此假设 xi
和 input data
之间的映射如下:
+-------------------------------+
| LP variable | Customer | prob |
+-------------------------------+
| x0 | 101 | A |
| x1 | 101 | B |
| x2 | 102 | A |
| x3 | 102 | B |
+-------------------------------+
让我们从并行填充我们的约束矩阵 A
和 RHS b
开始,我们应该创建一个由 two NxN
identity matrices 的列串联形成的矩阵]:
A
+-----------------------------------------------------+
| x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | b
+-----------------------------------------------------+ +-----+
row 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | | 1 |
row 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | | 1 |
row 2 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | | 1 |
row 3 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | | 1 |
+-----------------------------------------------------+ +-----+
我们还需要确保每个客户最多选择一个变量(我们输入数据的行),因此我们还为每个客户创建一个额外的变量,在本例中,x8
和 x9
,并在我们将在 A 上创建的相应新 2
行上将它们设置为 1
。此外,新行在映射到每个客户的变量中也必须有 1(简单地看一下所需客户中存在变量的位置)。所以我们将以下 2
行添加到矩阵 A
和列数组 b
:
A
+------------------------------------------------------------------+
| x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | b
+------------------------------------------------------------------+ +-----+
row 4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | | 1 |
row 5 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | | 1 |
+------------------------------------------------------------------+ +-----+
现在 A 变成:
A
+------------------------------------------------------------------+
| x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | b
+------------------------------------------------------------------+ +-----+
row 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | | 1 |
row 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | | 1 |
row 2 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | | 1 |
row 3 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | | 1 |
row 4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | | 1 |
row 5 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | | 1 |
+------------------------------------------------------------------+ +-----+
假设我们还想添加一个约束,保证总共最多选择2
个概率,那么我们将第6行第x10
列添加到A中,设置为1个变量从 x0 到 x3 以及 x10
:
A
+------------------------------------------------------------------------+
| x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 | b
+------------------------------------------------------------------------+ +-----+
row 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | | 1 |
row 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | | 1 |
row 2 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | | 1 |
row 3 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | | 1 |
row 4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | | 1 |
row 5 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | | 1 |
row 6 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | | 2 |
+------------------------------------------------------------------------+ +-----+
请注意,在这个简单的示例中,将选项数量限制为最多 2 个不会影响最终结果。
最后我们构建 objective 函数:
C
+-----------------------------------------------------------------------------------+
| x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
+-----------------------------------------------------------------------------------+
| 0.00317 | 0.061 | 0.00629 | 0.087 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+-----------------------------------------------------------------------------------+
创建但没有映射到客户输入数据的变量称为松弛变量,它们的目的是正确构建 LP 问题背后的数学。
既然您知道如何将您的问题建模为 LP 优化问题,您必须选择一种方法来解决问题。我推荐单纯形算法,您可以在 scipy.
找到它在 运行 您的首选求解器之后,您必须解释输出结果。结果应该是包含每个 xi 值的单行数组。我给出的上述示例的输出为:
+------------------------------------------------------------------+
| x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
+------------------------------------------------------------------+
| 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+------------------------------------------------------------------+
上面的结果意味着你应该选择由变量x1和x3表示的元素,因为它们被设置为1,i。例如,客户 101 选择了概率 B,客户 102 也选择了概率 B。
Post 脚本:
- 使用
scipy.optimze.linprog
库完成工作后,如果使用上述建模,请确保使用参数 "Aeq" 而不是 "Aub" 作为约束; - 我没有深入研究这个特定问题背后的数学来证明这一点,但是,由于可以从这个问题构建的约束的性质,似乎整数 LP 永远不是必须的;
- objective 函数 C 的系数可以取任何实数,包括负数和 0;和
- 我推荐了 scipy 的 LP 工具,因为我以前使用过它并且效果很好,尽管如此,还有其他免费的很棒的实现可用,比如
glpk
,它可能提供更高级的功能解决您的问题的任何进一步需求的工具。