How can I minimize the distance from a given input distribution?
I have a list of customers, and each customer can be "activated" in one of three different ways:
import pandas as pd
import numpy as np

n = 1000
df = pd.DataFrame(range(n), columns=['Customer_ID'])
df['A'] = np.random.randint(2, size=n)  # 1 if the customer can be activated via A
df['B'] = np.random.randint(2, size=n)  # ... via B
df['C'] = np.random.randint(2, size=n)  # ... via C
Each customer can be activated via "A", "B", or "C", and only if the boolean flag for that activation type equals 1.
As input I have the target counts of final activations, e.g.:
Target_A = 500
Target_B = 250
Target_C = 250
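As a quick sanity check (a minimal sketch reusing the df built above), it helps to count how many customers are eligible per channel and compare against these targets:

print(df[['A', 'B', 'C']].sum())
# About 1/8 of customers have no eligible channel at all, so only ~875 of the
# 1000 customers can be activated on average; since the targets sum to n = 1000,
# the best achievable L1 distance is typically around 125 rather than 0.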
The random values in the code are the optimizer's input: they indicate whether it is possible to activate a given customer in a given way. How can I assign each customer to at most one activation type so that the final targets are respected? In other words, how can I minimize the distance between the actual activation counts and the input targets?
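Stated as an integer program (my own restatement of the above), with binary variables a_i, b_i, c_i marking which activation customer i receives:

minimize    |sum_i a_i - Target_A| + |sum_i b_i - Target_B| + |sum_i c_i - Target_C|
subject to  a_i + b_i + c_i <= 1                       (at most one activation each)
            a_i <= df['A'][i], b_i <= df['B'][i], c_i <= df['C'][i]    (eligibility)
            a_i, b_i, c_i in {0, 1}

Each absolute value |s - t| can be replaced by an auxiliary variable O with the two constraints O >= s - t and O >= t - s; since O is minimized, it settles at exactly |s - t|. That is the trick the attempt below uses.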
Do you have a tested example? I think this might work, but I'm not sure:
import pandas as pd
import numpy as np
from pulp import LpProblem, LpVariable, LpMinimize, lpSum

prob = LpProblem("problem", LpMinimize)

n = 1000
df = pd.DataFrame(range(n), columns=['Customer_ID'])
df['A'] = np.random.randint(2, size=n)
df['B'] = np.random.randint(2, size=n)
df['C'] = np.random.randint(2, size=n)

Target_A = 500
Target_B = 250
Target_C = 250

# One binary decision per customer and channel (PuLP's variable categories are
# 'Continuous', 'Integer' and 'Binary'; there is no 'Boolean' category).
A = LpVariable.dicts("A", range(n), cat='Binary')
B = LpVariable.dicts("B", range(n), cat='Binary')
C = LpVariable.dicts("C", range(n), cat='Binary')

# Auxiliary variables bounding |actual count - target| per channel.
O1 = LpVariable("O1", lowBound=0, cat='Integer')
O2 = LpVariable("O2", lowBound=0, cat='Integer')
O3 = LpVariable("O3", lowBound=0, cat='Integer')

# Objective: total L1 distance between actual counts and targets.
prob += O1 + O2 + O3

# Linearized absolute values. Note: lpSum(A) over the dict itself would sum
# its integer keys, so sum the variables via .values() instead.
sum_A = lpSum(A.values())
sum_B = lpSum(B.values())
sum_C = lpSum(C.values())
prob += O1 >= Target_A - sum_A
prob += O1 >= sum_A - Target_A
prob += O2 >= Target_B - sum_B
prob += O2 >= sum_B - Target_B
prob += O3 >= Target_C - sum_C
prob += O3 >= sum_C - Target_C

for idx in range(n):
    prob += A[idx] + B[idx] + C[idx] <= 1   # can't activate more than one way
    prob += A[idx] <= int(df['A'][idx])     # can't activate via A if not eligible
    prob += B[idx] <= int(df['B'][idx])
    prob += C[idx] <= int(df['C'][idx])

prob.solve()
print("difference:", prob.objective.value())