具有线性约束的非凸优化
Non-convex optimization with linear constraints
我正在尝试解决类似于下面描述的玩具示例的优化问题。如评论中所述,我当前使用 scipy 的实现速度太慢,而且似乎没有收敛。我怎样才能得到一个体面的解决方案?您可以使用 scipy、mystic 或您认为合适的任何库。
请注意,我不需要全局最小值,搜索可以在 loss(X) <= 1
后立即停止。现实世界中的 objective 主要是用 SQL 编写的,因此速度慢得离谱,所以我也希望优化在 loss
被计算 ~200 次后终止。 (这不是硬性要求,您也可以在优化 运行 持续 5 分钟后终止。)
虽然这个问题类似于 ,但绝对不是重复的。这两个问题的处理方式甚至都不相同 objective.
import numpy as np
from scipy.optimize import differential_evolution, LinearConstraint
# Inhabitants in a rural city are voting for the name of a newborn. The city houses 40 dwarves
# and 10 humans in total, and there are 100 candidate names to vote for.
dwarf_population, human_population = 40, 10
population = dwarf_population + human_population
candidate_count = 100
# Each inhabitant has different number of votes in their hand.
scores_per_citizen = np.random.randint(15, 20, population)
# Let's say the original result looks like this. (Yes, one can cast a fraction of their votes)
alpha = np.abs(np.random.normal(size=candidate_count))
scores = np.diag(scores_per_citizen).dot(np.random.dirichlet(alpha, population))
assert np.allclose(scores.sum(1), scores_per_citizen)
# Here is how the votes are counted.
def count(scores_: np.ndarray) -> np.ndarray:
# Dwarves have a weird tradition: the top 10 popular names chosen by dwarves will get all their votes.
# (I guess this is what makes our objective non-convex.)
scores_by_dwarves = scores_[0:40, :]
score_per_candidate_by_dwarves_raw = scores_by_dwarves.sum(1)
top_10_popular_name_indices = np.argsort(-score_per_candidate_by_dwarves_raw)[:10]
score_per_candidate_by_dwarves = np.zeros(candidate_count)
score_per_candidate_by_dwarves[top_10_popular_name_indices] = score_per_candidate_by_dwarves_raw[
top_10_popular_name_indices]
score_per_candidate_by_dwarves = scores_by_dwarves.sum() \
* score_per_candidate_by_dwarves \
/ score_per_candidate_by_dwarves.sum()
assert np.allclose(score_per_candidate_by_dwarves.sum(), score_per_candidate_by_dwarves_raw.sum())
# Humans, on the other hand, just adds everyone's votes together.
scores_by_humans = scores_[40:, :]
scores_per_candidate_by_humans = scores_by_humans.sum(0)
# The final result is the sum of the scores by dwarves and humans.
return score_per_candidate_by_dwarves + scores_per_candidate_by_humans
# So this is the legitimate result.
scores_per_candidate = count(scores)
# Now, you want to cheat a bit and make your proposal (assuming it's the first one) the most popular one.
target_scores_per_candidate = scores_per_candidate.copy()
argmax = scores_per_candidate.argmax()
target_scores_per_candidate[argmax] = scores_per_candidate[0]
target_scores_per_candidate[0] = scores_per_candidate[argmax]
assert np.allclose(scores_per_candidate.sum(), target_scores_per_candidate.sum())
# However, you cannot just manipulate the result, otherwise the auditors will find out!
# Instead, the raw scores must be manipulated such that your submission ranks top among others.
# You decide to solve for a multiplier to the original scores.
init_coef = np.ones_like(scores).reshape(-1)
# In other words, your goal is to find the minimum of the following objective.
def loss(coef_: np.ndarray) -> float:
scores_ = scores * coef_.reshape(scores.shape)
scores_per_candidate_ = count(scores_)
return ((scores_per_candidate_ - scores_per_candidate) ** 2).sum()
# This is a helper constant matrix. Ignore it for now.
A = np.concatenate([np.tile(np.concatenate([np.repeat(1, candidate_count),
np.repeat(0, population * candidate_count)]),
population - 1),
np.repeat(1, candidate_count)])
A = A.reshape((population, population * candidate_count))
# Meanwhile, some constraints must hold.
def constraints(coef_: np.ndarray):
# The total votes of each citizen must not change.
coef_reshaped = coef_.reshape((population, candidate_count))
assert np.allclose((coef_reshaped * scores).sum(1), scores_per_citizen)
# Another way to express the same thing with matrices.
assert np.allclose(A.dot(np.diag(scores.reshape(-1))).dot(coef_), scores_per_citizen)
# Additionally, all scores must be positive.
assert np.all(coef_reshaped * scores >= 0)
# Let's express the constraint with a LinearConstraint.
score_match_quota = LinearConstraint(A.dot(np.diag(scores.reshape(-1))), scores_per_citizen, scores_per_citizen)
# Run optimization (Spoiler: this is extremely slow, and doesn't seem to converge)
rv = differential_evolution(loss,
bounds=[(0, 1000)] * init_coef.size, # the 1000 here is superficial and can be replaced by other large numbers
init=np.vstack((init_coef, init_coef, init_coef, init_coef, init_coef)),
constraints=score_match_quota)
# Sanity check
constraints(rv.x)
答案与您提到的问题几乎相同...但是,我必须进一步考虑一些限制才能证明这一点。让我重写您的代码 -- 只是使用一些较短的名称。
>>> import mystic as my
>>> import numpy as np
>>>
>>> # Inhabitants in a rural city are voting for the name of a newborn.
... # The city houses 40 dwarves and 10 humans in total, and there are
... # 100 candidate names to vote for.
... dwarves, humans = 40, 10
>>> citizens = dwarves + humans
>>> names = 100
>>> # Each inhabitant has different number of votes in their hand.
... votes_per_citizen = np.random.randint(15, 20, citizens)
>>>
>>> # Let's say the original result looks like this.
... # (Yes, one can cast a fraction of their votes)
... alpha = np.abs(np.random.normal(size=names))
>>> votes = np.diag(votes_per_citizen).dot(np.random.dirichlet(alpha, citizens))
>>> # NOTE: assert np.allclose(votes.sum(1), votes_per_citizen)
...
>>> # Here is how the votes are counted.
... def count(votes): #NOTE: votes.shape is (citizens, names)
... # Dwarves have a weird tradition: the top 10 popular names chosen
... # by dwarves will get all their votes.
... # (I guess this is what makes our objective non-convex.)
... dwarf_votes = votes[:dwarves]
... dwarf_votes_per_name_ = dwarf_votes.sum(1)
... top_10_idx = np.argsort(-dwarf_votes_per_name_)[:10]
... dwarf_votes_per_name = np.zeros(names)
... dwarf_votes_per_name[top_10_idx] = dwarf_votes_per_name_[top_10_idx]
... dwarf_votes_per_name = \
... dwarf_votes.sum() * dwarf_votes_per_name / dwarf_votes_per_name.sum()
... #NOTE: assert np.allclose(dwarf_votes_per_name.sum(), dwarf_votes_per_name_.sum())
... # Humans, on the other hand, just add everyone's votes together.
... human_votes = votes[dwarves:]
... human_votes_per_name = human_votes.sum(0)
... # The final result is the sum of the scores by dwarves and humans.
... return dwarf_votes_per_name + human_votes_per_name #NOTE: shape = (names,)
...
>>> # So this is the legitimate result.
... votes_per_name = count(votes)
>>>
>>> # Now, you want to cheat a bit and make your proposal
... # (assuming it's the first one) the most popular one.
... votes_per_name_ = votes_per_name.copy()
>>> argmax = votes_per_name.argmax()
>>> votes_per_name_[argmax] = votes_per_name[0]
>>> votes_per_name_[0] = votes_per_name[argmax]
>>> #NOTE: assert np.allclose(votes_per_name.sum(), votes_per_name_.sum())
...
>>> # However, you cannot just manipulate the result, otherwise the auditors
... # will find out! Instead, the raw scores must be manipulated such that your
... # submission ranks top among others. You decide to solve for a multiplier
... # to the original scores.
... coef = np.ones_like(votes).reshape(-1)
>>>
>>> # In other words, your goal is to find the minimum of the following objective.
... def loss(coef): #NOTE: coef.shape is (citizens*votes,)
... votes_ = votes * coef.reshape(votes.shape)
... votes_per_name_ = count(votes_)
... return ((votes_per_name_ - votes_per_name)**2).sum()
...
>>>
>>> # This is a helper constant matrix. Ignore it for now.
... A = np.concatenate([np.tile(np.concatenate([np.repeat(1, names), np.repeat(0, citizens * names)]), citizens - 1), np.repeat(1, names)]).reshape((citizens, citizens * names))
>>> A_ = A.dot(np.diag(votes.reshape(-1)))
>>>
到目前为止,代码与您的问题相同。
现在,让我们使用一些 mystic 的工具。
首先,让我们把它放在你提到的问题的上下文中。
>>> # Build constraints
... cons = my.symbolic.linear_symbolic(A_, votes_per_citizen)
>>> cons = my.symbolic.solve(cons) #NOTE: this may take a while...
>>> cons = my.symbolic.generate_constraint(my.symbolic.generate_solvers(cons))
>>>
>>> def cons_(x): #NOTE: preserve x
... return cons(x.copy())
...
>>>
现在,就像在引用的问题中一样,我们可以使用像 fmin_powell
和 constraints=cons_
这样的简单求解器来解决它。然而,我发现 mystic 在这里可能会有点困难,我将展示这一点。
>>> bounds = [(0, 1000)] * coef.size
>>> x0 = np.random.randint(0, 1000, coef.shape).astype(float).tolist()
>>>
>>> stepmon = my.monitors.VerboseMonitor(1)
>>> rv = my.solvers.fmin_powell(loss, x0=x0, bounds=bounds, itermon=stepmon,
... constraints=cons_,
... disp=1, maxfun=20000, gtol=None, ftol=500)
Generation 0 has ChiSquare: inf
Generation 1 has ChiSquare: inf
Generation 2 has ChiSquare: inf
Generation 3 has ChiSquare: inf
Generation 4 has ChiSquare: inf
Generation 5 has ChiSquare: inf
Generation 6 has ChiSquare: inf
Generation 7 has ChiSquare: inf
...
我在这里取消了优化......当你看到inf
时,这意味着神秘主义者正在努力解决约束。
问题是应该有额外的约束来帮助它,即 A_.dot(cons_(x0))
应该是整数。此外,最好能更好地控制底片的存在。
>>> np.round(A_.dot(cons_(x0)), 10) - votes_per_citizen
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0.])
>>> cons_(x0)[:5]
[-574537.1464429945, 98.0, 326.0, 114.0, 694.0]
事实上,每个 citizens
元素都是负数。
所以,我尝试使用惩罚来代替。
>>> def penalty1(x):
... return np.abs((np.array(x).reshape(citizens, names) * votes).sum(1) - votes_per_citizen).sum()
...
>>> def penalty2(x):
... return np.abs(A_.dot(x) - votes_per_citizen).sum()
...
>>> def penalty3(x):
... return np.abs((np.array(x).reshape(citizens, names) * votes).min(1).sum())
...
>>> #FIXME: utilize that A_.dot(x) should be integer-valued
...
>>> @my.penalty.quadratic_equality(penalty2)
... @my.penalty.quadratic_equality(penalty3)
... def penalty(x):
... return 0.0
...
>>>
>>> bounds = [(0, 1000)] * coef.size
>>> x0 = np.random.randint(0, 1000, coef.shape).astype(float).tolist()
>>>
>>> stepmon = my.monitors.VerboseMonitor(1)
>>>
>>> rv = my.solvers.fmin_powell(loss, x0=x0, bounds=bounds, itermon=stepmon,
... penalty=penalty, # constraints=cons_,
... disp=1, maxfun=20000, gtol=None, ftol=500)
Generation 0 has ChiSquare: 4316466725165.325684
Generation 1 has ChiSquare: 97299808.307906
Generation 2 has ChiSquare: 1125.438322
Generation 3 has ChiSquare: 1116.735393
Warning: Maximum number of function evaluations has been exceeded.
STOP("EvaluationLimits with {'evaluations': 20000, 'generations': 500000}")
>>>
这似乎工作得很好(注意我使用了 20000 个 evals 而不是 200 个,而 ftol 在损失为 500 而不是 1 时停止)。 rv
非常接近,优化也相当快。
>>> penalty(rv)
4.979032041874226
>>> np.round(A_.dot(rv), 2) - votes_per_citizen
array([0. , 0.22, 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. ])
>>>
>>> penalty(x0)
4313023244843.466
>>> np.round(A_.dot(x0), 2) - votes_per_citizen
array([ 6996.61, 5872.2 , 6398.82, 7593.65, 9137.81, 10347.84,
9204.44, 6160.77, 9626.64, 7572.4 , 10771.24, 8673.7 ,
10212.18, 7581.45, 5097.13, 7631.2 , 8274.92, 9684.09,
9504.27, 9067.73, 7332.77, 10214.02, 8255.38, 9853.74,
6613.19])
在这里,A_.dot(rv)
不那么准确(四舍五入是 2 个位置而不是 10 个)...并且将再次受益于使 A_.dot(rv)
整数的约束。
我会把它留给以后的例子。
我正在尝试解决类似于下面描述的玩具示例的优化问题。如评论中所述,我当前使用 scipy 的实现速度太慢,而且似乎没有收敛。我怎样才能得到一个体面的解决方案?您可以使用 scipy、mystic 或您认为合适的任何库。
请注意,我不需要全局最小值,搜索可以在 loss(X) <= 1
后立即停止。现实世界中的 objective 主要是用 SQL 编写的,因此速度慢得离谱,所以我也希望优化在 loss
被计算 ~200 次后终止。 (这不是硬性要求,您也可以在优化 运行 持续 5 分钟后终止。)
虽然这个问题类似于
import numpy as np
from scipy.optimize import differential_evolution, LinearConstraint
# Inhabitants in a rural city are voting for the name of a newborn. The city houses 40 dwarves
# and 10 humans in total, and there are 100 candidate names to vote for.
dwarf_population, human_population = 40, 10
population = dwarf_population + human_population
candidate_count = 100
# Each inhabitant has different number of votes in their hand.
scores_per_citizen = np.random.randint(15, 20, population)
# Let's say the original result looks like this. (Yes, one can cast a fraction of their votes)
alpha = np.abs(np.random.normal(size=candidate_count))
scores = np.diag(scores_per_citizen).dot(np.random.dirichlet(alpha, population))
assert np.allclose(scores.sum(1), scores_per_citizen)
# Here is how the votes are counted.
def count(scores_: np.ndarray) -> np.ndarray:
# Dwarves have a weird tradition: the top 10 popular names chosen by dwarves will get all their votes.
# (I guess this is what makes our objective non-convex.)
scores_by_dwarves = scores_[0:40, :]
score_per_candidate_by_dwarves_raw = scores_by_dwarves.sum(1)
top_10_popular_name_indices = np.argsort(-score_per_candidate_by_dwarves_raw)[:10]
score_per_candidate_by_dwarves = np.zeros(candidate_count)
score_per_candidate_by_dwarves[top_10_popular_name_indices] = score_per_candidate_by_dwarves_raw[
top_10_popular_name_indices]
score_per_candidate_by_dwarves = scores_by_dwarves.sum() \
* score_per_candidate_by_dwarves \
/ score_per_candidate_by_dwarves.sum()
assert np.allclose(score_per_candidate_by_dwarves.sum(), score_per_candidate_by_dwarves_raw.sum())
# Humans, on the other hand, just adds everyone's votes together.
scores_by_humans = scores_[40:, :]
scores_per_candidate_by_humans = scores_by_humans.sum(0)
# The final result is the sum of the scores by dwarves and humans.
return score_per_candidate_by_dwarves + scores_per_candidate_by_humans
# So this is the legitimate result.
scores_per_candidate = count(scores)
# Now, you want to cheat a bit and make your proposal (assuming it's the first one) the most popular one.
target_scores_per_candidate = scores_per_candidate.copy()
argmax = scores_per_candidate.argmax()
target_scores_per_candidate[argmax] = scores_per_candidate[0]
target_scores_per_candidate[0] = scores_per_candidate[argmax]
assert np.allclose(scores_per_candidate.sum(), target_scores_per_candidate.sum())
# However, you cannot just manipulate the result, otherwise the auditors will find out!
# Instead, the raw scores must be manipulated such that your submission ranks top among others.
# You decide to solve for a multiplier to the original scores.
init_coef = np.ones_like(scores).reshape(-1)
# In other words, your goal is to find the minimum of the following objective.
def loss(coef_: np.ndarray) -> float:
scores_ = scores * coef_.reshape(scores.shape)
scores_per_candidate_ = count(scores_)
return ((scores_per_candidate_ - scores_per_candidate) ** 2).sum()
# This is a helper constant matrix. Ignore it for now.
A = np.concatenate([np.tile(np.concatenate([np.repeat(1, candidate_count),
np.repeat(0, population * candidate_count)]),
population - 1),
np.repeat(1, candidate_count)])
A = A.reshape((population, population * candidate_count))
# Meanwhile, some constraints must hold.
def constraints(coef_: np.ndarray):
# The total votes of each citizen must not change.
coef_reshaped = coef_.reshape((population, candidate_count))
assert np.allclose((coef_reshaped * scores).sum(1), scores_per_citizen)
# Another way to express the same thing with matrices.
assert np.allclose(A.dot(np.diag(scores.reshape(-1))).dot(coef_), scores_per_citizen)
# Additionally, all scores must be positive.
assert np.all(coef_reshaped * scores >= 0)
# Let's express the constraint with a LinearConstraint.
score_match_quota = LinearConstraint(A.dot(np.diag(scores.reshape(-1))), scores_per_citizen, scores_per_citizen)
# Run optimization (Spoiler: this is extremely slow, and doesn't seem to converge)
rv = differential_evolution(loss,
bounds=[(0, 1000)] * init_coef.size, # the 1000 here is superficial and can be replaced by other large numbers
init=np.vstack((init_coef, init_coef, init_coef, init_coef, init_coef)),
constraints=score_match_quota)
# Sanity check
constraints(rv.x)
答案与您提到的问题几乎相同...但是,我必须进一步考虑一些限制才能证明这一点。让我重写您的代码 -- 只是使用一些较短的名称。
>>> import mystic as my
>>> import numpy as np
>>>
>>> # Inhabitants in a rural city are voting for the name of a newborn.
... # The city houses 40 dwarves and 10 humans in total, and there are
... # 100 candidate names to vote for.
... dwarves, humans = 40, 10
>>> citizens = dwarves + humans
>>> names = 100
>>> # Each inhabitant has different number of votes in their hand.
... votes_per_citizen = np.random.randint(15, 20, citizens)
>>>
>>> # Let's say the original result looks like this.
... # (Yes, one can cast a fraction of their votes)
... alpha = np.abs(np.random.normal(size=names))
>>> votes = np.diag(votes_per_citizen).dot(np.random.dirichlet(alpha, citizens))
>>> # NOTE: assert np.allclose(votes.sum(1), votes_per_citizen)
...
>>> # Here is how the votes are counted.
... def count(votes): #NOTE: votes.shape is (citizens, names)
... # Dwarves have a weird tradition: the top 10 popular names chosen
... # by dwarves will get all their votes.
... # (I guess this is what makes our objective non-convex.)
... dwarf_votes = votes[:dwarves]
... dwarf_votes_per_name_ = dwarf_votes.sum(1)
... top_10_idx = np.argsort(-dwarf_votes_per_name_)[:10]
... dwarf_votes_per_name = np.zeros(names)
... dwarf_votes_per_name[top_10_idx] = dwarf_votes_per_name_[top_10_idx]
... dwarf_votes_per_name = \
... dwarf_votes.sum() * dwarf_votes_per_name / dwarf_votes_per_name.sum()
... #NOTE: assert np.allclose(dwarf_votes_per_name.sum(), dwarf_votes_per_name_.sum())
... # Humans, on the other hand, just add everyone's votes together.
... human_votes = votes[dwarves:]
... human_votes_per_name = human_votes.sum(0)
... # The final result is the sum of the scores by dwarves and humans.
... return dwarf_votes_per_name + human_votes_per_name #NOTE: shape = (names,)
...
>>> # So this is the legitimate result.
... votes_per_name = count(votes)
>>>
>>> # Now, you want to cheat a bit and make your proposal
... # (assuming it's the first one) the most popular one.
... votes_per_name_ = votes_per_name.copy()
>>> argmax = votes_per_name.argmax()
>>> votes_per_name_[argmax] = votes_per_name[0]
>>> votes_per_name_[0] = votes_per_name[argmax]
>>> #NOTE: assert np.allclose(votes_per_name.sum(), votes_per_name_.sum())
...
>>> # However, you cannot just manipulate the result, otherwise the auditors
... # will find out! Instead, the raw scores must be manipulated such that your
... # submission ranks top among others. You decide to solve for a multiplier
... # to the original scores.
... coef = np.ones_like(votes).reshape(-1)
>>>
>>> # In other words, your goal is to find the minimum of the following objective.
... def loss(coef): #NOTE: coef.shape is (citizens*votes,)
... votes_ = votes * coef.reshape(votes.shape)
... votes_per_name_ = count(votes_)
... return ((votes_per_name_ - votes_per_name)**2).sum()
...
>>>
>>> # This is a helper constant matrix. Ignore it for now.
... A = np.concatenate([np.tile(np.concatenate([np.repeat(1, names), np.repeat(0, citizens * names)]), citizens - 1), np.repeat(1, names)]).reshape((citizens, citizens * names))
>>> A_ = A.dot(np.diag(votes.reshape(-1)))
>>>
到目前为止,代码与您的问题相同。 现在,让我们使用一些 mystic 的工具。
首先,让我们把它放在你提到的问题的上下文中。
>>> # Build constraints
... cons = my.symbolic.linear_symbolic(A_, votes_per_citizen)
>>> cons = my.symbolic.solve(cons) #NOTE: this may take a while...
>>> cons = my.symbolic.generate_constraint(my.symbolic.generate_solvers(cons))
>>>
>>> def cons_(x): #NOTE: preserve x
... return cons(x.copy())
...
>>>
现在,就像在引用的问题中一样,我们可以使用像 fmin_powell
和 constraints=cons_
这样的简单求解器来解决它。然而,我发现 mystic 在这里可能会有点困难,我将展示这一点。
>>> bounds = [(0, 1000)] * coef.size
>>> x0 = np.random.randint(0, 1000, coef.shape).astype(float).tolist()
>>>
>>> stepmon = my.monitors.VerboseMonitor(1)
>>> rv = my.solvers.fmin_powell(loss, x0=x0, bounds=bounds, itermon=stepmon,
... constraints=cons_,
... disp=1, maxfun=20000, gtol=None, ftol=500)
Generation 0 has ChiSquare: inf
Generation 1 has ChiSquare: inf
Generation 2 has ChiSquare: inf
Generation 3 has ChiSquare: inf
Generation 4 has ChiSquare: inf
Generation 5 has ChiSquare: inf
Generation 6 has ChiSquare: inf
Generation 7 has ChiSquare: inf
...
我在这里取消了优化......当你看到inf
时,这意味着神秘主义者正在努力解决约束。
问题是应该有额外的约束来帮助它,即 A_.dot(cons_(x0))
应该是整数。此外,最好能更好地控制底片的存在。
>>> np.round(A_.dot(cons_(x0)), 10) - votes_per_citizen
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0.])
>>> cons_(x0)[:5]
[-574537.1464429945, 98.0, 326.0, 114.0, 694.0]
事实上,每个 citizens
元素都是负数。
所以,我尝试使用惩罚来代替。
>>> def penalty1(x):
... return np.abs((np.array(x).reshape(citizens, names) * votes).sum(1) - votes_per_citizen).sum()
...
>>> def penalty2(x):
... return np.abs(A_.dot(x) - votes_per_citizen).sum()
...
>>> def penalty3(x):
... return np.abs((np.array(x).reshape(citizens, names) * votes).min(1).sum())
...
>>> #FIXME: utilize that A_.dot(x) should be integer-valued
...
>>> @my.penalty.quadratic_equality(penalty2)
... @my.penalty.quadratic_equality(penalty3)
... def penalty(x):
... return 0.0
...
>>>
>>> bounds = [(0, 1000)] * coef.size
>>> x0 = np.random.randint(0, 1000, coef.shape).astype(float).tolist()
>>>
>>> stepmon = my.monitors.VerboseMonitor(1)
>>>
>>> rv = my.solvers.fmin_powell(loss, x0=x0, bounds=bounds, itermon=stepmon,
... penalty=penalty, # constraints=cons_,
... disp=1, maxfun=20000, gtol=None, ftol=500)
Generation 0 has ChiSquare: 4316466725165.325684
Generation 1 has ChiSquare: 97299808.307906
Generation 2 has ChiSquare: 1125.438322
Generation 3 has ChiSquare: 1116.735393
Warning: Maximum number of function evaluations has been exceeded.
STOP("EvaluationLimits with {'evaluations': 20000, 'generations': 500000}")
>>>
这似乎工作得很好(注意我使用了 20000 个 evals 而不是 200 个,而 ftol 在损失为 500 而不是 1 时停止)。 rv
非常接近,优化也相当快。
>>> penalty(rv)
4.979032041874226
>>> np.round(A_.dot(rv), 2) - votes_per_citizen
array([0. , 0.22, 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. ])
>>>
>>> penalty(x0)
4313023244843.466
>>> np.round(A_.dot(x0), 2) - votes_per_citizen
array([ 6996.61, 5872.2 , 6398.82, 7593.65, 9137.81, 10347.84,
9204.44, 6160.77, 9626.64, 7572.4 , 10771.24, 8673.7 ,
10212.18, 7581.45, 5097.13, 7631.2 , 8274.92, 9684.09,
9504.27, 9067.73, 7332.77, 10214.02, 8255.38, 9853.74,
6613.19])
在这里,A_.dot(rv)
不那么准确(四舍五入是 2 个位置而不是 10 个)...并且将再次受益于使 A_.dot(rv)
整数的约束。
我会把它留给以后的例子。