集合划分比差分得到更好的结果

Question

Partition problem 已知是 NP-hard。根据问题的特定实例，我们可以尝试动态规划或一些启发式算法，如差分法（也称为 Karmarkar-Karp 算法）。

后者似乎对大数实例非常有用（这使得动态规划难以处理），但并不总是完美的。找到更好解决方案的有效方法是什么（随机、禁忌搜索、其他近似）？

PS: 这个问题背后有一些故事。自 2004 年 7 月以来，SPOJ 有一项挑战 Johnny Goes Shopping。到目前为止，已有 1087 名用户解决了该挑战，但其中只有 11 人的得分高于正确的 Karmarkar-Karp 算法实施（根据当前得分，Karmarkar-Karp给出 11.796614 点）。如何做得更好？（最想要的已接受提交支持的答案，但请不要透露您的代码。）

Answer 1

编辑这是一个从 Karmarkar-Karp 差分开始然后尝试优化结果分区的实现。

时间允许的唯一优化是将 1 从一个分区分配给另一个分区并在两个分区之间交换 1 对 1。

我一开始对 Karmarkar-Karp 的实施一定是不准确的，因为仅使用 Karmarkar-Karp 得到的分数是 2.711483 而不是 11.796614 OP[ 引用的分数=25=]。使用优化后，分数达到 7.718049。

剧透警告 C# 提交代码如下

using System; using System.Collections.Generic; using System.Linq; public class Test { // some comparer's to lazily avoid using a proper max-heap implementation public class Index0 : IComparer<long[]> { public int Compare(long[] x, long[] y) { if(x[0] == y[0]) return 0; return x[0] < y[0] ? -1 : 1; } public static Index0 Inst = new Index0(); } public class Index1 : IComparer<long[]> { public int Compare(long[] x, long[] y) { if(x[1] == y[1]) return 0; return x[1] < y[1] ? -1 : 1; } } public static void Main() { // load the data var start = DateTime.Now; var list = new List<long[]>(); int size = int.Parse(Console.ReadLine()); for(int i=1; i<=size; i++) { var tuple = new long[]{ long.Parse(Console.ReadLine()), i }; list.Add(tuple); } list.Sort((x, y) => { if(x[0] == y[0]) return 0; return x[0] < y[0] ? -1 : 1; }); // Karmarkar-Karp differences List<long[]> diffs = new List<long[]>(); while(list.Count > 1) { // get max var b = list[list.Count - 1]; list.RemoveAt(list.Count - 1); // get max var a = list[list.Count - 1]; list.RemoveAt(list.Count - 1); // (b - a) var diff = b[0] - a[0]; var tuple = new long[]{ diff, -1 }; diffs.Add(new long[] { a[0], b[0], diff, a[1], b[1] }); // insert (b - a) back in var fnd = list.BinarySearch(tuple, new Index0()); list.Insert(fnd < 0 ? ~fnd : fnd, tuple); } var approx = list[0]; list.Clear(); // setup paritions var listA = new List<long[]>(); var listB = new List<long[]>(); long sumA = 0; long sumB = 0; // Karmarkar-Karp rebuild partitions from differences bool toggle = false; for(int i=diffs.Count-1; i>=0; i--) { var inB = listB.BinarySearch(new long[]{diffs[i][2]}, Index0.Inst); var inA = listA.BinarySearch(new long[]{diffs[i][2]}, Index0.Inst); if(inB >= 0 && inA >= 0) { toggle = !toggle; } if(toggle == false) { if(inB >= 0) { listB.RemoveAt(inB); }else if(inA >= 0) { listA.RemoveAt(inA); } var tb = new long[]{diffs[i][1], diffs[i][4]}; var ta = new long[]{diffs[i][0], diffs[i][3]}; var fb = listB.BinarySearch(tb, Index0.Inst); var fa = listA.BinarySearch(ta, Index0.Inst); listB.Insert(fb < 0 ? ~fb : fb, tb); listA.Insert(fa < 0 ? ~fa : fa, ta); } else { if(inA >= 0) { listA.RemoveAt(inA); }else if(inB >= 0) { listB.RemoveAt(inB); } var tb = new long[]{diffs[i][1], diffs[i][4]}; var ta = new long[]{diffs[i][0], diffs[i][3]}; var fb = listA.BinarySearch(tb, Index0.Inst); var fa = listB.BinarySearch(ta, Index0.Inst); listA.Insert(fb < 0 ? ~fb : fb, tb); listB.Insert(fa < 0 ? ~fa : fa, ta); } } listA.ForEach(a => sumA += a[0]); listB.ForEach(b => sumB += b[0]); // optimize our partitions with give/take 1 or swap 1 for 1 bool change = false; while(DateTime.Now.Subtract(start).TotalSeconds < 4.8) { change = false; // give one from A to B for(int i=0; i<listA.Count; i++) { var a = listA[i]; if(Math.Abs(sumA - sumB) > Math.Abs((sumA - a[0]) - (sumB + a[0]))) { var fb = listB.BinarySearch(a, Index0.Inst); listB.Insert(fb < 0 ? ~fb : fb, a); listA.RemoveAt(i); i--; sumA -= a[0]; sumB += a[0]; change = true; } else {break;} } // give one from B to A for(int i=0; i<listB.Count; i++) { var b = listB[i]; if(Math.Abs(sumA - sumB) > Math.Abs((sumA + b[0]) - (sumB - b[0]))) { var fa = listA.BinarySearch(b, Index0.Inst); listA.Insert(fa < 0 ? ~fa : fa, b); listB.RemoveAt(i); i--; sumA += b[0]; sumB -= b[0]; change = true; } else {break;} } // swap 1 for 1 for(int i=0; i<listA.Count; i++) { var a = listA[i]; for(int j=0; j<listB.Count; j++) { var b = listB[j]; if(Math.Abs(sumA - sumB) > Math.Abs((sumA - a[0] + b[0]) - (sumB -b[0] + a[0]))) { listA.RemoveAt(i); listB.RemoveAt(j); var fa = listA.BinarySearch(b, Index0.Inst); var fb = listB.BinarySearch(a, Index0.Inst); listA.Insert(fa < 0 ? ~fa : fa, b); listB.Insert(fb < 0 ? ~fb : fb, a); sumA = sumA - a[0] + b[0]; sumB = sumB - b[0] + a[0]; change = true; break; } } } // if(change == false) { break; } } /* // further optimization with 2 for 1 swaps while(DateTime.Now.Subtract(start).TotalSeconds < 4.8) { change = false; // trade 2 for 1 for(int i=0; i<listA.Count >> 1; i++) { var a1 = listA[i]; var a2 = listA[listA.Count - 1 - i]; for(int j=0; j<listB.Count; j++) { var b = listB[j]; if(Math.Abs(sumA - sumB) > Math.Abs((sumA - a1[0] - a2[0] + b[0]) - (sumB - b[0] + a1[0] + a2[0]))) { listA.RemoveAt(listA.Count - 1 - i); listA.RemoveAt(i); listB.RemoveAt(j); var fa = listA.BinarySearch(b, Index0.Inst); var fb1 = listB.BinarySearch(a1, Index0.Inst); var fb2 = listB.BinarySearch(a2, Index0.Inst); listA.Insert(fa < 0 ? ~fa : fa, b); listB.Insert(fb1 < 0 ? ~fb1 : fb1, a1); listB.Insert(fb2 < 0 ? ~fb2 : fb2, a2); sumA = sumA - a1[0] - a2[0] + b[0]; sumB = sumB - b[0] + a1[0] + a2[0]; change = true; break; } } } // if(DateTime.Now.Subtract(start).TotalSeconds > 4.8) { break; } // trade 2 for 1 for(int i=0; i<listB.Count >> 1; i++) { var b1 = listB[i]; var b2 = listB[listB.Count - 1 - i]; for(int j=0; j<listA.Count; j++) { var a = listA[j]; if(Math.Abs(sumA - sumB) > Math.Abs((sumA - a[0] + b1[0] + b2[0]) - (sumB - b1[0] - b2[0] + a[0]))) { listB.RemoveAt(listB.Count - 1 - i); listB.RemoveAt(i); listA.RemoveAt(j); var fa1 = listA.BinarySearch(b1, Index0.Inst); var fa2 = listA.BinarySearch(b2, Index0.Inst); var fb = listB.BinarySearch(a, Index0.Inst); listA.Insert(fa1 < 0 ? ~fa1 : fa1, b1); listA.Insert(fa2 < 0 ? ~fa2 : fa2, b2); listB.Insert(fb < 0 ? ~fb : fb, a); sumA = sumA - a[0] + b1[0] + b2[0]; sumB = sumB - b1[0] - b2[0] + a[0]; change = true; break; } } } // if(change == false) { break; } } */ // output the correct ordered values listA.Sort(new Index1()); foreach(var t in listA) { Console.WriteLine(t[1]); } // DEBUG/TESTING //Console.WriteLine(approx[0]); //foreach(var t in listA) Console.Write(": " + t[0] + "," + t[1]); //Console.WriteLine(); //foreach(var t in listB) Console.Write(": " + t[0] + "," + t[1]); } }

Answer 2

有许多论文描述了集合划分的各种高级算法。这里只有两个：

"A complete anytime algorithm for number partitioning" 作者：Richard E. Korf。
"An efficient fully polynomial approximation scheme for the Subset-Sum Problem" 作者：Hans Kellerer 等人

老实说，我不知道他们中哪一个给出了更有效的解决方案。可能不需要这些高级算法来解决该 SPOJ 问题。 Korf 的论文还是很有用的。那里描述的算法非常简单（理解和实现）。他还概述了几个更简单的算法（在第 2 节中）。因此，如果您想了解 Horowitz-Sahni 或 Schroeppel-Shamir 方法（下文提到）的详细信息，可以在 Korf 的论文中找到它们。此外（在第 8 节中）他写道，随机方法并不能保证有足够好的解决方案。因此，您不太可能通过爬山、模拟退火或禁忌搜索等方法获得重大改进。

我尝试了几种简单的算法及其组合来解决大小最大为 10000、最大值最大为 10¹⁴、时间限制为 4 秒的分区问题。他们在随机均匀分布的数字上进行了测试。并且为我尝试过的每个问题实例都找到了最佳解决方案。对于某些问题实例，最优性由算法保证，对于其他问题实例，最优性不是 100% 保证的，但获得次优解的概率非常小。

对于最大 4 的尺寸（左侧的绿色区域），Karmarkar-Karp 算法始终给出最佳结果。

对于最大 54 的大小，蛮力算法足够快（红色区域）。可以在 Horowitz-Sahni 或 Schroeppel-Shamir 算法之间进行选择。我使用 Horowitz-Sahni 是因为它在给定的限制下似乎更有效。 Schroeppel-Shamir 使用的内存少得多（所有内容都适合 L2 缓存），因此当其他 CPU 核心执行一些内存密集型任务或使用多线程进行集合分区时，它可能更可取。或者在没有那么严格的时间限制的情况下解决更大的问题（Horowitz-Sahni 只是运行s 内存不足）。

当大小乘以所有值之和小于5*10⁹（蓝色区域）时，适用动态规划方法。图表中蛮力和动态规划区域之间的边界显示了每种算法的性能更好的地方。

右边的绿色区域是 Karmarkar-Karp 算法以几乎 100% 的概率给出最佳结果的地方。这里有如此多的完美分区选项（增量为 0 或 1），以至于 Karmarkar-Karp 算法几乎肯定能找到其中之一。可以发明 Karmarkar-Karp 总是给出次优结果的数据集。例如{17 13 10 10 10 ...}。如果将其乘以某个大数，则 KK 和 DP 都无法找到最优解。幸运的是，这样的数据集在实践中是不太可能的。但是问题 setter 可以添加这样的数据集来增加比赛的难度。在这种情况下，您可以选择一些高级算法以获得更好的结果（但仅适用于图表上的灰色和右侧绿色区域）。

我尝试了 2 种方法来实现 Karmarkar-Karp 算法的优先级队列：最大堆和排序数组。排序数组选项在使用线性搜索时似乎稍快，而在二进制搜索中明显更快。

黄色区域是您可以在保证最佳结果（使用 DP）或仅具有高概率的最佳结果（使用 Karmarkar-Karp）之间进行选择的地方。

最后是灰色区域，其中任何一种简单算法本身都无法给出最佳结果。在这里，我们可以使用 Karmarkar-Karp 预处理数据，直到它适用于 Horowitz-Sahni 或动态规划。这个地方也有很多完美的分区选项，但比绿色区域少，所以 Karmarkar-Karp 本身有时会错过适当的分区。更新：正如@mhum 所指出的，没有必要实施动态编程算法来使事情正常进行。带有 Karmarkar-Karp 预处理的 Horowitz-Sahni 就足够了。但 Horowitz-Sahni 算法必须在所述时间限制内处理最大 54 的大小，以（几乎）保证最佳分区。所以C++或其他具有良好优化编译器和快速计算机的语言是更可取的。

以下是我将 Karmarkar-Karp 与其他算法相结合的方式：

template<bool Preprocess = false>
i64 kk(const vector<i64>& values, i64 sum, Log& log)
{
    log.name("Karmarkar-Karp");
    vector<i64> pq(values.size() * 2);
    copy(begin(values), end(values), begin(pq) + values.size());
    sort(begin(pq) + values.size(), end(pq));
    auto first = end(pq);
    auto last = begin(pq) + values.size();

    while (first - last > 1)
    {
        if (Preprocess && first - last <= kHSLimit)
        {
            hs(last, first, sum, log);
            return 0;
        }
        if (Preprocess && static_cast<double>(first - last) * sum <= kDPLimit)
        {
            dp(last, first, sum, log);
            return 0;
        }
        const auto diff = *(first - 1) - *(first - 2);
        sum -= *(first - 2) * 2;
        first -= 2;
        const auto place = lower_bound(last, first, diff);
        --last;
        copy(last + 1, place, last);
        *(place - 1) = diff;
    }

    const auto result = (first - last)? *last: 0;
    log(result);
    return result;
}

Link to full C++11 implementation. 该程序仅确定分区总和之间的差异，不报告分区本身。 警告：如果您想在可用内存少于 1 Gb 的计算机上运行它，减少kHSLimit常量。

Answer 3

无论其价值如何，在 [Korf88] 中 "complete Karmarkar Karp" (CKK) 搜索过程的直接、未优化的 Python 实现 -- 仅略微修改以在给定的搜索后退出搜索时间限制（比如 4.95 秒）和 return 迄今为止找到的最佳解决方案 - 足以在 SPOJ 问题上得分 14.204234，超过 Karmarkar-Karp 的得分。 ~~在撰写本文时，这是 #3 on the rankings~~（请参阅下面的编辑 #2）

可以在 [Mert99] 中找到 Korf 的 CKK 算法的更易读的介绍。

编辑 #2 - 我已经实现了 hybrid heuristic of applying Karmarkar-Karp until the list of numbers is below some threshold and then switching over to the exact Horowitz-Sahni subset enumeration method [HS74] (a concise description may be found in [Korf88]). As suspected, my Python implementation required lowering the switchover threshold versus his C++ implementation. With some trial and error, I found that a threshold of 37 was the maximum that allowed my program to finish within the time limit. Yet, even at that lower threshold, I was able to achieve a score of 15.265633, good enough for second place。

我进一步尝试将这种混合 KK/HS 方法合并到 CKK 树搜索中，主要是通过使用 HS 作为一种非常激进且昂贵的修剪策略。在普通 CKK 中，我无法找到甚至与 KK/HS 方法匹配的切换阈值。然而，使用 CKK 和 HS（阈值为 25）的 ILDS（见下文）搜索策略进行修剪，我能够比之前的分数获得非常小的收益，最高可达 15.272802.在这种情况下，CKK+ILDS 优于普通 CKK 可能不足为奇，因为它会通过设计为 HS 阶段提供更多样化的输入。

编辑 #1 - 我尝试了对基本 CKK 算法的两个进一步改进：

"Improved Limited Discrepancy Search" (ILDS) [Korf96] 这是搜索树中路径的自然 DFS 排序的替代方法。它倾向于比常规深度优先搜索更早地探索更多不同的解决方案。
"Speeding up 2-Way Number Partitioning" [Cerq12] 这将CKK中的一个修剪标准从叶节点的4层内的节点推广到叶节点上方的5、6和7层内的节点。

在我的测试案例中，这两项改进通常都比原始 CKK 提供了显着的好处，减少了探索的节点数量（在后者的情况下）和更快地找到更好的解决方案（在以前的）。然而，在 SPOJ 问题结构的范围内，这些都不足以提高我的分数。

鉴于此 SPOJ 问题的特殊性质（即：5 秒的时间限制和只有一个特定且未公开的问题实例），很难就什么可以实际提高分数提出建议^*。例如，我们是否应该继续寻求替代搜索排序策略（例如：Wheeler Ruml listed here 的许多论文）？或者我们是否应该尝试将某种形式的局部改进启发式方法结合到 CKK 找到的解决方案中以帮助修剪？或者我们应该完全放弃基于 CKK 的方法并尝试动态规划方法？ PTAS 怎么样？在不了解 SPOJ 问题中使用的实例的具体形状的更多信息的情况下，很难猜测哪种方法会产生最大收益。每个都有其优点和缺点，具体取决于给定实例的特定属性。

^* 除了简单地运行同样的事情更快，比如说，通过在 C++ 中实现而不是 Python。

参考资料

[Cerq12] Cerquides、Jesús 和 Pedro Meseguer。 "Speeding Up 2-way Number Partitioning."ECAI。 2012, 内政部:10.3233/978-1-61499-098-7-223

[HS74] Horowitz、Ellis 和 Sartaj Sahni。 “Computing partitions with applications to the knapsack problem.”ACM 杂志 (JACM) 21.2 (1974)：277-292。

[Korf88] Korf, Richard E. (1998), “A complete anytime algorithm for number partitioning", Artificial Intelligence 106 (2): 181–203, doi:10.1016/S0004-3702(98)00086-1,

[Korf96] Korf, Richard E.“Improved limited discrepancy search。” AAAI/IAAI，卷。 1. 1996.

[Mert99] Mertens, Stephan (1999)，一个完整的平衡数划分算法，arXiv:cs/9903011

集合划分比差分得到更好的结果

Better results in set partition than by differencing

algorithm

optimization

partition-problem

参考资料