获取集合 A 的 r 长组合的快速方法,该组合至少具有集合 B 中的一个元素,集合 B 是 A 的子集

Fast way of getting r-long combinations of set A that have at least one element from set B, which is a subset of A

例如,如果 A={0,1,2,3,4}r=3B={1,4},结果将是:

[0, 1, 2]
[0, 1, 3]
[0, 1, 4]
[0, 2, 4]
[0, 3, 4]
[1, 2, 3]
[1, 2, 4]
[1, 3, 4]
[2, 3, 4]

这是 A 的所有 r 长组合,不包括 [0, 2, 3],因为那个不包含 1 或 4。

我目前的解决方案如下,使用我所知道的最快的算法来获得正常的组合,然后简单地检查一下生成的组合是否也包含 B 的元素 (java ):

    int[] A = new int[]{0,1,2,3,4};
    int[] B = new int[]{1,4};
    int n = A.length;
    int r = 3;

    int[] picks = new int[r]; //Holds indexes of elements in A
    for (int i = 0; i < picks.length; i++)
        picks[i] = i;

    int lastindex = picks.length - 1;

    outer:
    while (true) {
        int at = lastindex;
        while (true) {
            picks[at] += 1;
            if (picks[at] < n) {
                int displacement = picks[at] - at; // at + displacement = picks[at], at + displacement + 1 = picks[at] + 1 ,...

                // Make all picks elements after at y = picks[at] + x, so picks={0, 2, 4, 6, 18, 30} & at=3 --> picks={0, 2, 4, 5, 6, 7}
                // (Note that this example will never take place in reality, because the 18 or the 30 would be increased instead, depending on what n is)

                // Do the last one first, because that one is going to be the biggest,
                picks[lastindex] = lastindex + displacement;
                if (picks[lastindex] < n) { // and check if it doesn't overflow
                    for (int i = at + 1; i < lastindex; i++)
                        picks[i] = i + displacement;

                    int[] combination = new int[r];
                    for (int i = 0; i < r; i++)
                        combination[i] = A[picks[i]];
                    System.out.print(Arrays.toString(combination));
                    //^With this, all r-long combinations of A get printed

                    //Straightforward, bruteforce-ish way of checking if int[] combination
                    //contains any element from B
                    presence:
                    for (int p : combination) {
                        for (int b : B) {
                            if (p==b) {
                                System.out.print(" <-- Also contains an element from B");
                                break presence;
                            }
                        }
                    }

                    System.out.println();

                    break;
                }
            }
            at--;
            if (at < 0) {
                //Moving this check to the start of the while loop will make this natively support pick 0 cases (5C0 for example),
                //but reduce performance by I believe quite a bit. Probably better to special-case those (I haven't
                // done that in this test tho)
                break outer;
            }
        }
    }

输出:

[0, 1, 3] <-- Also contains an element from B
[0, 1, 4] <-- Also contains an element from B
[0, 2, 3]
[0, 2, 4] <-- Also contains an element from B
[0, 3, 4] <-- Also contains an element from B
[1, 2, 3] <-- Also contains an element from B
[1, 2, 4] <-- Also contains an element from B
[1, 3, 4] <-- Also contains an element from B
[2, 3, 4] <-- Also contains an element from B

正如评论中所写,我认为这种方法非常简陋。谁能想到一种更快的方法来做到这一点?

假设您有一个 int[][] FindCombinations(int[] set, int length) 函数,该函数 returns 是 set 的所有 length 长组合的列表,请执行以下操作(伪代码):

for i=1 to B.length
{
  int bi = B[i];

  A = A - bi; // remove bi from A
  foreach C in FindCombinations(A, r-1)
  {
     output C+bi // output the union of C and {bi}
  }
}

这样一来,所有组合都至少包含一个来自 B 的元素(并且还可能包含 B 中尚未使用的元素),无需太多额外工作。所有其他组合都被免费消除(根本不必找到),并且组合包含每个组合的 B 元素的测试也被消除。

此算法是否更快,在很大程度上取决于您从集合中 add/remove 元素的效率以及包含与排除组合的百分比(即,如果您最终只排除了总组合的 1%可能不值得)

请注意,当获得与 {b[i]} 并集的组合时,这些组合可能还包含一个元素 B[j],其中 j>i。当你达到与 B[j] 联合的组合时 none 将包含 B[i],因此所有组合都是唯一的。