获取集合 A 的 r 长组合的快速方法,该组合至少具有集合 B 中的一个元素,集合 B 是 A 的子集
Fast way of getting r-long combinations of set A that have at least one element from set B, which is a subset of A
例如,如果 A={0,1,2,3,4}
、r=3
和 B={1,4}
,结果将是:
[0, 1, 2]
[0, 1, 3]
[0, 1, 4]
[0, 2, 4]
[0, 3, 4]
[1, 2, 3]
[1, 2, 4]
[1, 3, 4]
[2, 3, 4]
这是 A 的所有 r 长组合,不包括 [0, 2, 3]
,因为那个不包含 1 或 4。
我目前的解决方案如下,使用我所知道的最快的算法来获得正常的组合,然后简单地检查一下生成的组合是否也包含 B 的元素 (java ):
int[] A = new int[]{0,1,2,3,4};
int[] B = new int[]{1,4};
int n = A.length;
int r = 3;
int[] picks = new int[r]; //Holds indexes of elements in A
for (int i = 0; i < picks.length; i++)
picks[i] = i;
int lastindex = picks.length - 1;
outer:
while (true) {
int at = lastindex;
while (true) {
picks[at] += 1;
if (picks[at] < n) {
int displacement = picks[at] - at; // at + displacement = picks[at], at + displacement + 1 = picks[at] + 1 ,...
// Make all picks elements after at y = picks[at] + x, so picks={0, 2, 4, 6, 18, 30} & at=3 --> picks={0, 2, 4, 5, 6, 7}
// (Note that this example will never take place in reality, because the 18 or the 30 would be increased instead, depending on what n is)
// Do the last one first, because that one is going to be the biggest,
picks[lastindex] = lastindex + displacement;
if (picks[lastindex] < n) { // and check if it doesn't overflow
for (int i = at + 1; i < lastindex; i++)
picks[i] = i + displacement;
int[] combination = new int[r];
for (int i = 0; i < r; i++)
combination[i] = A[picks[i]];
System.out.print(Arrays.toString(combination));
//^With this, all r-long combinations of A get printed
//Straightforward, bruteforce-ish way of checking if int[] combination
//contains any element from B
presence:
for (int p : combination) {
for (int b : B) {
if (p==b) {
System.out.print(" <-- Also contains an element from B");
break presence;
}
}
}
System.out.println();
break;
}
}
at--;
if (at < 0) {
//Moving this check to the start of the while loop will make this natively support pick 0 cases (5C0 for example),
//but reduce performance by I believe quite a bit. Probably better to special-case those (I haven't
// done that in this test tho)
break outer;
}
}
}
输出:
[0, 1, 3] <-- Also contains an element from B
[0, 1, 4] <-- Also contains an element from B
[0, 2, 3]
[0, 2, 4] <-- Also contains an element from B
[0, 3, 4] <-- Also contains an element from B
[1, 2, 3] <-- Also contains an element from B
[1, 2, 4] <-- Also contains an element from B
[1, 3, 4] <-- Also contains an element from B
[2, 3, 4] <-- Also contains an element from B
正如评论中所写,我认为这种方法非常简陋。谁能想到一种更快的方法来做到这一点?
假设您有一个 int[][] FindCombinations(int[] set, int length)
函数,该函数 returns 是 set
的所有 length
长组合的列表,请执行以下操作(伪代码):
for i=1 to B.length
{
int bi = B[i];
A = A - bi; // remove bi from A
foreach C in FindCombinations(A, r-1)
{
output C+bi // output the union of C and {bi}
}
}
这样一来,所有组合都至少包含一个来自 B 的元素(并且还可能包含 B 中尚未使用的元素),无需太多额外工作。所有其他组合都被免费消除(根本不必找到),并且组合包含每个组合的 B 元素的测试也被消除。
此算法是否更快,在很大程度上取决于您从集合中 add/remove 元素的效率以及包含与排除组合的百分比(即,如果您最终只排除了总组合的 1%可能不值得)
请注意,当获得与 {b[i]} 并集的组合时,这些组合可能还包含一个元素 B[j],其中 j>i。当你达到与 B[j] 联合的组合时 none 将包含 B[i],因此所有组合都是唯一的。
例如,如果 A={0,1,2,3,4}
、r=3
和 B={1,4}
,结果将是:
[0, 1, 2]
[0, 1, 3]
[0, 1, 4]
[0, 2, 4]
[0, 3, 4]
[1, 2, 3]
[1, 2, 4]
[1, 3, 4]
[2, 3, 4]
这是 A 的所有 r 长组合,不包括 [0, 2, 3]
,因为那个不包含 1 或 4。
我目前的解决方案如下,使用我所知道的最快的算法来获得正常的组合,然后简单地检查一下生成的组合是否也包含 B 的元素 (java ):
int[] A = new int[]{0,1,2,3,4};
int[] B = new int[]{1,4};
int n = A.length;
int r = 3;
int[] picks = new int[r]; //Holds indexes of elements in A
for (int i = 0; i < picks.length; i++)
picks[i] = i;
int lastindex = picks.length - 1;
outer:
while (true) {
int at = lastindex;
while (true) {
picks[at] += 1;
if (picks[at] < n) {
int displacement = picks[at] - at; // at + displacement = picks[at], at + displacement + 1 = picks[at] + 1 ,...
// Make all picks elements after at y = picks[at] + x, so picks={0, 2, 4, 6, 18, 30} & at=3 --> picks={0, 2, 4, 5, 6, 7}
// (Note that this example will never take place in reality, because the 18 or the 30 would be increased instead, depending on what n is)
// Do the last one first, because that one is going to be the biggest,
picks[lastindex] = lastindex + displacement;
if (picks[lastindex] < n) { // and check if it doesn't overflow
for (int i = at + 1; i < lastindex; i++)
picks[i] = i + displacement;
int[] combination = new int[r];
for (int i = 0; i < r; i++)
combination[i] = A[picks[i]];
System.out.print(Arrays.toString(combination));
//^With this, all r-long combinations of A get printed
//Straightforward, bruteforce-ish way of checking if int[] combination
//contains any element from B
presence:
for (int p : combination) {
for (int b : B) {
if (p==b) {
System.out.print(" <-- Also contains an element from B");
break presence;
}
}
}
System.out.println();
break;
}
}
at--;
if (at < 0) {
//Moving this check to the start of the while loop will make this natively support pick 0 cases (5C0 for example),
//but reduce performance by I believe quite a bit. Probably better to special-case those (I haven't
// done that in this test tho)
break outer;
}
}
}
输出:
[0, 1, 3] <-- Also contains an element from B
[0, 1, 4] <-- Also contains an element from B
[0, 2, 3]
[0, 2, 4] <-- Also contains an element from B
[0, 3, 4] <-- Also contains an element from B
[1, 2, 3] <-- Also contains an element from B
[1, 2, 4] <-- Also contains an element from B
[1, 3, 4] <-- Also contains an element from B
[2, 3, 4] <-- Also contains an element from B
正如评论中所写,我认为这种方法非常简陋。谁能想到一种更快的方法来做到这一点?
假设您有一个 int[][] FindCombinations(int[] set, int length)
函数,该函数 returns 是 set
的所有 length
长组合的列表,请执行以下操作(伪代码):
for i=1 to B.length
{
int bi = B[i];
A = A - bi; // remove bi from A
foreach C in FindCombinations(A, r-1)
{
output C+bi // output the union of C and {bi}
}
}
这样一来,所有组合都至少包含一个来自 B 的元素(并且还可能包含 B 中尚未使用的元素),无需太多额外工作。所有其他组合都被免费消除(根本不必找到),并且组合包含每个组合的 B 元素的测试也被消除。
此算法是否更快,在很大程度上取决于您从集合中 add/remove 元素的效率以及包含与排除组合的百分比(即,如果您最终只排除了总组合的 1%可能不值得)
请注意,当获得与 {b[i]} 并集的组合时,这些组合可能还包含一个元素 B[j],其中 j>i。当你达到与 B[j] 联合的组合时 none 将包含 B[i],因此所有组合都是唯一的。