如何有效地找到数组中三元组之和的最小差异?

How to efficiently find the minimal difference of sums of triplets in an array?

我有一个数字数组int arr[] = {4, 7, 8, 9, 2, 4, 7, 3, 5}; 我需要找到 3 个三胞胎(它们不需要连续),其中它们(三胞胎的)总和差最小('closest sums')。

说明: 每个数字可能只出现它在原始数组中出现的次数(即 {{4, 7, 8}, {9, 2, 4}, {7, **4**, 5}} 不正确,因为 4 在输入中只出现了两次。)

您可以假设数组已排序。 三胞胎 不必连续排列。

有什么想法吗?

答案并不简单。我们需要处理 "combinations"。请阅读here。结果,我们可以得到很大的数字,这使得计算变得困难。

一些基础知识。三元组由 3 个值组成。源数组有 9 个值。我们想要得到满足特定条件的三胞胎。

如果我们查看一个有 9 位数字的数字,我们可以通过计算具有 9 个值的数组的所有排列并始终采用索引 0,1,2 和 3,4,5 和 6,7 来获得所有可能的三元组,8.然后我们会自动得到所有的三胞胎。但也有许多双胞胎和不明显的不需要的三胞胎。总共 362880 个排列。现在的电脑也可行,没问题。

我们换一种方式,我们会计算出真正的组合,然后 9 选择 3 = 84。

发表了很多算法,如何生成所有组合,都是基于相同的原理。我们将创建一个布尔数组,其中包含 k = 3 个为真值。然后我们构建这个布尔数组的所有排列。每个排列总是有 3 个布尔值是真的。示例:

000000111
000001011
000001101
. . .

所以,很容易理解。

对于布尔数组的所有这些排列,我们检查值在哪个位置为真,select 具有相同索引的源值。那么我们就保证了一个三胞胎。我们将得到所有的三胞胎。对于

{ 4, 7, 8, 9, 2, 4, 7, 3, 5 }
-->
000000111 --> 7 3 5
000001011 --> 4 3 5
000001101 --> 4 7 5
. . .

这是一般机制。现在,接下来,我们应该从找到的 84 个三元组中 select 3 个不同的三元组。

不同的意思,没有位置用double。所以原始数组中的所有位置都必须存在。我们可以通过将所有值与 2 个三元组的所有其他值进行比较来检查区别。并且,与 3 个三胞胎相似。

接下来,我们需要 select 已经找到的 84 个三胞胎中的所有由 3 个三胞胎组成的组。这又是一个组合。因此,84 选择 3 = 95284 个可能的组。但是,如前所述,一组中的 3 个三元组必须不同。

如果我们检查这个,那么只剩下 280 个组供进一步评估。

(9 choose 3) * (6 choose 3) / 3! = 84 * 20 / 6 = 280

首先,我们select一个三胞胎。还剩6个号码。然后我们从剩下的6个中select取3个值,那么还剩下3个值。但是,对于左边的三元组,我们有所有的排列,所以,去掉排列并除以 3! = 6.

因为我们要找到特殊的组,它们的和需要满足一定的条件,所以我们计算组中所有三元组的和。

对于距离我们看总和。例如:如果我们有一个包含三元组和总和的组:

2 3 5-->10    7 4 7-->18    4 8 9-->21

10-------18---21

Distance 1: 8    18-10
Distance 2: 3    21-18
Dinstance overall=: 21-10 = 11       

所以,我们简单地计算MaxSum - MinSum 并称这个距离。我们对所有三胞胎组都这样做。

然后我们对结果进行排序,最小距离将在结果数据的开头。我们将得到例如:

4 7 5-->16    7 8 2-->17    4 9 3-->16
Distance: 1

另请注意。为了不与实数混淆,我们尽可能长时间地使用源数组中的索引进行计算。对于大多数计算,源数组数据是完全不相关的。只是为了计算总和,我们需要它们。

请参阅下面完整且注释良好的示例代码:

#include <iostream>
#include <algorithm>
#include <set>
#include <iterator>
#include <array>
#include <iomanip>

using Triplet = std::array<int, 3>;

// Constexpr function to calculate the factorial
constexpr unsigned long fact(unsigned int n) {
    if (n == 0) return 1; else return n * fact(n - 1);
};
// Constexpr function to calculate n choose k, and here specifically n choose 3
constexpr unsigned long NCR3(int n) {
    return fact(n) / ( 6 * fact(n - 3));
};

int main() {

    // The source data
    int arr[] = { 4, 7, 8, 9, 2, 4, 7, 3, 5 };

    // Get the number of source data
    constexpr size_t NumberOfSourceValues = sizeof(arr) / sizeof(arr[0]);

    // For calculating the combinations, we build a bool array with 3 bools set to true
    // and the rund all permutations for these 3 bools. So we will get all combinations
    // of a bool array with 3 true values
    std::array<bool, NumberOfSourceValues> selectors1{};
    static_assert(NumberOfSourceValues > 3, "\nError: Not enough source Values\n");
    selectors1[NumberOfSourceValues - 1] = true;
    selectors1[NumberOfSourceValues - 2] = true;
    selectors1[NumberOfSourceValues - 3] = true;

    // This is the result of 9 choose 3. It is 84. We will get that number of combinations
    constexpr size_t NumberOfTriplets = NCR3(NumberOfSourceValues);

    // Here we will store the 84 (9 choose 3) resulting combinations
    static std::array<Triplet, NumberOfTriplets> triplets{}; // Static --> Put on heap

    // Counter and index for storing the result
    size_t permutationCounter{};
    do {
        Triplet triplet{};  // Temporary triplet
        size_t indexInTriplet{};
        // Go through all bool values in our bool array
        for (size_t indexInBoolArray{}; indexInBoolArray < NumberOfSourceValues; ++indexInBoolArray)

            // If a bool is set in the bool array, then copy the INDEX of our indicesArray
            if (selectors1[indexInBoolArray]) triplet[indexInTriplet++] = indexInBoolArray;;

        // So, we found 3 indices (Guaranteed, because 3 bools are true always). Copy that to our result
        triplets[permutationCounter++] = std::move(triplet);

        // Calculate the next permutation
    } while (std::next_permutation(selectors1.begin(), selectors1.end()));

    // Array for goups of 3 triplets that are distinct (have no already used number)
    constexpr size_t NumberOfTripletGoups = NCR3(9) * NCR3(6) / 6;  // 6 = fact(3)
    static std::array<std::array<Triplet, 3>, NumberOfTripletGoups> distinctTripletGroups{}; // Static --> Put on heap

    // We want to again calculate combinations, n chooes k
    // So, we will have an array of 84 bools with the last 3 values true
    static std::array<bool, NumberOfTriplets> selectors2{};
    static_assert(NumberOfTriplets > 3, "\nError: Not enough triplets found\n");
    selectors2[NumberOfTriplets - 1] = true;
    selectors2[NumberOfTriplets - 2] = true;
    selectors2[NumberOfTriplets - 3] = true;

    // Please note: this loop will run 84 choose 3: 95384 times
    // But we will only use 280 distinct values
    size_t groupCounter{};
    do {
        std::array<Triplet, 3> tripletGroup{};
        size_t k{};
        for (size_t i{}; i < NumberOfTriplets; ++i) {
            if (selectors2[i]) {
                tripletGroup[k++] = triplets[i];
            }
        }
        // Check for distinct (not double) values in all 3 triplets of a triplet group.
        // Compare every value with all other values
        bool distinct = true;
        for (size_t ii{}; distinct && (ii < 3); ++ii) for (size_t kk{}; distinct && (kk < 3); ++kk) {
            distinct = distinct && (tripletGroup[0][ii] != tripletGroup[1][kk]);
            distinct = distinct && (tripletGroup[0][ii] != tripletGroup[2][kk]);
            distinct = distinct && (tripletGroup[1][ii] != tripletGroup[2][kk]);
        }
        // If the triplets are distinct, then we add the triplet group to the result
        if (distinct) {
            distinctTripletGroups[groupCounter++] = (std::move(tripletGroup));
        }
        // Next triplet goup selection
    } while (std::next_permutation(selectors2.begin(), selectors2.end()));


    // Holding the result of our distance calculations
    struct DistanceData {
        size_t threeTripletsIndex{};        // The index of the triplet group. Is in the begiining 0,1,2,3,4,5
        int distanceForThreeTriplets{};     // Distance of Triplets in triplet group
        std::array<int, 3> tripletSums{};   // Sums of single triplets in a group
    };
    static std::array<DistanceData, NumberOfTripletGoups> distanceData{}; // Static --> Put on heap

    // Calculate the distance data. Simply subtract the min sum of a triplet from the max sum of a triplet for one triplet-group
    for (size_t tripletGroup{}; tripletGroup < distinctTripletGroups.size(); ++tripletGroup) {
        for (size_t tripletIndex{}; tripletIndex < 3; ++tripletIndex)
            for (size_t indexInTriplet{}; indexInTriplet < 3; ++indexInTriplet)
                // Calculate the sum of one single triplet
                distanceData[tripletGroup].tripletSums[tripletIndex] += arr[distinctTripletGroups[tripletGroup][tripletIndex][indexInTriplet]];

        // Calculate the distannce for the three triplets
        distanceData[tripletGroup].distanceForThreeTriplets = std::max(std::max(distanceData[tripletGroup].tripletSums[0], distanceData[tripletGroup].tripletSums[1]), distanceData[tripletGroup].tripletSums[2]) -
            std::min(std::min(distanceData[tripletGroup].tripletSums[0], distanceData[tripletGroup].tripletSums[1]), distanceData[tripletGroup].tripletSums[2]);
        // And the index (Just neded for sorting later). Is alwyas equal to running loop variable
        distanceData[tripletGroup].threeTripletsIndex = tripletGroup;
    }

    // Sort result with distances, sum, and three-triplet index
    std::sort(distanceData.begin(), distanceData.end(), [](const DistanceData& d1, const DistanceData& d2) { return d1.distanceForThreeTriplets < d2.distanceForThreeTriplets; });

    // Show pretty printed output to user
    for (size_t tripletGroup{}; tripletGroup < distinctTripletGroups.size(); ++tripletGroup) {

        // Show the distance for 3 found triplets
        std::cout << std::right << std::setw(5) << tripletGroup + 1 << ".  Distance: " << std::setw(2) << distanceData[tripletGroup].distanceForThreeTriplets << '\t';

        // For each triplet in the set of 3 triplets
        for (size_t tripletIndex{}; tripletIndex < 3; ++tripletIndex) {

            // For each value of one single triplet
            for (size_t indexInTriplet{}; indexInTriplet < 3; ++indexInTriplet) {
                std::cout << arr[distinctTripletGroups[distanceData[tripletGroup].threeTripletsIndex][tripletIndex][indexInTriplet]] << " ";
            }
            // Show the sum of 1 triplet
            std::cout << "[" << std::setw(2) << distanceData[tripletGroup].tripletSums[tripletIndex] << "]\t";
        }
        std::cout << "\n";
    }
    return 0;
}

所有数组大小都可以是编译时常量。

不需要动态内存处理。