这个算法真的有效吗?子集总和回溯算法

Does this algorithm actually work? Sum of subsets backtracking algorithm

我想知道这个回溯算法是否真的有效。

在教科书Foundations of Algorithms第5th版中定义如下:

Algorithm 5.4: The Backtracking Algorithm for the Sum-of-Subsets Problem

Problem: Given n positive integers (weights) and a positive integer W, determine all combinations of the integers that sum up to W.

Inputs: positvie integer n, sorted (nondecreasing order) array of positive integers w indexed from 1 to n, and a positive integer W.

Outputs: all combinations of the integers that sum to W.

void sum_of_subsets(index i, 
                    int weight, int total)  {
  if (promising(i))
     if (weight == W)
        cout << include[1] through include [i];
     else {
        include[i + 1] = "yes";               // Include w[i + 1].
        sum_of_subsets(i + 1, weight + w[i + 1], total - w[i + 1]);
        include[i + 1] = "no";                // Do not include w[i + 1].
        sum_of_subsets(i + 1, weight, total - w[i + 1]);
     }
}

bool promising (index i);  {
  return (weight + total >= W) && (weight == W || weight + w[i + 1] <= W);
}

Following our usual convention, n, w, W, and include are not inputs to our routines. If these variables were defined globally, the top-level call to sum_of_subsets would be as follows:

sum_of_subsets(0, 0, total);

在第 5 章的结尾,exercise 13 提问:

  1. Use the Backtracking algorithm for the Sum-of-Subsets problem (Algorithm 5.4) to find all combinations of the following numbers that sum to W = 52:

    w1 = 2     w2 = 10     w3 = 13     w4 = 17     w5 = 22     w6 = 42

我已经实现了这个精确的算法,考虑到从 1 开始的数组,但它不起作用...

 void sos(int i, int weight, int total) {
    int yes = 1;
    int no = 0;

    if (promising(i, weight, total)) {
        if (weight == W) {
            for (int j = 0; j < arraySize; j++) {
                std::cout << include[j] << " ";
            }
            std::cout << "\n";
        }
        else if(i < arraySize) {
            include[i+1] = yes;
            sos(i + 1, weight + w[i+1], total - w[i+1]);
            include[i+1] = no;
            sos(i + 1, weight, total - w[i+1]);
        }
    }
}


int promising(int i,  int weight, int total) {
    return (weight + total >= W) && (weight == W || weight + w[i+1] <= W);
}   

我认为问题出在这里:

sos(i + 1, weight, total - w[i+1]);
sum_of_subsets(i+1, weight, total-w[i+1]);

当你到达这条线时,你没有正确回溯。

是否有人能够识别此算法的问题或实际编写代码使其工作?

我个人觉得算法有问题。没有边界检查,它使用了很多全局变量,并且假定数组是从 1 开始索引的。我不认为你可以逐字复制它。实际实现是伪代码。在 C++ 中,数组总是从 0 开始。因此,当您尝试执行 include[i+1] 而您只检查 i < arraySize.

时,您可能会遇到问题

该算法还假定您有一个名为 total 的全局变量,它由函数 promising.

使用

我稍微修改了代码,将其放在 class 中,并稍微简化了它:

class Solution
{
private:
    vector<int> w;
    vector<int> include;

public:
    Solution(vector<int> weights) : w(std::move(weights)),
        include(w.size(), 0) {}

    void sos(int i, int weight, int total) {
        int yes = 1;
        int no = 0;
        int arraySize = include.size();

        if (weight == total) {
            for (int j = 0; j < arraySize; j++) {
                if (include[j]) {
                    std::cout << w[j] << " ";
                }
            }
            std::cout << "\n";
        }
        else if (i < arraySize)
        {
            include[i] = yes;
            //Include this weight
            sos(i + 1, weight + w[i], total);
            include[i] = no;
            //Exclude this weight
            sos(i + 1, weight, total);
        }
    }
};

int main()
{   
    Solution solution({ 2, 10, 13, 17, 22, 42 });
    solution.sos(0, 0, 52);
    //prints:    10 42
    //           13 17 22
}

除了您没有在输出循环中应用基于 1 的数组逻辑外,代码正确地实现了算法。变化:

for (int j = 0; j < arraySize; j++) {
    std::cout << include[j] << " ";
}

至:

for (int j = 0; j < arraySize; j++) {
    std::cout << include[j+1] << " ";
}

根据您组织代码的方式,确保在定义 sos 时定义 promising

repl.it 上查看 运行。输出:

0 1 0 0 0 1
0 0 1 1 1 0

该算法工作正常:sos 函数的第二个和第三个参数充当 window,其中 运行ning sum 应该保留,并且 promising 函数验证此 window。这个 window 之外的任何值要么太小(即使将所有剩余值都添加到它,它仍然会小于目标值),要么太大(已经超过 运行 目标).这两个约束在书中第5.4章开头有解释。

在每个索引处有两种可能的选择:包括总和中的值,或者不包括。 includes[i+1] 处的值表示此选择,并且都已尝试。当在这种递归尝试的深层匹配时,将输出所有这些选择(0 或 1)。否则,它们将被忽略并在第二次尝试中切换到相反的选择。

所以是的,正如其他人所指出的,您无意中发现了基于 1 的数组索引。

除此之外,我认为你应该向作者索要你为这本书支付的部分return,因为他的代码逻辑过于复杂。

避免 运行 出现边界问题的一个好方法是不使用 C++(期待对此大声笑的大量反对票)。

只有 3 个案例需要测试:

  • 候选值大于剩余值。 (失败)
  • 候选值正好是剩下的
  • 候选值小于剩余值

promising 函数试图表达这一点,然后在主函数 sos.

中再次重新测试该函数的结果

但它可能看起来像这样简单:

search :: [Int] -> Int -> [Int] -> [[Int]]
search (x1:xs) t path 
    | x1 > t = []
    | x1 == t = [x1 : path]
    | x1 < t = search xs (t-x1) (x1 : path) ++ search xs t path
search [] 0 path = [path]
search [] _ _ = []


items = [2, 10, 13, 17, 22, 42] :: [Int]
target = 52 :: Int

search items target []
-- [[42,10],[22,17,13]]

现在,在编写 C++ 代码时实现类似的安全网绝不是不可能的。但这需要决心和有意识地决定你愿意应对什么,不应对什么。而且您需要愿意多输入几行才能完成 Haskell 的 10 行所做的工作。

首先,我对原始 C++ 代码中索引和范围检查的所有复杂性感到困扰。如果我们查看 Haskell 代码(与列表一起使用), 确认我们根本不需要随机访问。我们只会查看剩余项目的开始。然后我们将一个值附加到路径(在 Haskell 中,我们附加到前面是因为速度),最终我们将找到的组合附加到结果集中。考虑到这一点,担心指数有点过头了。

其次,我更喜欢搜索功能的外观 - 显示 3 个关键测试,周围没有任何噪音。我的 C++ 版本应该尽可能漂亮。

另外,全局变量是 1980 年的东西——我们不会有那个。将那些 "globals" 塞进 class 中以稍微隐藏它们是 1995 年的事了。我们也不会有那个。

它就在这里! "safer" C++ 实现。更漂亮……嗯……有些人可能不同意 ;)

#include <cstdint>
#include <vector>
#include <iostream>

using Items_t = std::vector<int32_t>;
using Result_t = std::vector<Items_t>;

// The C++ way of saying: deriving(Show)
template <class T>
std::ostream& operator <<(std::ostream& os, const std::vector<T>& value) 
{
    bool first = true;
    os << "[";
    for( const auto item : value) 
    {
        if(first) 
        {
            os << item;
            first = false;
        }
        else
        {
            os << "," << item;
        }
    }
    os << "]";
    return os;
}

// So we can do easy switch statement instead of chain of ifs.
enum class Comp : int8_t 
{   LT = -1
,   EQ = 0
,   GT = 1
};

static inline 
auto compI32( int32_t left, int32_t right ) -> Comp
{
    if(left == right) return Comp::EQ;
    if(left < right) return Comp::LT;
    return Comp::GT;
}

// So we can avoid index insanity and out of bounds problems.
template <class T>
struct VecRange
{
    using Iter_t = typename std::vector<T>::const_iterator;
    Iter_t current;
    Iter_t end;
    VecRange(const std::vector<T>& v)
        : current{v.cbegin()}
        , end{v.cend()}
    {}
    VecRange(Iter_t cur, Iter_t fin)
        : current{cur}
        , end{fin}
    {}
    static bool exhausted (const VecRange<T>&);
    static VecRange<T> next(const VecRange<T>&);
};

template <class T>
bool VecRange<T>::exhausted(const VecRange<T>& range)
{
    return range.current == range.end;
}

template <class T>
VecRange<T> VecRange<T>::next(const VecRange<T>& range)
{
    if(range.current != range.end)
        return VecRange<T>( range.current + 1, range.end );
    return range;   
}

using ItemsRange = VecRange<Items_t::value_type>;


static void search( const ItemsRange items, int32_t target, Items_t path, Result_t& result)
{
    if(ItemsRange::exhausted(items))
    {
        if(0 == target)
        {
            result.push_back(path);
        }
        return;
    }

    switch(compI32(*items.current,target))
    {
        case Comp::GT: 
            return;
        case Comp::EQ:
            {
                path.push_back(*items.current);
                result.push_back(path);
            }
            return;
        case Comp::LT:
            {
                auto path1 = path; // hope this makes a real copy...
                path1.push_back(*items.current);
                search(ItemsRange::next(items), target - *items.current, path1, result);
                search(ItemsRange::next(items), target, path, result);
            }
            return;
    }
}

int main(int argc, const char* argv[])
{
    Items_t items{ 2, 10, 13, 17, 22, 42 };
    Result_t result;
    int32_t target = 52;

    std::cout << "Input: "  << items << std::endl;
    std::cout << "Target: " << target << std::endl;
    search(ItemsRange{items}, target, Items_t{}, result);
    std::cout << "Output: " << result << std::endl;
    return 0;
}