这个算法真的有效吗?子集总和回溯算法
Does this algorithm actually work? Sum of subsets backtracking algorithm
我想知道这个回溯算法是否真的有效。
在教科书Foundations of Algorithms第5th版中定义如下:
Algorithm 5.4: The Backtracking Algorithm for the Sum-of-Subsets Problem
Problem: Given n positive integers (weights) and a positive integer W,
determine all combinations of the integers that sum up to W.
Inputs: positvie integer n, sorted (nondecreasing order) array of
positive integers w indexed from 1 to n, and a positive integer
W.
Outputs: all combinations of the integers that sum to W.
void sum_of_subsets(index i,
int weight, int total) {
if (promising(i))
if (weight == W)
cout << include[1] through include [i];
else {
include[i + 1] = "yes"; // Include w[i + 1].
sum_of_subsets(i + 1, weight + w[i + 1], total - w[i + 1]);
include[i + 1] = "no"; // Do not include w[i + 1].
sum_of_subsets(i + 1, weight, total - w[i + 1]);
}
}
bool promising (index i); {
return (weight + total >= W) && (weight == W || weight + w[i + 1] <= W);
}
Following our usual convention, n, w, W, and include are not
inputs to our routines. If these variables were defined globally, the
top-level call to sum_of_subsets would be as follows:
sum_of_subsets(0, 0, total);
在第 5 章的结尾,exercise 13 提问:
Use the Backtracking algorithm for the Sum-of-Subsets problem (Algorithm 5.4)
to find all combinations of the following numbers that sum to W = 52:
w1 = 2 w2 = 10 w3 = 13 w4 = 17 w5 = 22 w6 = 42
我已经实现了这个精确的算法,考虑到从 1 开始的数组,但它不起作用...
void sos(int i, int weight, int total) {
int yes = 1;
int no = 0;
if (promising(i, weight, total)) {
if (weight == W) {
for (int j = 0; j < arraySize; j++) {
std::cout << include[j] << " ";
}
std::cout << "\n";
}
else if(i < arraySize) {
include[i+1] = yes;
sos(i + 1, weight + w[i+1], total - w[i+1]);
include[i+1] = no;
sos(i + 1, weight, total - w[i+1]);
}
}
}
int promising(int i, int weight, int total) {
return (weight + total >= W) && (weight == W || weight + w[i+1] <= W);
}
我认为问题出在这里:
sos(i + 1, weight, total - w[i+1]);
sum_of_subsets(i+1, weight, total-w[i+1]);
当你到达这条线时,你没有正确回溯。
是否有人能够识别此算法的问题或实际编写代码使其工作?
我个人觉得算法有问题。没有边界检查,它使用了很多全局变量,并且假定数组是从 1 开始索引的。我不认为你可以逐字复制它。实际实现是伪代码。在 C++ 中,数组总是从 0 开始。因此,当您尝试执行 include[i+1]
而您只检查 i < arraySize
.
时,您可能会遇到问题
该算法还假定您有一个名为 total
的全局变量,它由函数 promising
.
使用
我稍微修改了代码,将其放在 class 中,并稍微简化了它:
class Solution
{
private:
vector<int> w;
vector<int> include;
public:
Solution(vector<int> weights) : w(std::move(weights)),
include(w.size(), 0) {}
void sos(int i, int weight, int total) {
int yes = 1;
int no = 0;
int arraySize = include.size();
if (weight == total) {
for (int j = 0; j < arraySize; j++) {
if (include[j]) {
std::cout << w[j] << " ";
}
}
std::cout << "\n";
}
else if (i < arraySize)
{
include[i] = yes;
//Include this weight
sos(i + 1, weight + w[i], total);
include[i] = no;
//Exclude this weight
sos(i + 1, weight, total);
}
}
};
int main()
{
Solution solution({ 2, 10, 13, 17, 22, 42 });
solution.sos(0, 0, 52);
//prints: 10 42
// 13 17 22
}
除了您没有在输出循环中应用基于 1 的数组逻辑外,代码正确地实现了算法。变化:
for (int j = 0; j < arraySize; j++) {
std::cout << include[j] << " ";
}
至:
for (int j = 0; j < arraySize; j++) {
std::cout << include[j+1] << " ";
}
根据您组织代码的方式,确保在定义 sos
时定义 promising
。
在 repl.it 上查看 运行。输出:
0 1 0 0 0 1
0 0 1 1 1 0
该算法工作正常:sos
函数的第二个和第三个参数充当 window,其中 运行ning sum 应该保留,并且 promising
函数验证此 window。这个 window 之外的任何值要么太小(即使将所有剩余值都添加到它,它仍然会小于目标值),要么太大(已经超过 运行 目标).这两个约束在书中第5.4章开头有解释。
在每个索引处有两种可能的选择:包括总和中的值,或者不包括。 includes[i+1]
处的值表示此选择,并且都已尝试。当在这种递归尝试的深层匹配时,将输出所有这些选择(0 或 1)。否则,它们将被忽略并在第二次尝试中切换到相反的选择。
所以是的,正如其他人所指出的,您无意中发现了基于 1 的数组索引。
除此之外,我认为你应该向作者索要你为这本书支付的部分return,因为他的代码逻辑过于复杂。
避免 运行 出现边界问题的一个好方法是不使用 C++(期待对此大声笑的大量反对票)。
只有 3 个案例需要测试:
- 候选值大于剩余值。 (失败)
- 候选值正好是剩下的
- 候选值小于剩余值
promising
函数试图表达这一点,然后在主函数 sos
.
中再次重新测试该函数的结果
但它可能看起来像这样简单:
search :: [Int] -> Int -> [Int] -> [[Int]]
search (x1:xs) t path
| x1 > t = []
| x1 == t = [x1 : path]
| x1 < t = search xs (t-x1) (x1 : path) ++ search xs t path
search [] 0 path = [path]
search [] _ _ = []
items = [2, 10, 13, 17, 22, 42] :: [Int]
target = 52 :: Int
search items target []
-- [[42,10],[22,17,13]]
现在,在编写 C++ 代码时实现类似的安全网绝不是不可能的。但这需要决心和有意识地决定你愿意应对什么,不应对什么。而且您需要愿意多输入几行才能完成 Haskell 的 10 行所做的工作。
首先,我对原始 C++ 代码中索引和范围检查的所有复杂性感到困扰。如果我们查看 Haskell 代码(与列表一起使用),
确认我们根本不需要随机访问。我们只会查看剩余项目的开始。然后我们将一个值附加到路径(在 Haskell 中,我们附加到前面是因为速度),最终我们将找到的组合附加到结果集中。考虑到这一点,担心指数有点过头了。
其次,我更喜欢搜索功能的外观 - 显示 3 个关键测试,周围没有任何噪音。我的 C++ 版本应该尽可能漂亮。
另外,全局变量是 1980 年的东西——我们不会有那个。将那些 "globals" 塞进 class 中以稍微隐藏它们是 1995 年的事了。我们也不会有那个。
它就在这里! "safer" C++ 实现。更漂亮……嗯……有些人可能不同意 ;)
#include <cstdint>
#include <vector>
#include <iostream>
using Items_t = std::vector<int32_t>;
using Result_t = std::vector<Items_t>;
// The C++ way of saying: deriving(Show)
template <class T>
std::ostream& operator <<(std::ostream& os, const std::vector<T>& value)
{
bool first = true;
os << "[";
for( const auto item : value)
{
if(first)
{
os << item;
first = false;
}
else
{
os << "," << item;
}
}
os << "]";
return os;
}
// So we can do easy switch statement instead of chain of ifs.
enum class Comp : int8_t
{ LT = -1
, EQ = 0
, GT = 1
};
static inline
auto compI32( int32_t left, int32_t right ) -> Comp
{
if(left == right) return Comp::EQ;
if(left < right) return Comp::LT;
return Comp::GT;
}
// So we can avoid index insanity and out of bounds problems.
template <class T>
struct VecRange
{
using Iter_t = typename std::vector<T>::const_iterator;
Iter_t current;
Iter_t end;
VecRange(const std::vector<T>& v)
: current{v.cbegin()}
, end{v.cend()}
{}
VecRange(Iter_t cur, Iter_t fin)
: current{cur}
, end{fin}
{}
static bool exhausted (const VecRange<T>&);
static VecRange<T> next(const VecRange<T>&);
};
template <class T>
bool VecRange<T>::exhausted(const VecRange<T>& range)
{
return range.current == range.end;
}
template <class T>
VecRange<T> VecRange<T>::next(const VecRange<T>& range)
{
if(range.current != range.end)
return VecRange<T>( range.current + 1, range.end );
return range;
}
using ItemsRange = VecRange<Items_t::value_type>;
static void search( const ItemsRange items, int32_t target, Items_t path, Result_t& result)
{
if(ItemsRange::exhausted(items))
{
if(0 == target)
{
result.push_back(path);
}
return;
}
switch(compI32(*items.current,target))
{
case Comp::GT:
return;
case Comp::EQ:
{
path.push_back(*items.current);
result.push_back(path);
}
return;
case Comp::LT:
{
auto path1 = path; // hope this makes a real copy...
path1.push_back(*items.current);
search(ItemsRange::next(items), target - *items.current, path1, result);
search(ItemsRange::next(items), target, path, result);
}
return;
}
}
int main(int argc, const char* argv[])
{
Items_t items{ 2, 10, 13, 17, 22, 42 };
Result_t result;
int32_t target = 52;
std::cout << "Input: " << items << std::endl;
std::cout << "Target: " << target << std::endl;
search(ItemsRange{items}, target, Items_t{}, result);
std::cout << "Output: " << result << std::endl;
return 0;
}
我想知道这个回溯算法是否真的有效。
在教科书Foundations of Algorithms第5th版中定义如下:
Algorithm 5.4: The Backtracking Algorithm for the Sum-of-Subsets Problem
Problem: Given n positive integers (weights) and a positive integer W, determine all combinations of the integers that sum up to W.
Inputs: positvie integer n, sorted (nondecreasing order) array of positive integers w indexed from 1 to n, and a positive integer W.
Outputs: all combinations of the integers that sum to W.
void sum_of_subsets(index i, int weight, int total) { if (promising(i)) if (weight == W) cout << include[1] through include [i]; else { include[i + 1] = "yes"; // Include w[i + 1]. sum_of_subsets(i + 1, weight + w[i + 1], total - w[i + 1]); include[i + 1] = "no"; // Do not include w[i + 1]. sum_of_subsets(i + 1, weight, total - w[i + 1]); } } bool promising (index i); { return (weight + total >= W) && (weight == W || weight + w[i + 1] <= W); }
Following our usual convention, n, w, W, and include are not inputs to our routines. If these variables were defined globally, the top-level call to sum_of_subsets would be as follows:
sum_of_subsets(0, 0, total);
在第 5 章的结尾,exercise 13 提问:
Use the Backtracking algorithm for the Sum-of-Subsets problem (Algorithm 5.4) to find all combinations of the following numbers that sum to W = 52:
w1 = 2 w2 = 10 w3 = 13 w4 = 17 w5 = 22 w6 = 42
我已经实现了这个精确的算法,考虑到从 1 开始的数组,但它不起作用...
void sos(int i, int weight, int total) {
int yes = 1;
int no = 0;
if (promising(i, weight, total)) {
if (weight == W) {
for (int j = 0; j < arraySize; j++) {
std::cout << include[j] << " ";
}
std::cout << "\n";
}
else if(i < arraySize) {
include[i+1] = yes;
sos(i + 1, weight + w[i+1], total - w[i+1]);
include[i+1] = no;
sos(i + 1, weight, total - w[i+1]);
}
}
}
int promising(int i, int weight, int total) {
return (weight + total >= W) && (weight == W || weight + w[i+1] <= W);
}
我认为问题出在这里:
sos(i + 1, weight, total - w[i+1]);
sum_of_subsets(i+1, weight, total-w[i+1]);
当你到达这条线时,你没有正确回溯。
是否有人能够识别此算法的问题或实际编写代码使其工作?
我个人觉得算法有问题。没有边界检查,它使用了很多全局变量,并且假定数组是从 1 开始索引的。我不认为你可以逐字复制它。实际实现是伪代码。在 C++ 中,数组总是从 0 开始。因此,当您尝试执行 include[i+1]
而您只检查 i < arraySize
.
该算法还假定您有一个名为 total
的全局变量,它由函数 promising
.
我稍微修改了代码,将其放在 class 中,并稍微简化了它:
class Solution
{
private:
vector<int> w;
vector<int> include;
public:
Solution(vector<int> weights) : w(std::move(weights)),
include(w.size(), 0) {}
void sos(int i, int weight, int total) {
int yes = 1;
int no = 0;
int arraySize = include.size();
if (weight == total) {
for (int j = 0; j < arraySize; j++) {
if (include[j]) {
std::cout << w[j] << " ";
}
}
std::cout << "\n";
}
else if (i < arraySize)
{
include[i] = yes;
//Include this weight
sos(i + 1, weight + w[i], total);
include[i] = no;
//Exclude this weight
sos(i + 1, weight, total);
}
}
};
int main()
{
Solution solution({ 2, 10, 13, 17, 22, 42 });
solution.sos(0, 0, 52);
//prints: 10 42
// 13 17 22
}
除了您没有在输出循环中应用基于 1 的数组逻辑外,代码正确地实现了算法。变化:
for (int j = 0; j < arraySize; j++) {
std::cout << include[j] << " ";
}
至:
for (int j = 0; j < arraySize; j++) {
std::cout << include[j+1] << " ";
}
根据您组织代码的方式,确保在定义 sos
时定义 promising
。
在 repl.it 上查看 运行。输出:
0 1 0 0 0 1
0 0 1 1 1 0
该算法工作正常:sos
函数的第二个和第三个参数充当 window,其中 运行ning sum 应该保留,并且 promising
函数验证此 window。这个 window 之外的任何值要么太小(即使将所有剩余值都添加到它,它仍然会小于目标值),要么太大(已经超过 运行 目标).这两个约束在书中第5.4章开头有解释。
在每个索引处有两种可能的选择:包括总和中的值,或者不包括。 includes[i+1]
处的值表示此选择,并且都已尝试。当在这种递归尝试的深层匹配时,将输出所有这些选择(0 或 1)。否则,它们将被忽略并在第二次尝试中切换到相反的选择。
所以是的,正如其他人所指出的,您无意中发现了基于 1 的数组索引。
除此之外,我认为你应该向作者索要你为这本书支付的部分return,因为他的代码逻辑过于复杂。
避免 运行 出现边界问题的一个好方法是不使用 C++(期待对此大声笑的大量反对票)。
只有 3 个案例需要测试:
- 候选值大于剩余值。 (失败)
- 候选值正好是剩下的
- 候选值小于剩余值
promising
函数试图表达这一点,然后在主函数 sos
.
但它可能看起来像这样简单:
search :: [Int] -> Int -> [Int] -> [[Int]]
search (x1:xs) t path
| x1 > t = []
| x1 == t = [x1 : path]
| x1 < t = search xs (t-x1) (x1 : path) ++ search xs t path
search [] 0 path = [path]
search [] _ _ = []
items = [2, 10, 13, 17, 22, 42] :: [Int]
target = 52 :: Int
search items target []
-- [[42,10],[22,17,13]]
现在,在编写 C++ 代码时实现类似的安全网绝不是不可能的。但这需要决心和有意识地决定你愿意应对什么,不应对什么。而且您需要愿意多输入几行才能完成 Haskell 的 10 行所做的工作。
首先,我对原始 C++ 代码中索引和范围检查的所有复杂性感到困扰。如果我们查看 Haskell 代码(与列表一起使用), 确认我们根本不需要随机访问。我们只会查看剩余项目的开始。然后我们将一个值附加到路径(在 Haskell 中,我们附加到前面是因为速度),最终我们将找到的组合附加到结果集中。考虑到这一点,担心指数有点过头了。
其次,我更喜欢搜索功能的外观 - 显示 3 个关键测试,周围没有任何噪音。我的 C++ 版本应该尽可能漂亮。
另外,全局变量是 1980 年的东西——我们不会有那个。将那些 "globals" 塞进 class 中以稍微隐藏它们是 1995 年的事了。我们也不会有那个。
它就在这里! "safer" C++ 实现。更漂亮……嗯……有些人可能不同意 ;)
#include <cstdint>
#include <vector>
#include <iostream>
using Items_t = std::vector<int32_t>;
using Result_t = std::vector<Items_t>;
// The C++ way of saying: deriving(Show)
template <class T>
std::ostream& operator <<(std::ostream& os, const std::vector<T>& value)
{
bool first = true;
os << "[";
for( const auto item : value)
{
if(first)
{
os << item;
first = false;
}
else
{
os << "," << item;
}
}
os << "]";
return os;
}
// So we can do easy switch statement instead of chain of ifs.
enum class Comp : int8_t
{ LT = -1
, EQ = 0
, GT = 1
};
static inline
auto compI32( int32_t left, int32_t right ) -> Comp
{
if(left == right) return Comp::EQ;
if(left < right) return Comp::LT;
return Comp::GT;
}
// So we can avoid index insanity and out of bounds problems.
template <class T>
struct VecRange
{
using Iter_t = typename std::vector<T>::const_iterator;
Iter_t current;
Iter_t end;
VecRange(const std::vector<T>& v)
: current{v.cbegin()}
, end{v.cend()}
{}
VecRange(Iter_t cur, Iter_t fin)
: current{cur}
, end{fin}
{}
static bool exhausted (const VecRange<T>&);
static VecRange<T> next(const VecRange<T>&);
};
template <class T>
bool VecRange<T>::exhausted(const VecRange<T>& range)
{
return range.current == range.end;
}
template <class T>
VecRange<T> VecRange<T>::next(const VecRange<T>& range)
{
if(range.current != range.end)
return VecRange<T>( range.current + 1, range.end );
return range;
}
using ItemsRange = VecRange<Items_t::value_type>;
static void search( const ItemsRange items, int32_t target, Items_t path, Result_t& result)
{
if(ItemsRange::exhausted(items))
{
if(0 == target)
{
result.push_back(path);
}
return;
}
switch(compI32(*items.current,target))
{
case Comp::GT:
return;
case Comp::EQ:
{
path.push_back(*items.current);
result.push_back(path);
}
return;
case Comp::LT:
{
auto path1 = path; // hope this makes a real copy...
path1.push_back(*items.current);
search(ItemsRange::next(items), target - *items.current, path1, result);
search(ItemsRange::next(items), target, path, result);
}
return;
}
}
int main(int argc, const char* argv[])
{
Items_t items{ 2, 10, 13, 17, 22, 42 };
Result_t result;
int32_t target = 52;
std::cout << "Input: " << items << std::endl;
std::cout << "Target: " << target << std::endl;
search(ItemsRange{items}, target, Items_t{}, result);
std::cout << "Output: " << result << std::endl;
return 0;
}