使用可变参数模板和 lambda 函数进行二进制搜索

Binary search using variadic templates and lambda functions

考虑一下,

struct Person {
    std::string name;
    Person (const std::string& n) : name(n) {}
    std::string getName(int, char) const {return name;}  // int, char play no role in this
        // simple example, but let's suppose that they are needed.
} *Bob = new Person("Bob"), *Frank = new Person("Frank"), *Mark = new Person("Mark"),
    *Tom = new Person("Tom"), *Zack = new Person("Zack");

const std::vector<Person*> people = {Bob, Frank, Mark, Tom, Zack};

因为people是按名字排序的,我们可以进行二分查找,找到people中具有特定名字的元素。我希望这个调用看起来像

Person* person = binarySearch (people, "Tom",
   [](Person* p, int n, char c) {return p->getName(n,c);},
   [](const std::string& x, const std::string& y) {return x.compare(y) < 0;}, 5, 'a');

所以模板函数binarySearch可以通用。我得到它与以下工作:

#include <iostream>
#include <string>
#include <vector>
#include <functional>

struct Person {
    std::string name;
    Person (const std::string& n) : name(n) {}
    std::string getName(int, char) const {return name;}  // int, char play no role in this
        // simple example, but let's supposes that they are needed.
} *Bob = new Person("Bob"), *Frank = new Person("Frank"), *Mark = new Person("Mark"),
    *Tom = new Person("Tom"), *Zack = new Person("Zack");

const std::vector<Person*> people = {Bob, Frank, Mark, Tom, Zack};

template <typename Container, typename Ret>
typename Container::value_type binarySearch (const Container& container, const Ret& value,
        std::function<Ret(const typename Container::value_type&, int, char)> f,
        std::function<bool(const Ret&, const Ret&)> comp,
        typename Container::difference_type low, typename Container::difference_type high,
        int n, char c) {
    if (low > high)
        std::cout << "Error!  Not found!\n";
    const typename Container::difference_type mid = (low + high) / 2;
    const Ret& r = f(container[mid], n, c);
    if (r == value)
        return container[mid];
    if (comp(r, value))
        return binarySearch (container, value, f, comp, mid + 1, high, n, c);
    return binarySearch (container, value, f, comp, low, mid - 1, n, c);
}

template <typename Container, typename Ret>
typename Container::value_type binarySearch (const Container& container, const Ret& value,
        std::function<Ret(const typename Container::value_type&, int, char)> f,
        std::function<bool(const Ret&, const Ret&)> comp, int n, char c) {
    return binarySearch (container, value, f, comp, 0, container.size() - 1, n, c);
}

int main() {
    const Person* person = binarySearch<std::vector<Person*>, std::string>
        (people, "Tom", &Person::getName,
        [](const std::string& x, const std::string& y) {return x.compare(y) < 0;}, 5, 'a');
    std::cout << person->getName(5,'a') << '\n';  // Tom
}

但现在由于我不明白的原因,我无法将特定参数 int, char 替换为 Args...。您可以继续将 Args... argsargs... 放在上面代码中需要的地方,它不会编译。这里有什么问题?如何进行泛化的最后一步?还是应该改变整个方法?

这是我试过的:

template <typename Container, typename Ret, typename... Args>
typename Container::value_type binarySearch (const Container& container, const Ret& value,
        std::function<Ret(const typename Container::value_type&, Args...)> f,
        std::function<bool(const Ret&, const Ret&)> comp,
        typename Container::difference_type low, typename Container::difference_type high,
        Args... args) {
    if (low > high)
        std::cout << "Error!  Not found!\n";
    const typename Container::difference_type mid = (low + high) / 2;
    const Ret& r = f(container[mid], args...);
    if (r == value)
        return container[mid];
    if (comp(r, value))
        return binarySearch (container, value, f, comp, mid + 1, high, args...);
    return binarySearch (container, value, f, comp, low, mid - 1, args...);
}

template <typename Container, typename Ret, typename... Args>
typename Container::value_type binarySearch (const Container& container, const Ret& value,
        std::function<Ret(const typename Container::value_type&, Args...)> f,
        std::function<bool(const Ret&, const Ret&)> comp, Args... args) {
    return binarySearch (container, value, f, comp, 0, container.size() - 1, args...);
}

int main() {
    const Person* person = binarySearch<std::vector<Person*>, std::string> (people, "Tom",
            &Person::getName,
        [](const std::string& x, const std::string& y) {return x.compare(y) < 0;}, 5, 'a');
    std::cout << person->getName(5,'a') << '\n';
}

海湾合作委员会 4.9.2:

[Error] no matching function for call to 'binarySearch(std::vector<Person*>&, const char [4], main()::__lambda0, main()::__lambda1, int, char)'
template argument deduction/substitution failed:
[Note] 'main()::__lambda0' is not derived from 'std::function<std::basic_string<char>(Person* const&, Args ...)>'

更新: 在研究了 Yakk 的解决方案之后,我将我的解决方案调整为以下内容(使用更多的第一原则而不是 std::equal_range):

#include <iostream>
#include <iterator>

template <typename Container, typename T, typename Comparator = std::less<T>>
typename Container::value_type binarySearchRandomAccessIterator (const Container& container, T&& value, Comparator&& compare, typename Container::difference_type low, typename Container::difference_type high) {
    if (low > high)
        {std::cout << "Error!  Not found!\n";  return container[high];}
    const typename Container::difference_type mid = (low + high) / 2;
    const auto& t = compare.function(container[mid]);  // Using 'const T& t' does not compile.
    if (t == value)
        return container[mid];
    if (compare.comparator(t, value))  // 't' is less than 'value' according to compare.comparator, so search in the top half.
        return binarySearchRandomAccessIterator (container, value, compare, mid + 1, high);
    return binarySearchRandomAccessIterator (container, value, compare, low, mid - 1);  // i.e. 'value' is less than 't' according to compare.comparator, so search in the bottom half.
}

template <typename ForwardIterator, typename T, typename Comparator = std::less<T>>
typename std::iterator_traits<ForwardIterator>::value_type binarySearchNonRandomAccessIterator (ForwardIterator first, ForwardIterator last, T&& value, Comparator&& compare) {
    ForwardIterator it;
    typename std::iterator_traits<ForwardIterator>::difference_type count, step;
    count = std::distance(first, last);
    while (count > 0) {  // Binary search using iterators carried out.
        it = first;
        step = count / 2;
        std::advance(it, step);  // This is done in O(step) time since ForwardIterator is not a random-access iterator (else it is done in constant time).  But the good news is that 'step' becomes half as small with each iteration of this loop.
        const auto& t = compare.function(*it);  // Using 'const T& t' does not compile.
        if (compare.comparator(t, value)) {  // 't' is less than 'value' according to compare.comparator, so search in the top half.
            first = ++it;  // Thus first will move to one past the half-way point, and we search from there.
            count -= step + 1;  // count is decreased by half plus 1.
        }
        else  // 't' is greater than 'value' according to compare.comparator, so remain in the bottom half.
            count = step;  // 'count' and 'step' are both decreased by half.
    }
    if (compare.function(*first) != value)
        std::cout << "Error!  Not found!\n";
    return *first;
}

template <typename Container, typename T, typename Comparator = std::less<T>>  // Actually the version below could be used if Container has a random-access iterator.  It would be with the same time complexity since std::advance has O(1) time complexity for random-access iterators.
typename std::enable_if<std::is_same<typename std::iterator_traits<typename Container::iterator>::iterator_category, std::random_access_iterator_tag>::value, typename Container::value_type>::type
        binarySearch (const Container& container, T&& value, Comparator&& compare = {}) {
    std::cout << "Calling binarySearchWithRandomAccessIterator...\n";
    return binarySearchRandomAccessIterator (container, value, compare, 0, container.size() - 1);
}

// Overload used if Container does not have a random-access iterator.
template <typename Container, typename T, typename Comparator = std::less<T>>
typename std::enable_if<!std::is_same<typename std::iterator_traits<typename Container::iterator>::iterator_category, std::random_access_iterator_tag>::value, typename Container::value_type>::type
        binarySearch (const Container& container, T&& value, Comparator&& compare = {}) {
    std::cout << "Calling binarySearchNonRandomAccessIterator...\n";
    return binarySearchNonRandomAccessIterator (std::begin(container), std::end(container), value, compare);
}

template <typename Function, typename Comparator>
struct FunctionAndComparator {
    Function function;
    Comparator comparator;
    FunctionAndComparator (Function&& f, Comparator&& c) : function(std::forward<Function>(f)), comparator(std::forward<Comparator>(c)) {}
};

template <typename Function, typename Comparator = std::less<>>
FunctionAndComparator<std::decay_t<Function>, std::decay_t<Comparator>> functionAndComparator (Function&& f, Comparator&& c = {}) {
    return {std::forward<Function>(f), std::forward<Comparator>(c)};
}

#include <string>
#include <vector>
#include <list>

struct Person {
    std::string name;
    Person (const std::string& n) : name(n) {}
    std::string getName (int, char) const {return name;}  // int, char play no role in this simple example, but let's supposes that they are needed.
} *Bob = new Person("Bob"), *Frank = new Person("Frank"), *Mark = new Person("Mark"), *Tom = new Person("Tom"), *Zack = new Person("Zack");

const std::vector<Person*> peopleVector = {Bob, Frank, Mark, Tom, Zack};
const std::list<Person*> peopleList = {Bob, Frank, Mark, Tom, Zack};

int main() {
    Person* tom = binarySearch (peopleVector, "Tom", functionAndComparator([](const Person* p) {return p->getName(5,'a');}, [](const std::string& x, const std::string& y) {return x.compare(y) < 0;}));
    if (tom) std::cout << tom->getName(5,'a') << " found.\n";

    Person* bob = binarySearch (peopleVector, "Bob", functionAndComparator([](const Person* p) {return p->getName(3,'k');}));  // The default comparator, std::less<std::string>, is actually the same as the comparator used above.
    if (bob) std::cout << bob->getName(3,'k') << " found.\n";

    Person* frank = binarySearch (peopleList, "Frank", functionAndComparator([](const Person* p) {return p->getName(8,'b');}));
    if (frank) std::cout << frank->getName(8,'b') << " found.\n";

    Person* zack = binarySearch (peopleList, "Zack", functionAndComparator([](const Person* p) {return p->getName(2,'c');}));
    if (zack) std::cout << zack->getName(2,'c') << " found.\n";

    Person* mark = binarySearch (peopleList, "Mark", functionAndComparator([](const Person* p) {return p->getName(6,'d');}));
    if (mark) std::cout << mark->getName(6,'d') << " found.\n";
}

我认为

Person* person = binarySearch (people, "Tom",
  [](Person* p, int n, char c) {return p->getName(n,c);},
 [](const std::string& x, const std::string& y) {return x.compare(y) < 0;}, 5, 'a');

是一个可怕的语法。您的 binarySearch 函数负责太多事情。

但首先,出了什么问题:您出现模棱两可的错误是因为 lambda 不是 std::function。它试图从 lambda 中推断出 std::function 类型,但失败了,因为它们是不相关的类型。从其他地方推断 Args... 的能力无济于事。

您可以将 std::function 参数包装在:

template<class T>struct tag{using type=T;};
template<class Tag>using type_t=typename Tag::type;
template<class T>using identity=type_t<tag<T>>;

identity< std::function< whatever... > > 并且您的代码将开始编译(因为 Args... 是在其他地方推导出来的)。 identity<?> 阻止对该参数进行模板类型推导,因此编译器不再尝试,而是从 other 参数推导类型。

然而,这不是一个好的解决方案。

更好的解决方案是将 fc 的类型设为 FC -- 不要将它们设为 std::function完全没有。这消除了无意义的类型擦除开销,并消除了对 identity<?>

的需要

这仍然不是一个好的解决方案,因为您的模板函数会做很多事情,但很少有做得好的。相反,将您的操作分解为更简单的问题,然后将它们组合在一起:


首先,我们已经有了 std::equal_range,这将是一个比您可能编写的任何东西都更好的二进制搜索。编写一个 return 是单个元素并采用容器的函数似乎是合理的,因为使用迭代器很烦人。

为了完成这个,首先我们写一些基于范围的样板文件:

namespace adl_aux {
  using std::begin; using std::end;
  template<class R>
  auto adl_begin(R&&)->decltype(begin(std::declval<R>()));
  template<class R>
  auto adl_end(R&&)->decltype(end(std::declval<R>()));
}
template<class R>
using adl_begin = decltype(adl_aux::adl_begin(std::declval<R>));
template<class R>
using adl_end = decltype(adl_aux::adl_end(std::declval<R>));

template<class R>using iterator_t = adl_begin<R>;
template<class R>using value_t = std::remove_reference_t<decltype(*std::declval<iterator_t<R>>())>;

这使我们能够支持 std:: 容器和数组以及第 3 方可迭代容器和范围。 adl_ 的东西为我们做 beginend 的参数依赖查找。 iterator_tvalue_t 对范围的值和迭代器类型进行 SFINAE 友好确定。

现在,bin_search 在该样板之上:

template<class R, class T, class F=std::less<T>>
value_t<R>* bin_search( R&& r, T&& t, F&& f={} ) {
  using std::begin; using std::end;
  auto range = std::equal_range( begin(r), end(r), std::forward<T>(t), std::forward<F>(f) );
  if (range.first==range.second) return nullptr;
  return std::addressof( *range.first ); // in case someone overloaded `&`
}

其中 return 是一个指向元素 t 的指针 f 假设 R 在其下排序,否则 nullptr .

下一部分是您的订购混乱:

[](Person* p, int n, char c) {return p->getName(n,c);},
[](const std::string& x, const std::string& y) {return x.compare(y) < 0;}, 5, 'a'

首先,去掉那个 args...:

[](int n, char c){
  return [n,c](Person* p) {return p->getName(n,c);}
}(5,'a'),
[](const std::string& x, const std::string& y) {return x.compare(y) < 0;}

如果确实需要在一条线上完成,直接绑定即可。

接下来,我们要order_by:

template<class F, class C>
struct order_by_t : private F, private C {
  F const& f() const { return *this; }
  C const& c() const { return *this; }
  template<class T>
  auto f(T&&t)const
  ->decltype( std::declval<F const&>()(std::declval<T>()) )
  {
    return f()(std::forward<T>(t));
  }
  template<class T, class... Unused> // Unused to force lower priority
  auto f(T&&t, Unused&&... ) const
  -> std::decay_t<T>
  { return std::forward<T>(t); }
  template<class Lhs, class Rhs>
  bool operator()(Lhs&& lhs, Rhs&& rhs) const {
    return c()( f(std::forward<Lhs>(lhs)), f(std::forward<Rhs>(rhs)) );
  }
  template<class F0, class C0>
  order_by_t( F0&& f_, C0&& c_ ):
    F(std::forward<F0>(f_)), C(std::forward<C0>(c_))
  {}
};
template<class C=std::less<>, class F>
auto order_by( F&& f, C&& c={} )
-> order_by_t<std::decay_t<F>, std::decay_t<C>>
{ return {std::forward<F>(f), std::forward<C>(c)}; }

order_by 采用从域到范围的投影,以及可选的对该范围的排序,并在域上生成排序。

order_by(
  [](int n, char c){
    return [n,c](Person const* p)
    ->decltype(p->getName(n,c)) // SFINAE enabled
    {return p->getName(n,c);};
  }(5,'a'),
  [](const std::string& x, const std::string& y) {return x.compare(y) < 0;}
}

现在是根据您的要求对 Person const* 进行的订购。

然后我们将其输入 bin_search:

auto ordering = order_by(
  [](int n, char c){
    return [n,c](Person const* p)
    ->decltype(p->getName(n,c)) // SFINAE enabled
    {return p->getName(n,c);}
  }(5,'a'),
  [](const std::string& x, const std::string& y) {return x.compare(y) < 0;}
);
Person*const* p = bin_search( people, "Tom", ordering );

现在,必须注意使 order_by 成为 "transparent" 函数对象,它接受可以投影(在投影下)和不能投影(通过直接到比较器)。

这要求投影操作对 SFINAE 友好(即 "fail early")。为此,我明确地确定了它的 return 类型。 (下面我们看到这不是必需的,但可能在更复杂的情况下)。

Live example.

有趣的是,您的 [](const std::string& x, const std::string& y) {return x.compare(y) < 0;}std::string 上与 operator< 一致,因此您可以放弃它(并使 order_by 更简单)。但是,我怀疑您的实际用例需要它,这是一个有用的功能来强化 order_by with.

最后,注意这部分:

  [](int n, char c){
    return [n,c](Person const* p)
    ->decltype(p->getName(n,c)) // SFINAE enabled
    {return p->getName(n,c);}
  }(5,'a'),

很丑,可以换成:

  [](Person const* p)
    ->decltype(p->getName(5,'a')) // SFINAE enabled
    {return p->getName(5,'a');}

哪个没那么难看。另外,因为 lambda 的参数检查已经足够了,我们可以删除 SFINAE explicit return 类型的东西:

  [](Person const* p)
    {return p->getName(5,'a');}

我们完成了。 Simpler example:

auto ordering = order_by(
   [](Person const* p)
     {return p->getName(5,'a');}
);
Person*const* p = bin_search( people, "Tom", ordering );

甚至:

Person*const* p = bin_search( people, "Tom",
  order_by( [](Person const* p) {return p->getName(5,'a');} )
);

看起来没那么难看,不是吗?

哦,还有:

using std::literals;
Person*const* p = bin_search( people, "Tom"s,
  order_by( [](Person const* p) {return p->getName(5,'a');} )
);

可能有更好的性能,因为它会避免在每次比较时重复构建 std::string("Tom")。同样,return 和 std::string const&(如果可能)的 getName 也可以提高性能。 "projection lambda" 可能必须有一个 ->decltype(auto) 才能实现第二次提升。

我在上面使用了一些C++14。 std::remove_reference_t<?>(和类似的)别名可以替换为 typename std::remove_reference<?>::type,或者您可以编写自己的 _t 别名。使用 decltype(auto) 的建议在 C++11 中可以替换为 decltype(the return expression)

order_by_t使用继承来存储FC,因为它们很可能为空类,所以我想利用空基优化