使用自定义顺序遍历 boost multi_index

Question

我有一个带有多个索引的 boost multi_index 容器。如何使用迭代时指定的自定义比较来迭代元素。

例如，假定 Element::name 和 Element::index 由 multi_index 容器索引

bool myCompare(const Element &lhs, const Element &rhs)
{
    if(lhs.name == rhs.name)
    {
        return lhs.index < rhs.index;
    }
    else
    {
        return lhs.name < rhs.name;
    }
}

但是，迭代不应限于上述情况，而是允许根据类似于 SQL SELECT 排序的元素的索引成员进行任何排序。

Answer 1

这不是多索引容器的特性。

但是，您可以轻松地创建一个额外的临时索引：

std::vector<boost::reference_wrapper<Element const> > temporary(table.begin(), table.end());

现在您可以根据自定义比较对名为 ordering 的索引进行排序：

std::sort(temporary.begin(), temporary.end(), myCompare);

// iterate table in that order:
for(Element const& e: temporary)
    std::cout << e.index << "\t" << e.name << "\n";

Bonus: You easily persist that ordering in a multi_index::random_accesss index using rearrange:
table.get<external>().rearrange(temporary.begin());
(where external tags a random access index).

更多注意事项：

您的谓词没有定义适当的弱全序。看来您可能想这样做：

#include <tuple>
bool myCompare(const Element &lhs, const Element &rhs) {
    return std::tie(lhs.name, lhs.index) < std::tie(rhs.name, rhs.index);
}

演示

Live On Coliru

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/random_access_index.hpp>

struct Element {
    int index;
    std::string name;
};

#include <tuple>
bool myCompare(const Element &lhs, const Element &rhs) {
    return std::tie(lhs.name, lhs.index) < std::tie(rhs.name, rhs.index);
}

namespace bmi = boost::multi_index;

using Table = boost::multi_index_container<Element,
      bmi::indexed_by<
        bmi::random_access<bmi::tag<struct external> >
      > >;

#include <iostream>
#include <vector> // for the temporary index

int main() {
    Table table;

    // generate 30 random records...
    std::generate_n(back_inserter(table), 30,
            []{ return Element { rand()%100, std::string(1, 'a'+rand()%26) }; }
        );

    {
        // temporary index:
        std::vector<boost::reference_wrapper<Element const> > temporary(table.begin(), table.end());

        // iterate table in the order specified by myCompare:
        std::sort(temporary.begin(), temporary.end(), myCompare);

        for(Element const& e: temporary)
            std::cout << e.index << "\t" << e.name << "\n";

        // now to rearrange a random-access index on the multi-index container:
        table.get<external>().rearrange(temporary.begin());
    }
}

打印（例如）

你可以看到，在名称相同的地方，索引较低的在前。

更新到评论：

Hmm? Are you asking for magic fairy dust?

你是求魔仙尘吗？没有魔法。

DBMS 不使用任何异常的东西。他们通常使用 BTree，与您花园下面的数据结构非常相似品种 std::map，或者 bmi::ordered_[non_]unique 指数。所以在这方面 Boost MultiIndex 是（接近）你想要的，已经。

RDBMS-es 给 table 带来的主要补充是持久性。有了它，一切都变得更有用、更具可扩展性并且……更少高效的。

So, don't look at DBMS-es for the holy grail of efficiency.

内存中的键值数据存储被广泛用于提高性能是有原因的。另一个重要的注意事项是检索一个 SQL 中的有序结果集根本无法正常执行，除非索引已经存在。

现在，如果你想要最好的性能，你可能想要设计你自己的数据结构（也许在 Boost 的帮助下侵入性），但看到你提出这些问题的水平，我会在之前长时间使用 Boost Multi Index 你做。只需创建您有时可能需要组合的索引，然后编写一些 "loopy code" ¹ 根据需要组合它们。

想法

现在跳出框框思考，您可能想知道如何"easily"实现二级排序以一种 DBMS 的方式引擎可能。

就像 DBMS 引擎一样，最简单的事情就是准备好一个始终保持最新的索引： Demo

using Table = boost::multi_index_container<Element,
    bmi::indexed_by<
        bmi::sequenced<bmi::tag<struct insertion_order> >,
        // ready made index for traversal in composite order
        bmi::ordered_non_unique<bmi::tag<struct readymade>, 
            bmi::composite_key<Element,
                bmi::member<Element, std::string, &Element::name>,
                bmi::member<Element, int, &Element::index> 
            >
        >
    > >;

int main() {
    // generate 30 random records...
    Table table;
    std::generate_n(back_inserter(table), 30, []{ return Element { rand()%100, std::string(1, 'a'+rand()%26) }; });

    // effectively "zero" cost iteration there:
    for(Element const& e: table.get<readymade>())
        std::cout << e.index << "\t" << e.name << "\n";
}

请注意，您也可以在 Boost MultiIndex 中部分使用 有序复合索引：

    for(Element const& e: boost::make_iterator_range(table.get<readymade>().equal_range("y")))
        std::cout << e.index << "\t" << e.name << "\n";

打印（对于与上面相同的随机种子）：

11  y
13  y
72  y

或者，您可以定义单独的索引，并将它们与您自己的算法一起使用以实现二级排序：

using Table = boost::multi_index_container<Element,
    bmi::indexed_by<
        bmi::sequenced<bmi::tag<struct insertion_order> >,

        // separate indices that we might combine in some way later::
        bmi::ordered_non_unique<bmi::tag<struct by_index>, bmi::member<Element, int, &Element::index> >,
        bmi::ordered_non_unique<bmi::tag<struct by_name>,  bmi::member<Element, std::string, &Element::name> >
    > >;

现在合并索引成为 "engine" 的工作，或者在本例中是您的算法。这是一个想法：

template <typename Index1, typename Index2, typename Table, typename Function>
    void SelectOrderBy2(Table const& table, Function function) {

        using T = typename Table::value_type const;
        auto& idx1 = table.template get<Index1>();
        auto& idx2 = table.template get<Index2>();

        auto it = idx1.begin(), end = idx1.end();

        while (it!=end) {
            auto next = idx1.upper_bound(idx1.key_extractor()(*it));

            std::set<boost::reference_wrapper<T>, typename Table::template index<Index2>::type::value_compare>
                set(idx2.value_comp());

            while (it != next)
                set.insert(set.end(), boost::cref(*it++));

            for (auto& e: set)
                function(e);
        }
    }

这是一些非常容易出错的模板代码，但您可以非常简单地使用它：

SelectOrderBy2<by_name, by_index>(
        table, 
        [](Element const& e) { std::cout << e.index << "\t" << e.name << "\n"; }
    );

看到它 Live On Coliru 也

¹ 也许更好地称为 generic/special-purpose 算法 ... :)

Answer 2

添加@sehe 的回答，如果你真的想快速访问任何属性组合，这可以通过一些元编程和一些组合来完成，如中所示this example. Generally speaking, if you have N attributes C(N,floor(N/2)) indices are required (C(a,b) stands for a choose b)：例如，要处理 4 个属性，您必须定义 6 个不同的索引。这仅涵盖相等查询，不包括 a<5 或 10<a<=12.

等范围

使用自定义顺序遍历 boost multi_index

Iterate through boost multi_index with custom ordering

c++

containers

boost

boost-multi-index

演示

想法