How to write a C++17 algorithm with a parallel execution policy

Question

I want to write a C++17 algorithm that takes a parallel execution policy, but I've run into some trouble. Let's start with the code:

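#if __has_include(<execution>)
#include <execution>
#include <thread>
#include <future>
#include <iterator>  // std::iterator_traits, std::distance
#include <vector>    // std::vector of futures
#endif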

template<class RandomAccessIterator>
inline auto mean(RandomAccessIterator first, RandomAccessIterator last)
{
    auto it = first;
    auto mu = *first;
    decltype(mu) i = 2;
    while(++it != last)
    {
        mu += (*it - mu)/i;
        i += 1;
    }
    return mu;
}


#if __has_include(<execution>)
template<class ExecutionPolicy, class RandomAccessIterator>
inline auto mean(ExecutionPolicy&& exec_pol, RandomAccessIterator first, RandomAccessIterator last) {
    using Real = typename std::iterator_traits<RandomAccessIterator>::value_type;
    //static_assert(std::is_execution_policy_v<ExecutionPolicy>, "First argument must be an execution policy.");
    if (exec_pol == std::execution::par) {
        size_t elems = std::distance(first, last);
        if (elems*sizeof(Real) < /*guestimate*/ 4096) {
            return mean(first, last);
        }

        unsigned threads = std::thread::hardware_concurrency();
        if (threads == 0) {
            threads = 2;
        }
        std::vector<std::future<Real>> futures;
        size_t elems_per_thread = elems/threads;
        auto it = first;
        for (unsigned i = 0; i < threads -1; ++i) {

            futures.push_back(std::async(std::launch::async, &mean<RandomAccessIterator>, it, it + elems_per_thread));
            it += elems_per_thread;
        }
        futures.push_back(std::async(std::launch::async, &mean<RandomAccessIterator>, it, last));

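        // Combine the per-chunk means, weighting each by its chunk size: the last
        // chunk also absorbs the remainder, so a plain average would be biased.
        Real mu = 0;
        for (unsigned i = 0; i < threads; ++i) {
            size_t n = (i + 1 < threads) ? elems_per_thread : elems - (threads - 1)*elems_per_thread;
            mu += futures[i].get()*(Real(n)/Real(elems));  // index access: futures cannot be copied
        }
        return mu;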
    }
    else { // should have else-if for various types of execution policies, but let's save that for later.
         return mean(first, last);
    }
}
#endif

OK, so here is the problem:

At first I passed the ExecutionPolicy argument by const &. The static_assert passed, but then I got a compile error on if (exec_pol == std::execution::par), namely:

 error: no match for ‘operator==’ (operand types are ‘const __pstl::execution::v1::parallel_policy’ and ‘const __pstl::execution::v1::parallel_policy’)
  117 |     if (exec_pol == std::execution::par) {
      |         ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~

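Then I looked at /usr/include/c++/9/pstl/algorithm_impl.h, where the ExecutionPolicy is passed along by moving and forwarding it everywhere, so I figured I should do the same. That didn't fix anything, so I looked at /usr/include/c++/9/pstl/parallel_backend_tbb.h. In that file they don't even check what the parallel execution policy is! For example, a few lines from that file: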

//! Evaluation of brick f[i,j) for each subrange [i,j) of [first,last)
// wrapper over tbb::parallel_for
template <class _ExecutionPolicy, class _Index, class _Fp>
void
__parallel_for(_ExecutionPolicy&&, _Index __first, _Index __last, _Fp __f)
{
    tbb::this_task_arena::isolate([=]() {
        tbb::parallel_for(tbb::blocked_range<_Index>(__first, __last), __parallel_for_body<_Index, _Fp>(__f));
    });
}

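So am I fundamentally misunderstanding how to write a parallel algorithm with C++17 parallel execution policies? If not, how do I check the execution policy and use it correctly?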

Answer 1

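You need to check the type of the policy rather than its value, perhaps with

if constexpr(std::is_same_v
               <std::decay_t<ExecutionPolicy>,  // strips the reference and the const
                std::execution::parallel_policy>)

Passing std::execution::par to a forwarding reference deduces ExecutionPolicy as const parallel_policy& (the const is visible in the error message above), so std::remove_reference_t alone would leave the const in place and the comparison would be false.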
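To make the check concrete, here is a minimal sketch that is not taken from any library: Path and select_path are hypothetical names used only for illustration. It assumes the policy arrives through a forwarding reference, as in the question, and shows the branch being picked at compile time from the decayed type.

```cpp
#include <execution>
#include <type_traits>

// Hypothetical enum, only used to report which branch was compiled in.
enum class Path { sequential, parallel };

// Hypothetical helper: picks a branch from the *type* of the policy.
template<class ExecutionPolicy>
constexpr Path select_path(ExecutionPolicy&&)
{
    // For an lvalue such as std::execution::par, ExecutionPolicy is deduced
    // as `const parallel_policy&`; std::decay_t strips both the reference
    // and the const before the comparison.
    using Policy = std::decay_t<ExecutionPolicy>;
    static_assert(std::is_execution_policy_v<Policy>,
                  "First argument must be an execution policy.");

    if constexpr (std::is_same_v<Policy, std::execution::parallel_policy>) {
        return Path::parallel;    // the chunked, multi-threaded code would go here
    } else {
        return Path::sequential;  // every other policy falls back to the serial code
    }
}
```

With std::execution::par this returns Path::parallel; seq and par_unseq end up in the else branch. With std::remove_reference_t in place of std::decay_t the is_same_v test would be false for par, because the deduced type is still const-qualified.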

Answer 2

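Take the policy by value instead of as ExecutionPolicy&& exec_pol: ExecutionPolicy exec_pol. It is a tag. Taking it by forwarding reference only confuses things.

Then test the type, or use tag dispatch: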

if constexpr(std::is_same_v<ExecutionPolicy,
                        std::execution::parallel_policy>)

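As @Davis's answer suggests.

If you don't want to take it by value (and you should take it by value), you can use std::decay_t or std::remove_cv_t< std::remove_reference_t< ExecutionPolicy > > to strip the cv/ref qualifiers that perfect forwarding leaves on the deduced type. The reference has to come off first, because std::remove_cv_t does nothing to a reference type.

But really, don't do that.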
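For what it's worth, the by-value, tag-dispatch variant could look roughly like the sketch below. The namespace sketch and the names mean_seq and mean_impl are hypothetical, and several things are assumed: floating-point value types, random-access iterators, and a fixed chunk count of 4. The chunks are processed one after another in a plain loop to keep the example short; each one could instead be handed to std::async as in the question.

```cpp
#include <cassert>
#include <execution>
#include <iterator>
#include <numeric>
#include <type_traits>
#include <vector>

namespace sketch {  // hypothetical namespace, illustration only

// Plain sequential mean of a non-empty range.
template<class It>
auto mean_seq(It first, It last)
{
    using Real = typename std::iterator_traits<It>::value_type;
    assert(first != last);
    return std::accumulate(first, last, Real(0))
           / static_cast<Real>(std::distance(first, last));
}

// Fallback: any policy without a dedicated overload runs sequentially.
template<class Policy, class It>
auto mean_impl(Policy, It first, It last)
{
    return mean_seq(first, last);
}

// Overload selected when the tag is std::execution::parallel_policy.
template<class It>
auto mean_impl(std::execution::parallel_policy, It first, It last)
{
    using Real = typename std::iterator_traits<It>::value_type;
    using Diff = typename std::iterator_traits<It>::difference_type;
    static_assert(std::is_floating_point_v<Real>, "sketch assumes floating-point values");

    const Diff n = std::distance(first, last);
    const Diff chunks = 4;  // assumed fixed chunk count
    if (n < chunks) return mean_seq(first, last);

    const Diff per_chunk = n / chunks;
    Real mu = 0;
    for (Diff c = 0; c < chunks; ++c) {
        const It chunk_first = first + c * per_chunk;
        // The last chunk also takes the remainder of the division.
        const It chunk_last = (c + 1 < chunks) ? chunk_first + per_chunk : last;
        const Diff len = std::distance(chunk_first, chunk_last);
        // Weight each chunk mean by its share of the elements.
        mu += mean_seq(chunk_first, chunk_last)
              * (static_cast<Real>(len) / static_cast<Real>(n));
    }
    return mu;
}

// Public entry point: the policy is a tag, so it is taken by value.
template<class ExecutionPolicy, class It>
auto mean(ExecutionPolicy policy, It first, It last)
{
    static_assert(std::is_execution_policy_v<ExecutionPolicy>,
                  "First argument must be an execution policy.");
    return mean_impl(policy, first, last);
}

}  // namespace sketch
```

Because the policy is taken by value, ExecutionPolicy is deduced as the bare policy type and no decay is needed. For the ten values 1 through 10, sketch::mean(std::execution::par, v.begin(), v.end()) combines the chunk means 1.5, 3.5, 5.5 and 8.5 with weights 0.2, 0.2, 0.2 and 0.4 to get 5.5 (up to rounding), whereas averaging the four chunk means directly would give 4.75.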
