How to *write* a C++17 algorithm with a parallel execution policy
I want to write a C++17 algorithm that accepts a parallel execution policy, but I've run into some trouble. Let's start with the code:
#if __has_include(<execution>)
#include <execution>
#include <thread>
#include <future>
#endif
template<class RandomAccessIterator>
inline auto mean(RandomAccessIterator first, RandomAccessIterator last)
{
    auto it = first;
    auto mu = *first;
    decltype(mu) i = 2;
    while(++it != last)
    {
        mu += (*it - mu)/i;
        i += 1;
    }
    return mu;
}
#if __has_include(<execution>)
template<class ExecutionPolicy, class RandomAccessIterator>
inline auto mean(ExecutionPolicy&& exec_pol, RandomAccessIterator first, RandomAccessIterator last) {
    using Real = typename std::iterator_traits<RandomAccessIterator>::value_type;
    //static_assert(std::is_execution_policy_v<ExecutionPolicy>, "First argument must be an execution policy.");
    if (exec_pol == std::execution::par) {
        size_t elems = std::distance(first, last);
        if (elems*sizeof(Real) < /*guestimate*/ 4096) {
            return mean(first, last);
        }
        unsigned threads = std::thread::hardware_concurrency();
        if (threads == 0) {
            threads = 2;
        }
        std::vector<std::future<Real>> futures;
        size_t elems_per_thread = elems/threads;
        auto it = first;
        for (unsigned i = 0; i < threads -1; ++i) {
            futures.push_back(std::async(std::launch::async, &mean<RandomAccessIterator>, it, it + elems_per_thread));
            it += elems_per_thread;
        }
        futures.push_back(std::async(std::launch::async, &mean<RandomAccessIterator>, it, last));
        Real mu = 0;
        for (auto fut : futures) {
            mu += fut.get();
        }
        mu /= threads;
        return mu;
    }
    else { // should have else-if for various types of execution policies, but let's save that for later.
        return mean(first, last);
    }
}
#endif
OK, so the problems:
- I first passed the ExecutionPolicy argument by const &. The static_assert passed, but then I got a compile error at if (exec_pol == std::execution::par), namely:
error: no match for ‘operator==’ (operand types are ‘const __pstl::execution::v1::parallel_policy’ and ‘const __pstl::execution::v1::parallel_policy’)
117 | if (exec_pol == std::execution::par) {
| ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
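I then looked at /usr/include/c++/9/pstl/algorithm_impl.h, where the ExecutionPolicy is moved and forwarded all over the place, so I figured that is what I should do. That didn't fix anything, so I looked at /usr/include/c++/9/pstl/parallel_backend_tbb.h. In that file, they don't even check what the parallel execution policy is! For example, a few lines from that file: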
//! Evaluation of brick f[i,j) for each subrange [i,j) of [first,last)
// wrapper over tbb::parallel_for
template <class _ExecutionPolicy, class _Index, class _Fp>
void
__parallel_for(_ExecutionPolicy&&, _Index __first, _Index __last, _Fp __f)
{
    tbb::this_task_arena::isolate([=]() {
        tbb::parallel_for(tbb::blocked_range<_Index>(__first, __last), __parallel_for_body<_Index, _Fp>(__f));
    });
}
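So am I fundamentally misunderstanding how to write a parallel algorithm with C++17 parallel execution policies? If not, how do I check the execution policy and use it properly?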
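You need to check the type of the policy, not its value. The policies are distinct tag types with no operator==, which is why the comparison fails to compile. Perhaps something like the following (std::execution::par is a constexpr object, so a forwarding reference deduces a const-qualified type, and the const has to be stripped along with the reference):
if constexpr(std::is_same_v
    <std::remove_cv_t<std::remove_reference_t<ExecutionPolicy>>,
     std::execution::parallel_policy>)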
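To make the type check concrete, here is a minimal sketch of the branch selection. The function name policy_branch and the returned strings are made up for illustration; only the type traits and the std::execution names come from the standard library. It assumes a toolchain whose <execution> header provides the policy types.

```cpp
#include <execution>
#include <string>
#include <type_traits>

// Hypothetical helper (not from the question): reports which branch the
// compiler selected for a given execution policy argument.
template <class ExecutionPolicy>
std::string policy_branch(ExecutionPolicy&&) {
    // Strip the reference and cv-qualifiers that the forwarding reference
    // folds into the deduced type before comparing.
    using Policy = std::remove_cv_t<std::remove_reference_t<ExecutionPolicy>>;
    static_assert(std::is_execution_policy_v<Policy>,
                  "First argument must be an execution policy.");

    if constexpr (std::is_same_v<Policy, std::execution::parallel_policy>) {
        return "parallel";             // a threaded implementation would go here
    } else {
        return "sequential fallback";  // every other policy runs sequentially
    }
}
```

Calling policy_branch(std::execution::par) takes the first branch, and any other standard policy takes the fallback. Because the decision is made with if constexpr, the branch that is not taken is never instantiated for that policy type.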
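Take the policy by value, ExecutionPolicy exec_pol, instead of as ExecutionPolicy&& exec_pol. It is a tag; taking it by forwarding reference just confuses things.
Then test the type, or use tag dispatch: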
if constexpr(std::is_same_v<ExecutionPolicy,
std::execution::parallel_policy>)
as @Davis's answer suggests.
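As a rough sketch of the tag-dispatch variant, the following takes the policy by value and lets overload resolution pick the implementation. The namespace sketch, the names mean_impl and value_t, and the two-chunk split are all assumptions made for illustration, not the asker's code or the library's method. It also assumes a non-empty range of random-access iterators, and it combines partial sums rather than averaging per-chunk means, since chunk means are only equal-weighted when the chunks have equal size.

```cpp
#include <execution>
#include <future>
#include <iterator>
#include <numeric>
#include <type_traits>
#include <vector>

namespace sketch {

template <class It>
using value_t = typename std::iterator_traits<It>::value_type;

// Fallback for any policy without a dedicated overload: plain sequential mean.
template <class Policy, class It>
value_t<It> mean_impl(Policy, It first, It last) {
    using Real = value_t<It>;
    return std::accumulate(first, last, Real{0}) /
           static_cast<Real>(std::distance(first, last));
}

// Selected by overload resolution when the tag is parallel_policy.
// Splits the range in two; one half is summed on another thread.
template <class It>
value_t<It> mean_impl(std::execution::parallel_policy, It first, It last) {
    using Real = value_t<It>;
    const auto n = std::distance(first, last);
    if (n < 2) {
        return mean_impl(std::execution::seq, first, last);
    }
    It mid = first + n / 2;
    std::future<Real> lower = std::async(std::launch::async, [first, mid] {
        return std::accumulate(first, mid, Real{0});
    });
    Real upper = std::accumulate(mid, last, Real{0});
    // Add the partial sums first, then divide once by the total count.
    return (lower.get() + upper) / static_cast<Real>(n);
}

// Public entry point: the policy is a tag, so it is taken by value.
template <class ExecutionPolicy, class It>
value_t<It> mean(ExecutionPolicy policy, It first, It last) {
    static_assert(std::is_execution_policy_v<ExecutionPolicy>,
                  "First argument must be an execution policy.");
    return mean_impl(policy, first, last);
}

}  // namespace sketch
```

With the policy taken by value, ExecutionPolicy deduces to the plain policy type, so no reference or const stripping is needed before the static_assert or the dispatch.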
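If you don't want to take it by value (you should take it by value), you can use std::decay_t or std::remove_cv_t<std::remove_reference_t<ExecutionPolicy>> to strip the cv/ref qualifiers that perfect forwarding bakes into the deduced type.
But really, don't do that.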
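For anyone who keeps the forwarding reference anyway, this small sketch shows what it actually deduces. The two probe functions are hypothetical and exist only to expose the deduced type. Passing std::execution::par, which is a const lvalue, makes ExecutionPolicy deduce to a const lvalue reference type, so a direct std::is_same_v comparison is false until the type is decayed.

```cpp
#include <execution>
#include <type_traits>

// Hypothetical probe: compares the deduced type, untouched, with parallel_policy.
template <class ExecutionPolicy>
constexpr bool raw_type_matches(ExecutionPolicy&&) {
    return std::is_same_v<ExecutionPolicy, std::execution::parallel_policy>;
}

// Hypothetical probe: same comparison after std::decay_t removes ref and cv.
template <class ExecutionPolicy>
constexpr bool decayed_type_matches(ExecutionPolicy&&) {
    return std::is_same_v<std::decay_t<ExecutionPolicy>,
                          std::execution::parallel_policy>;
}

// ExecutionPolicy is deduced as `const parallel_policy&` here, so the raw
// comparison fails while the decayed one succeeds.
static_assert(!raw_type_matches(std::execution::par),
              "raw deduced type carries const and a reference");
static_assert(decayed_type_matches(std::execution::par),
              "decayed type is exactly parallel_policy");
```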