Removing items during a large array transform in CUDA
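Given a large array A, its values are transformed into an array B, so B = Transform(A). A and B are different types, Transform() is fairly expensive, and B's elements are larger than A's. The results also need to be filtered by a predicate Keep(B).
Is there a decent way to do this without first writing out the whole B array and then pruning it down to the entries to keep?
I started by trying this: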
typedef int A;
struct B { int a, b, c; };
struct FTransform : thrust::unary_function<A, B>
{
__device__ B operator()(A a) const { return B{ a, a, a }; }
};
struct FKeep : thrust::unary_function<B, bool>
{
__device__ bool operator()(B b) const { return (b.a & 1) == 0; }
};
thrust::device_vector<B> outputs(8);
thrust::device_vector<A> inputs(8);
std::generate(inputs.begin(), inputs.end(), rand);
auto first = thrust::make_transform_iterator(inputs.begin(), FTransform());
auto last = thrust::make_transform_iterator(inputs.end(), FTransform());
auto end = thrust::copy_if(first, last, outputs, FKeep());
But this produces compile errors (CUDA 9.2):
thrust/iterator/iterator_traits.h(49): error : class "thrust::device_vector<B, thrust::device_malloc_allocator<B>>" has no member "iterator_category"
thrust/detail/copy_if.inl(78): error : incomplete type is not allowed
thrust/detail/copy_if.inl(80): error : no instance of overloaded function "select_system" matches the argument list
thrust/detail/copy_if.inl(80): error : no instance of overloaded function "thrust::copy_if" matches the argument list
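Here:
auto end = thrust::copy_if(first, last, outputs, FKeep());
                                        ^^^^^^^
outputs is not an iterator. thrust::copy_if expects an output iterator as its third argument, which is why the first error complains that device_vector has no iterator_category member. You should pass outputs.begin() there.
With that change, your code compiles for me.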
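For reference, here is a minimal sketch of the corrected approach, assuming a reasonably recent Thrust and a C++11 (or later) compiler. The helper name transform_and_keep is hypothetical and not part of Thrust. The functors mirror the ones in the question, with an explicit result_type typedef added in case the Thrust version in use relies on it to deduce the transform iterator's value type. The output is sized for the worst case (every element kept) and then trimmed using the iterator that copy_if returns.

```cuda
#include <thrust/device_vector.h>
#include <thrust/copy.h>
#include <thrust/iterator/transform_iterator.h>

typedef int A;
struct B { int a, b, c; };

// Same transform as in the question. The result_type typedef is an
// assumption for Thrust versions that look for it on the functor.
struct FTransform
{
    typedef B result_type;
    __host__ __device__ B operator()(A a) const { return B{ a, a, a }; }
};

// Same predicate as in the question: keep entries whose 'a' field is even.
struct FKeep
{
    __host__ __device__ bool operator()(B b) const { return (b.a & 1) == 0; }
};

// Hypothetical helper: transforms and filters in a single copy_if call,
// so the full B array is never written out before filtering.
thrust::device_vector<B> transform_and_keep(const thrust::device_vector<A>& inputs)
{
    // Worst case: every transformed element passes the predicate.
    thrust::device_vector<B> outputs(inputs.size());

    auto first = thrust::make_transform_iterator(inputs.begin(), FTransform());
    auto last  = thrust::make_transform_iterator(inputs.end(),   FTransform());

    // The third argument must be an iterator, hence outputs.begin().
    auto end = thrust::copy_if(first, last, outputs.begin(), FKeep());

    // Drop the unused tail so that size() equals the number of kept items.
    outputs.erase(end, outputs.end());
    return outputs;
}
```

Calling transform_and_keep on a device_vector<A> returns only the kept B values, in their original relative order, since thrust::copy_if is stable. Whether Transform() ends up being evaluated exactly once per element depends on how the Thrust backend implements copy_if, so with an expensive transform it is worth profiling before relying on this.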