CUDA thrust::device_vector: get a pointer to a specific range
I have a vector of vectors:
thrust::device_vector<float> weights_;
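It is a single contiguous block of memory in which every w items represent one vector.
In one of my functions I pass the beginning and end of the range as parameters, like this: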
__host__ ann::d_vector ann::prop_layer (
unsigned int weights_begin,
unsigned int weights_end,
ann::d_vector & input
) const
I then copy that range into a new vector and get a raw pointer that I can use in a kernel:
thrust::device_vector<float> weights ( weights_.begin() + weights_begin,
weights_.begin() + weights_end );
float * weight_ptr = thrust::raw_pointer_cast( weights.data() );
some_kernel<<<numBlocks,numThreads>>>( weight_ptr, weights.size() );
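- Can I get a pointer from that range, without first copying it to a new vector? That seems like a waste of copy-realloc to me.
- In case I can't get a pointer from that range, can I at least assign a vector to that range, without copying the actual values?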
Can I get a pointer from that range, without first copying it to a new vector? That seems like a waste of copy-realloc to me.
Yes, you can get a pointer to that range:
float * weight_ptr = thrust::raw_pointer_cast( weights_.data() ) + weights_begin;
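To make the pointer arithmetic concrete, here is a minimal sketch of launching a kernel on a sub-range without any copy. The kernel `scale_kernel`, the helper `scale_subrange`, the vector size, and the range bounds are all hypothetical stand-ins, not part of the original code; the element count passed to the kernel is assumed to be `weights_end - weights_begin`.

```cuda
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <cstdio>

// Hypothetical kernel: doubles each element of the range it is given.
__global__ void scale_kernel(float* data, unsigned int n)
{
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Hypothetical helper: runs the kernel on [weights_begin, weights_end)
// of a stand-in weights_ vector and returns the whole vector on the host.
thrust::host_vector<float> scale_subrange()
{
    // Stand-in for weights_: 8 floats, all 1.0f.
    thrust::device_vector<float> weights_(8, 1.0f);

    unsigned int weights_begin = 2;  // first element of the range
    unsigned int weights_end   = 6;  // one past the last element
    unsigned int count = weights_end - weights_begin;

    // Pointer into the existing storage; nothing is copied or reallocated.
    float* weight_ptr =
        thrust::raw_pointer_cast(weights_.data()) + weights_begin;

    unsigned int numThreads = 128;
    unsigned int numBlocks  = (count + numThreads - 1) / numThreads;
    scale_kernel<<<numBlocks, numThreads>>>(weight_ptr, count);
    cudaDeviceSynchronize();

    // Copy back to the host to inspect the result.
    thrust::host_vector<float> h = weights_;
    return h;
}

int main()
{
    thrust::host_vector<float> h = scale_subrange();
    // Elements 2..5 are doubled; the rest are untouched.
    for (size_t i = 0; i < h.size(); ++i) printf("%g ", h[i]);
    printf("\n");
    return 0;
}
```

One caveat for the original function: `prop_layer` is declared `const`, so if `weights_` is a member, `weights_.data()` would yield a pointer to const there, and the kernel parameter would likely need to be `const float*`.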
In case I can't get a pointer from that range, can I at least assign a vector to that range, without copying the actual values?
No, a Thrust vector cannot be instantiated "on top of" existing data.
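If the goal is only to run Thrust algorithms on the sub-range, a second vector may not be needed at all. The sketch below shows two options under stated assumptions: using iterators into `weights_` directly, and wrapping a raw device pointer with `thrust::device_ptr`. The helper name `sum_after_fill`, the sizes, and the values are hypothetical and chosen only for illustration.

```cuda
#include <thrust/device_vector.h>
#include <thrust/device_ptr.h>
#include <thrust/fill.h>
#include <thrust/reduce.h>
#include <cstdio>

// Hypothetical helper: operates on a sub-range of a stand-in weights_
// vector in place and returns the sum of that sub-range.
float sum_after_fill()
{
    // Stand-in for weights_: 8 floats, all 1.0f.
    thrust::device_vector<float> weights_(8, 1.0f);
    unsigned int weights_begin = 2;  // first element of the range
    unsigned int weights_end   = 6;  // one past the last element

    // Option 1: iterators into weights_ already describe the range.
    thrust::fill(weights_.begin() + weights_begin,
                 weights_.begin() + weights_end,
                 3.0f);

    // Option 2: wrap a raw device pointer so Thrust treats it as
    // device memory; the pointers refer to the same storage, no copy.
    float* raw = thrust::raw_pointer_cast(weights_.data());
    thrust::device_ptr<float> first =
        thrust::device_pointer_cast(raw + weights_begin);
    thrust::device_ptr<float> last =
        thrust::device_pointer_cast(raw + weights_end);

    // Sum of the four elements that were set to 3.0f.
    return thrust::reduce(first, last);
}

int main()
{
    printf("%g\n", sum_after_fill());  // prints 12
    return 0;
}
```

A `thrust::device_ptr` pair behaves like an iterator range, not a container: it has no `size()` or `resize()`, and it does not own the memory, so `weights_` must outlive it.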