NumPy array of arrays to PyOpenCL array of vecs
I have a NumPy array that contains arrays:
import numpy as np
import pyopencl as cl
someArray = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
Now I want to convert this array into an OpenCL array of vec4s so that I can process it. For example:
context = cl.create_some_context()
queue = cl.CommandQueue()
program = cl.Program("""
__kernel void multiplyByTwo(__global const float32* someArrayAsOpenCLType, __global float32* result) {
gid = get_global_id(0);
vector = someArrayAsOpenCLType[gid];
result[gid] = vector * 2;
}
""").build()
someArrayAsOpenCLType = # something with someArray
result = # some other thing
program.multiplyByTwo(queue, someArray.shape, None, someArrayAsOpenCLType, result)
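How do I convert someArray to someArrayAsOpenCLType?
The data in someArray is stored in host memory, and it must be copied into a buffer in device memory (someArrayAsOpenCLType).
The kernel executes on the device and stores its result in a device buffer (preallocated: resultAsOpenCLType).
After execution, the program can copy the result from the device buffer back to host memory (for example: cl.enqueue_copy(queue, result, resultAsOpenCLType)).
Here is a simple example (though there may be other ways to do this):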
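Before any data goes to the device, the host array has to have a layout that matches what the kernel expects. The sketch below uses only NumPy and assumes the kernel reads float4 values, that is, four packed 32-bit floats (16 bytes) per element. The names some_array and host_array are illustrative and not part of the original post.

```python
import numpy as np

# Same data as in the question; NumPy picks an integer dtype by default.
some_array = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# Convert to packed 32-bit floats so that each row occupies 16 bytes,
# which is the size of one float4 on the device.
host_array = np.ascontiguousarray(some_array, dtype=np.float32)

n_vectors, width = host_array.shape
assert width == 4, "each row must hold exactly four components"
assert host_array.flags["C_CONTIGUOUS"]
assert host_array.nbytes == n_vectors * 4 * host_array.itemsize

print(n_vectors, host_array.nbytes)
```

With this layout, the number of rows is the number of float4 elements the kernel will see, and nbytes is the size to use when allocating the output buffer.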
import numpy as np
import pyopencl as cl
# Context
ctx = cl.create_some_context()
# Create queue
queue = cl.CommandQueue(ctx)
someArray = np.array([
[1, 2, 3, 4],
[5, 6, 7, 8]
]).astype(np.float32)
print("")
print("Input:")
print(someArray)
print("------------------------------------")
# Get mem flags
mf = cl.mem_flags
# Create a read-only buffer on device and copy 'someArray' from host to device
someArrayAsOpenCLType = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=someArray)
# Create a write-only buffer to get the result from device
resultAsOpenCLType = cl.Buffer(ctx, mf.WRITE_ONLY, someArray.nbytes)
# Creates a kernel in context
program = cl.Program(ctx, """
__kernel void multiplyByTwo(__global const float4 *someArrayAsOpenCLType, __global float4 *resultAsOpenCLType) {
int gid = get_global_id(0);
float4 vector = someArrayAsOpenCLType[gid];
resultAsOpenCLType[gid] = vector * 2.0f;
}
""").build()
# Execute: one work-item per float4, so the global size is the number of rows
program.multiplyByTwo(queue, (someArray.shape[0],), None, someArrayAsOpenCLType, resultAsOpenCLType)
# Creates a buffer for the result (host memory)
result = np.empty_like(someArray)
# Copy the results from device to host
cl.enqueue_copy(queue, result, resultAsOpenCLType)
print("------------------------------------")
print("Output")
# Show the result
print(result)
After running it (choosing option 0):
Choose platform:
[0] <pyopencl.Platform 'Intel(R) OpenCL' at 0x858ea0>
[1] <pyopencl.Platform 'Experimental OpenCL 2.0 CPU Only Platform' at 0x872880>
[2] <pyopencl.Platform 'NVIDIA CUDA' at 0x894a80>
Choice [0]:
Set the environment variable PYOPENCL_CTX='' to avoid being asked again.
Input:
[[ 1. 2. 3. 4.]
[ 5. 6. 7. 8.]]
------------------------------------
C:\Python27\lib\site-packages\pyopencl\__init__.py:59: CompilerWarning: Built kernel retrieved from cache. Original from-source build had warnings:
Build on <pyopencl.Device 'Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz' on 'Intel(R) OpenCL' at 0x86ca30> succeeded, but said:
Compilation started
Compilation done
Linking started
Linking done
Device build started
Device build done
Kernel <multiplyByTwo> was not vectorized
Done.
warn(text, CompilerWarning)
C:\Python27\lib\site-packages\pyopencl\__init__.py:59: CompilerWarning: From-binary build succeeded, but resulted in non-empty logs:
Build on <pyopencl.Device 'Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz' on 'Intel(R) OpenCL' at 0x86ca30> succeeded, but said:
Device build started
Device build done
Reload Program Binary Object.
warn(text, CompilerWarning)
------------------------------------
Output
[[ 2. 4. 6. 8.]
[ 10. 12. 14. 16.]]
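To check the device output without relying on the printed values, the kernel's logic can be mirrored on the host. The following is a minimal NumPy-only sketch, not an OpenCL call: the function multiply_by_two_reference is a hypothetical helper in which each loop iteration stands in for one work-item with index gid.

```python
import numpy as np


def multiply_by_two_reference(vectors):
    """Host-side stand-in for the kernel: one loop iteration per work-item."""
    result = np.empty_like(vectors)
    for gid in range(vectors.shape[0]):
        # Each row plays the role of one float4 element.
        result[gid] = vectors[gid] * np.float32(2.0)
    return result


some_array = np.array([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=np.float32)
expected = multiply_by_two_reference(some_array)
print(expected)
```

After cl.enqueue_copy has filled result, a comparison such as np.allclose(result, expected) should confirm that the device and host agree.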
Some OpenCL tutorials on Intel's website: