Casting an integer to a float in OpenCL

This is my first post on Stack Overflow, so please bear with me.

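I am currently writing an OpenCL kernel and need to use the built-in sqrt function. For it to work, however, the argument must be a float. I currently have an integer value and need to convert it to a float so that I can call sqrt(), then convert the result back to an integer so that I can store it in the "magOutput" array.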

The line below should give a better idea of what I am trying to do:

magOutput[workItemNum] = sqrt(xConv[workItemNum]*xConv[workItemNum] + yConv[workItemNum]*yConv[workItemNum]);

In case it helps to understand the intended application, here is the full code:

__kernel void matrixMultiplication(__global int* input, __global int* xConv, __global int* yConv, __global int* size, __global int* magOutput){

int workItemNum = get_global_id(0); //Work item ID
int workGroupNum = get_group_id(0); //Work group ID
int localGroupID = get_local_id(0); //Work items ID within each work group

// size refers to the total size of a matrix. So for a 3x3 size = 9
float dim = *size;
int dim1 = *size; 

int row = sqrt(dim); // only square matrices are used and as such the sqrt of size produces the row length
int current_row = workItemNum/dim; // the current row is calculated by using the current workitem number divided by the total size of the matrix

int col = sqrt(dim); // only square matrices are used and as such the sqrt of size produces the column length
int current_col = workItemNum % dim1; // the current column is calculated by using the current workitem number modulus by the total size of the matrix

// printf("dimension: %i \n",localGroupID);

// This if statement excludes all boundary pixels from the calculation as you require the neighbouring pixel cells 
// for this calculation
if (current_col == 0 || current_col == col-1 || current_row == 0 || current_row == row - 1){
    /*===============================================================================================================
    * The xConv array performs the kernel convolution of the input grey scale values with the following matrix:
    *
    *                            [-1  0 +1]
    * X - Directional Kernel  =  [-2  0 +2]
    *                            [-1  0 +1]
    * 
    * This scans across the X direction of the image and enhances all edges in the X-direction 
    * ===============================================================================================================
    */
    xConv[workItemNum] =  input[(current_col - 1)*col + current_row - 1]*-1 
             + input[(current_col)*col + current_row - 1]*0 
             + input[(current_col + 1)*col + current_row - 1]*1 
             + input[(current_col - 1)*col + current_row]*-2 
             + input[(current_col)*col + current_row]*0 
             + input[(current_col + 1)*col + current_row]*2 
             + input[(current_col - 1)*col + current_row + 1]*-1 
             + input[(current_col)*col + current_row + 1]*0 
             + input[(current_col + 1)*col + current_row + 1]*1;

    /*===============================================================================================================
    * The yConv array performs the kernel convolution of the input grey scale values with the following matrix:
    *
    *                            [+1 +2 +1]
    * Y - Directional Kernel  =  [ 0  0  0]
    *                            [-1 -2 -1]
    * 
    * This scans across the Y direction of the image and enhances all edges in the Y-direction 
    * ===============================================================================================================
    */
    yConv[workItemNum] = input[(current_col - 1)*col + current_row - 1]*-1 
             + input[(current_col)*col + current_row - 1]*-2 
             + input[(current_col + 1)*col + current_row - 1]*-1 
             + input[(current_col - 1)*col + current_row]*0 
             + input[(current_col)*col + current_row]*0 
             + input[(current_col + 1)*col + current_row]*0 
             + input[(current_col - 1)*col + current_row + 1]*1 
             + input[(current_col)*col + current_row + 1]*2 
             + input[(current_col + 1)*col + current_row + 1]*1;
}

//===============================================================================================================
// Calculates the convolution matrix of the X and Y arrays. Does so by squaring each item of the X and Y arrays,  
// adding them and taking the square root. This is the basic magnitude formula. This is done by each work item.
//===============================================================================================================
magOutput[workItemNum] = sqrt(xConv[workItemNum]*xConv[workItemNum] + yConv[workItemNum]*yConv[workItemNum]);
}

Any suggestions?

Welcome to Stack Overflow!

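So you basically have a 2D vector of int data type and want to compute its length. Most of OpenCL C is just standard C99 code/syntax, so the most straightforward way is to use standard C-style casts: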

magOutput[workItemNum] = (int)(sqrt((float)xConv[workItemNum]*(float)xConv[workItemNum] + (float)yConv[workItemNum]*(float)yConv[workItemNum])+0.5f);

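The +0.5f is there for correct rounding: converting a float to an int always truncates toward zero, so (int)3.9f becomes 3. Adding 0.5f just before the cast makes the result round to the nearest integer, which works here because a square root is never negative. Note that I cast to float first and square afterwards; otherwise the multiplication could overflow the integer range.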
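
To make the rounding and overflow points concrete, here is a minimal sketch in plain C (the cast rules are the same ones OpenCL C inherits from C99). The helper names round_nonneg and square_as_float are made up for this illustration and are not part of OpenCL:

```c
/* Hypothetical helper: round a non-negative float to the nearest int by
   adding 0.5f before the truncating cast. Not valid for negative inputs. */
static int round_nonneg(float x) {
    return (int)(x + 0.5f);
}

/* Hypothetical helper: cast to float first, then square. Squaring the int
   directly (v * v) would overflow a 32-bit int for values above 46340. */
static float square_as_float(int v) {
    return (float)v * (float)v;
}
```

With these, (int)3.9f is 3 while round_nonneg(3.9f) is 4, and square_as_float(50000) gives 2.5e9f, a value that does not fit in a 32-bit int.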


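A possibly faster approach is the following: I load the values xConv[workItemNum]/yConv[workItemNum] from global memory only once (global memory access is really slow), cast them to float, and store them in private memory (registers) as xConvf/yConvf. Then I do the length calculation and the rounding, and write the result back to magOutput[workItemNum] in slow global memory.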

const float xConvf = (float)xConv[workItemNum], yConvf = (float)yConv[workItemNum];
magOutput[workItemNum] = (int)(sqrt(xConvf*xConvf + yConvf*yConvf)+0.5f);

If you really like the built-in math functions of OpenCL C, you can also do it like this (it should be just as fast as the second approach):

magOutput[workItemNum] = (int)(length((float2)((float)xConv[workItemNum], (float)yConv[workItemNum]))+0.5f);

So there are 2 main takeaways for you:

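  • Type casting in OpenCL works just like in C: (float)x.
  • Reading from and writing to global memory (your kernel parameter arrays) is really slow. Load each required value only once into a local variable (these live in private memory space) so that the numbers are in registers, do the arithmetic in registers, and then write back to global memory once.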

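Although this does not affect your OpenCL C code, I recommend using this lightweight OpenCL-Wrapper for C++ development. It reduces the OpenCL control logic in your CPU code to about 1/4 of its size and makes development much easier.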