Accelerate: Finding pointer address for vDSP_maxviD
I am working on a machine learning project in Objective-C, using the following code:
unsigned cStride = multiArray.strides[2].intValue;
unsigned hStride = multiArray.strides[3].intValue;
unsigned wStride = multiArray.strides[4].intValue;

for (unsigned h = 0; h < height; h++) {
    for (unsigned w = 0; w < width; w++) {
        unsigned highestClass = 0;
        double highest = -DBL_MAX;
        // find the channel with the highest score at position (h, w)
        for (unsigned c = 0; c < channels; c++) {
            unsigned offset = c * cStride + h * hStride + w * wStride;
            double score = pointer[offset];
            if (score > highest) {
                highestClass = c;
                highest = score;
            }
        }
        // further code
    }
}
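So far so good; this works fine. But I came across Accelerate, and I would like to use it for this class because it is probably faster than the current implementation.
I decided to use vDSP_maxviD: https://developer.apple.com/documentation/accelerate/1449682-vdsp_maxvid?language=objc
The documentation describes its parameters like this: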
__A Single-precision real input vector.
__I Stride for A
__C Output scalar
__IC Output scalar index
__N The number of elements to process
And their underlying Accelerate code:
*C = -INFINITY;
for (n = 0; n < N; ++n)
{
    if (*C < A[n * I])
    {
        *C = A[n * I];
        *IC = n * I;
    }
}
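To make the role of each parameter concrete, here is a minimal portable C sketch of that reference loop. The function name ref_maxvi is hypothetical, the types are plain C stand-ins for the vDSP ones, and this is not the Accelerate implementation itself, only a transcription of the documented loop. The detail worth noticing is that the value written to the index output is n * I (an element offset), not n.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical stand-in that transcribes the documented reference loop.
   It is NOT the real vDSP_maxviD; names and types are assumptions. */
static void ref_maxvi(const double *a, ptrdiff_t stride,
                      double *c, size_t *ic, size_t n)
{
    assert(stride > 0);   /* this sketch only handles positive strides */
    *c = -INFINITY;
    *ic = 0;              /* added so the output is defined when n == 0 */
    for (size_t k = 0; k < n; ++k) {
        double v = a[k * (size_t)stride];
        if (*c < v) {
            *c = v;
            *ic = k * (size_t)stride;   /* element offset, not element count */
        }
    }
}
```

With a stride of 2 over {1.0, 9.0, 3.0, 9.5, 2.0, 0.0}, only the elements at offsets 0, 2 and 4 are examined, so the reported index is 2 and the position within the strided sequence is 2 / 2 = 1.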
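Now I started thinking about what the parameters need to become in this case:
__A becomes pointer
__C becomes highest
__IC becomes highestClass
__N becomes channels
However, I left __I blank, because I do not know how to find the correct pointer address. The code I used before:
unsigned offset = c * cStride + h * hStride + w * wStride;
However, we are now missing c, which is replaced by n inside the Accelerate framework.
What is the correct calculation for my pointer index?
Edit: removed the unnecessary loop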
I would try something like this (untested):
vDSP_Stride cStride = multiArray.strides[2].integerValue;
vDSP_Stride hStride = multiArray.strides[3].integerValue;
vDSP_Stride wStride = multiArray.strides[4].integerValue;

for (vDSP_Length h = 0; h < height; h++) {
    for (vDSP_Length w = 0; w < width; w++) {
        // vDSP_maxviD writes the index through a vDSP_Length *, so use that type here
        vDSP_Length highestClass = 0;
        double highest = -DBL_MAX;
        // first channel of the (h, w) position; the call then steps by cStride
        const double *tempPointer = &pointer[h * hStride + w * wStride];
        vDSP_maxviD(tempPointer, cStride, &highest, &highestClass, channels);
        // further code
    }
}
The problem is caused by passing pointer as the first argument to vDSP_maxviD, which ignores the offset that depends on h and w.
Try:
double *channels_pointer = &pointer[h * hStride + w * wStride];
vDSP_maxviD(channels_pointer, cStride, &highest, &highestClass, channels);
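The reason this works can be seen by regrouping the original offset expression so that only the channel term varies. The symbols are the question's own variables:

```latex
\mathrm{offset}(c, h, w)
  = \underbrace{h \cdot \mathrm{hStride} + w \cdot \mathrm{wStride}}_{\text{fixed for one } (h,\, w)}
  + c \cdot \mathrm{cStride}
```

The fixed part moves into the base pointer passed as __A, cStride becomes the stride __I, and the loop variable n inside vDSP_maxviD plays the role of c.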
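Thanks to Grzegorz Owsiany, the final solution is:
double *channels_pointer = &pointer[h * hStride + w * wStride];
vDSP_maxviD(channels_pointer, cStride, &highest, &highestClass, channels);
However, I also needed to do the following, because vDSP_maxviD reports the element offset n * cStride rather than the channel number n:
highestClass /= cStride;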
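For checking the pointer arithmetic without Accelerate, the whole per-pixel argmax can be sketched in portable C. The function name argmax_channels and its parameter list are hypothetical, and the inner loop is a plain C stand-in for the vDSP_maxviD call, assuming positive element strides and at least one channel. It follows the same three steps as the final solution: take the base pointer for (h, w), walk the channels with stride cStride, then divide the resulting offset by cStride.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: for every (h, w) position, store the index of the
   channel with the highest score. Strides are in elements, as in the question. */
static void argmax_channels(const double *data,
                            size_t channels, size_t height, size_t width,
                            size_t cStride, size_t hStride, size_t wStride,
                            size_t *classes /* height * width entries */)
{
    assert(channels > 0 && cStride > 0);
    for (size_t h = 0; h < height; ++h) {
        for (size_t w = 0; w < width; ++w) {
            /* base pointer: the part of the offset that does not depend on c */
            const double *base = &data[h * hStride + w * wStride];
            double best = base[0];
            size_t bestOffset = 0;
            /* stand-in for vDSP_maxviD(base, cStride, &best, &bestOffset, channels) */
            for (size_t n = 1; n < channels; ++n) {
                if (best < base[n * cStride]) {
                    best = base[n * cStride];
                    bestOffset = n * cStride;   /* offset, as in the reference loop */
                }
            }
            /* convert the element offset back to a channel number */
            classes[h * width + w] = bestOffset / cStride;
        }
    }
}
```

For a contiguous channels x height x width buffer of shape 2 x 2 x 2, the strides would be cStride = 4, hStride = 2, wStride = 1. Because the comparison is strict, a tie keeps the lower channel, which matches the original hand-written loop.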