Intel MKL ERROR: incorrect parameter when calling gemm()
I have this code:
void my_function(double *image_vector, double *endmembers, double *abundanceVector, int it, int lines, int samples, int bands, int targets)
{
double *h_Num;
double *h_aux;
double *h_Den;
int lines_samples = lines*samples;
h_Num = (double*) malloc(lines_samples * targets * sizeof(double));
h_aux = (double*) malloc(lines_samples * bands * sizeof(double));
h_Den = (double*) malloc(lines_samples * targets * sizeof(double));
sycl::queue my_queue{sycl::default_selector{}};
std::cout << "Device: "
<< my_queue.get_device().get_info<sycl::info::device::name>()
<< std::endl;
// USM declaration
double* image_vector_usm = sycl::malloc_shared<double>(lines_samples*bands, my_queue);
double* endmembers_usm = sycl::malloc_shared<double>(targets*bands, my_queue);
double* abundanceVector_usm = sycl::malloc_shared<double>(lines_samples*targets, my_queue);
double* h_Num_usm = sycl::malloc_shared<double>(lines_samples*targets, my_queue);
double* h_aux_usm = sycl::malloc_shared<double>(lines_samples*bands, my_queue);
double* h_Den_usm = sycl::malloc_shared<double>(lines_samples*targets, my_queue);
auto nonTrans = oneapi::mkl::transpose::nontrans;
auto yesTrans = oneapi::mkl::transpose::trans;
int i,j;
// We copy the parameters values into the USM variables // Maybe the mistake is here?
std::memcpy(image_vector_usm, image_vector,sizeof(double) * lines_samples*bands);
std::memcpy(endmembers_usm, endmembers,sizeof(double) * targets*bands);
// Initialization
for(i=0; i<lines_samples*targets; i++)
abundanceVector_usm[i]=1;
double alpha = 1.0;
double beta = 0.0;
// Start of callings to dgemm()
oneapi::mkl::blas::row_major::gemm(my_queue, nonTrans, yesTrans, lines_samples, targets, bands, alpha, image_vector_usm,lines_samples, endmembers_usm, targets, beta, h_Num_usm, lines_samples);
my_queue.wait_and_throw();
for(i=0; i<it; i++)
{
oneapi::mkl::blas::row_major::gemm(my_queue, nonTrans, nonTrans, lines_samples, targets, bands, alpha, abundanceVector_usm, lines_samples, endmembers_usm, targets, beta, h_aux_usm, lines_samples);
my_queue.wait_and_throw();
oneapi::mkl::blas::row_major::gemm(my_queue, nonTrans, yesTrans, lines_samples, targets, bands, alpha,h_aux_usm, lines_samples, endmembers_usm, targets, beta, h_Den_usm, lines_samples);
my_queue.wait_and_throw();
my_queue.parallel_for(sycl::range<1> (lines_samples*targets), [=] (sycl::id<1> j){
abundanceVector_usm[j] = abundanceVector_usm[j]*(h_Num_usm[j]/h_Den_usm[j]);
}).wait();
}
free(h_Den);
free(h_Num);
free(h_aux);
// Free SYCL
free(image_vector_usm, my_queue);
free(endmembers_usm, my_queue);
free(abundanceVector_usm, my_queue);
free(h_Num_usm, my_queue);
free(h_aux_usm, my_queue);
free(h_Den_usm, my_queue);
}
This is the makefile. I borrowed it from the default oneMKL example named "matrix_mul_mkl" and adapted it to my file names. The file is called GNUmakefile:
# Makefile for GNU Make
default: run
all: run
run: my_code
MKL_COPTS = -DMKL_ILP64 -I"${MKLROOT}/include"
MKL_LIBS = -L${MKLROOT}/lib/intel64 -lmkl_sycl -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lsycl -lOpenCL -lpthread -lm -ldl
DPCPP_OPTS = $(MKL_COPTS) -fsycl-device-code-split=per_kernel $(MKL_LIBS)
my_code: my_code.cpp RS_algorithm.cpp # This RS file is also needed to compile, nothing strange there I believe, completely sequential and just calls the function in my_code.
dpcpp $^ -o $@ $(DPCPP_OPTS)
clean:
-rm -f my_code
.PHONY: clean run all
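I know the ILP64 and LP64 libraries sometimes cause problems, but the matrix_mul example mentioned above works with these settings, so they should be correct, shouldn't they?
This is what the execution returns: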
Device: Intel whatever model...
Intel MKL ERROR: Parameter 11 was incorrect on entry to cblas_dgemm.
Segmentation fault.
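I put some prints after the calls to gemm() and ran a few tests; the first call seems to execute, but the second one does not.
I have tried and checked everything I can think of. What is wrong?
Thanks in advance!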
By default, most compilers treat integers ('int' in C or C++ / 'INTEGER' in Fortran) as 32 bits long, so most applications need to link against the LP64 MKL libraries.
(https://www.intel.com/content/www/us/en/develop/documentation/onemkl-linux-developer-guide/top/linking-your-application-with-onemkl/linking-in-detail/linking-with-interface-libraries/using-the-ilp64-interface-vs-lp64-interface.html)
So try linking against the LP64 interface and see whether that works.
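To make the integer-width point concrete, here is a minimal sketch in plain C++ with no MKL dependency. The aliases `lp64_int` and `ilp64_int` and the helpers `fits_lp64` and `to_lp64` are hypothetical names used only for illustration; in real code MKL provides its own `MKL_INT` type, which is 32 bits wide under LP64 and 64 bits wide under ILP64.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-ins for MKL_INT under each interface (illustration only).
using lp64_int  = std::int32_t;  // LP64: 32-bit integer arguments
using ilp64_int = std::int64_t;  // ILP64: 64-bit integer arguments

// True if a 64-bit size or index can be passed through a 32-bit LP64 argument.
bool fits_lp64(ilp64_int v) {
    return v >= std::numeric_limits<lp64_int>::min() &&
           v <= std::numeric_limits<lp64_int>::max();
}

// Narrow a 64-bit value for an LP64 call, refusing values that would truncate.
lp64_int to_lp64(ilp64_int v) {
    assert(fits_lp64(v) && "value does not fit a 32-bit LP64 integer");
    return static_cast<lp64_int>(v);
}
```

A dimension such as `lines*samples` fits comfortably for small images, but a check like `fits_lp64` is a cheap guard if the sizes can grow.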
Also, I suggest you set MKL_VERBOSE=1
(https://www.intel.com/content/www/us/en/develop/documentation/onemkl-linux-developer-guide/top/managing-output/using-onemkl-verbose-mode.html)
and then run your code, so that you can see the parameters being passed to the function named in your error message.
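You can also refer to the examples that ship with oneMKL. There is a similar example in the mkl directory on your system, under \oneAPI\mkl22.0.2\examples\examples_dpcpp\dpcpp\blas\source, in the file usm_gemm.cpp.
I think it should be helpful to you.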
I found the solution. I was using the row_major version of the gemm call, and for this code I had to call the column_major version instead. Be careful!
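One plausible reading of why the layout matters: counting the layout argument as parameter 1, parameter 11 of cblas_dgemm is ldb, and the minimum legal leading dimensions differ between row-major and column-major storage. The sketch below encodes the standard BLAS gemm rules in plain C++ with no MKL dependency. The names `Layout` and `first_bad_ld` are hypothetical helpers, not part of oneMKL, and the sizes in the usage note are made-up values that assume targets < bands.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>

enum class Layout { RowMajor, ColMajor };  // hypothetical helper type

// Returns the name of the first leading dimension that violates the standard
// BLAS gemm rules, or an empty string if lda, ldb and ldc are all acceptable.
std::string first_bad_ld(Layout layout, bool transA, bool transB,
                         std::int64_t m, std::int64_t n, std::int64_t k,
                         std::int64_t lda, std::int64_t ldb, std::int64_t ldc) {
    assert(m >= 0 && n >= 0 && k >= 0);
    const bool col = (layout == Layout::ColMajor);
    // Stored A is m x k when not transposed and k x m when transposed.
    const std::int64_t a_rows = transA ? k : m, a_cols = transA ? m : k;
    // Stored B is k x n when not transposed and n x k when transposed.
    const std::int64_t b_rows = transB ? n : k, b_cols = transB ? k : n;
    // Column-major: the leading dimension must cover the stored rows.
    // Row-major: the leading dimension must cover the stored columns.
    const std::int64_t one = 1;
    const std::int64_t min_lda = std::max(one, col ? a_rows : a_cols);
    const std::int64_t min_ldb = std::max(one, col ? b_rows : b_cols);
    const std::int64_t min_ldc = std::max(one, col ? m : n);  // C is m x n
    if (lda < min_lda) return "lda";
    if (ldb < min_ldb) return "ldb";
    if (ldc < min_ldc) return "ldc";
    return "";
}
```

With illustrative sizes lines_samples = 100, targets = 3, bands = 10 and the arguments of the first call in the question (nontrans, trans, lda = 100, ldb = 3, ldc = 100), `first_bad_ld(Layout::RowMajor, ...)` reports "ldb", while `first_bad_ld(Layout::ColMajor, ...)` reports no violation. Passing this check only means the call is legal; whether the result is the intended product still depends on how the input arrays are actually laid out in memory.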