Intel MKL ERROR: incorrect parameter when calling gemm()

I have this code:

void my_function(double *image_vector, double *endmembers, double *abundanceVector, int it, int lines, int samples, int bands, int targets)
{
    double *h_Num;
    double *h_aux;
    double *h_Den;
    int lines_samples = lines*samples;
        
    h_Num = (double*) malloc(lines_samples * targets * sizeof(double));
    h_aux = (double*) malloc(lines_samples * bands * sizeof(double));
    h_Den = (double*) malloc(lines_samples * targets * sizeof(double));

    sycl::queue my_queue{sycl::default_selector{}};

        std::cout << "Device: "
                  << my_queue.get_device().get_info<sycl::info::device::name>()
                  << std::endl;
    
    // USM declaration
    double* image_vector_usm = sycl::malloc_shared<double>(lines_samples*bands, my_queue);
    double* endmembers_usm = sycl::malloc_shared<double>(targets*bands, my_queue);
    double* abundanceVector_usm = sycl::malloc_shared<double>(lines_samples*targets, my_queue); 
    double* h_Num_usm = sycl::malloc_shared<double>(lines_samples*targets, my_queue);
    double* h_aux_usm = sycl::malloc_shared<double>(lines_samples*bands, my_queue);
    double* h_Den_usm = sycl::malloc_shared<double>(lines_samples*targets, my_queue);
    auto nonTrans = oneapi::mkl::transpose::nontrans;
    auto yesTrans = oneapi::mkl::transpose::trans;
    
    int i,j;
    
    // We copy the parameters values into the USM variables // Maybe the mistake is here?
    std::memcpy(image_vector_usm, image_vector,sizeof(double) * lines_samples*bands);
    std::memcpy(endmembers_usm, endmembers,sizeof(double) * targets*bands);
    
    // Initialization
    for(i=0; i<lines_samples*targets; i++)
        abundanceVector_usm[i]=1;

    double alpha = 1.0;
    double beta = 0.0;

    // Start of callings to dgemm()

      oneapi::mkl::blas::row_major::gemm(my_queue, nonTrans, yesTrans, lines_samples, targets, bands, alpha, image_vector_usm,lines_samples, endmembers_usm, targets, beta, h_Num_usm, lines_samples);

    my_queue.wait_and_throw();

    for(i=0; i<it; i++)
    { 
        oneapi::mkl::blas::row_major::gemm(my_queue, nonTrans, nonTrans, lines_samples, targets, bands, alpha, abundanceVector_usm, lines_samples, endmembers_usm, targets, beta, h_aux_usm, lines_samples);
        
        my_queue.wait_and_throw();

        oneapi::mkl::blas::row_major::gemm(my_queue, nonTrans, yesTrans, lines_samples, targets, bands, alpha,h_aux_usm, lines_samples, endmembers_usm, targets, beta, h_Den_usm, lines_samples);

        my_queue.wait_and_throw();

        my_queue.parallel_for(sycl::range<1> (lines_samples*targets), [=] (sycl::id<1> j){
            abundanceVector_usm[j] = abundanceVector_usm[j]*(h_Num_usm[j]/h_Den_usm[j]);
        }).wait();
    }

    free(h_Den);
    free(h_Num);
    free(h_aux);
    
    // Free SYCL
    free(image_vector_usm, my_queue);
    free(endmembers_usm, my_queue);
    free(abundanceVector_usm, my_queue);
    free(h_Num_usm, my_queue);
    free(h_aux_usm, my_queue);
    free(h_Den_usm, my_queue);
}

This is the makefile. I borrowed it from the default oneMKL example called "matrix_mul_mkl" and adapted it to my file names. The file is called GNUmakefile:

# Makefile for GNU Make

default: run

all: run

run: my_code

MKL_COPTS = -DMKL_ILP64  -I"${MKLROOT}/include"
MKL_LIBS = -L${MKLROOT}/lib/intel64 -lmkl_sycl -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lsycl -lOpenCL -lpthread -lm -ldl

DPCPP_OPTS = $(MKL_COPTS) -fsycl-device-code-split=per_kernel $(MKL_LIBS)

my_code: my_code.cpp RS_algorithm.cpp # This RS file is also needed to compile, nothing strange there I believe, completely sequential and just calls the function in my_code.
    dpcpp $^ -o $@ $(DPCPP_OPTS)


clean:
    -rm -f my_code

.PHONY: clean run all

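I know the ILP64 and LP64 libraries can sometimes cause problems, but the matrix_mul example mentioned above works with these settings, so shouldn't they be correct here too?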

This is what the execution returns:

Device: Intel whatever model...
Intel MKL ERROR: Parameter 11 was incorrect on entry to cblas_dgemm.
Segmentation fault.

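I put some print statements below the gemm() calls and ran some tests; the first call seems to execute, but the second one does not.

I have tried and checked everything I can think of. What is wrong?

Thanks in advance!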

By default, most compilers treat integers ('int' in C or C++ / 'INTEGER' in Fortran) as 32 bits long, so most applications need to link against the LP64 MKL libraries. (https://www.intel.com/content/www/us/en/develop/documentation/onemkl-linux-developer-guide/top/linking-your-application-with-onemkl/linking-in-detail/linking-with-interface-libraries/using-the-ilp64-interface-vs-lp64-interface.html)

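So try linking against the LP64 interface and see whether that works. I also suggest setting MKL_VERBOSE=1 (https://www.intel.com/content/www/us/en/develop/documentation/onemkl-linux-developer-guide/top/managing-output/using-onemkl-verbose-mode.html) and then running your code, so that you can see the parameters actually passed to the function named in your error message.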
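To illustrate the integer-width mismatch described above, here is a minimal sketch that does not use MKL at all. The aliases lp64_int and ilp64_int and the helper read_first_as_ilp64 are made up for this example; they only stand in for what MKL_INT means under each interface. Scalars passed by value are simply converted by the compiler, so the mismatch mostly bites when an integer array is passed by pointer to a routine built for the other width:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for MKL_INT under the two interfaces.
using lp64_int  = std::int32_t;  // LP64: 32-bit integers
using ilp64_int = std::int64_t;  // ILP64 (-DMKL_ILP64): 64-bit integers

// Reads the first element of an integer array the way a routine expecting
// 64-bit integers would, although the caller filled it with 32-bit values.
// The array must hold at least two elements, since 8 bytes are consumed.
inline ilp64_int read_first_as_ilp64(const lp64_int* data) {
    assert(data != nullptr);
    ilp64_int value = 0;
    std::memcpy(&value, data, sizeof(value));  // swallows two 32-bit elements
    return value;
}
```

Called on an array such as {3, 7}, the helper does not return 3, because the two 32-bit elements are fused into one 64-bit value.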

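You can also refer to the examples that ship with oneMKL. There is a similar example in the mkl directory on your system, under \oneAPI\mkl22.0.2\examples\examples_dpcpp\dpcpp\blas\source; usm_gemm.cpp is the file that I think should help you.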

I found the solution. I was using the row_major version of the gemm call, but for this code I had to call the column_major version. Be careful!
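For anyone hitting the same error, here is a rough sketch of why the layout matters. In cblas_dgemm the arguments are numbered layout, transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc, so "Parameter 11" is ldb. The minimum legal leading dimension is the number of rows of the stored matrix in column-major layout and the number of columns in row-major layout. The helper below is hypothetical (it is not part of oneMKL) and only mirrors that documented rule, so treat it as an approximation of the library's real argument checks:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

enum class Layout { RowMajor, ColMajor };
enum class Op { NoTrans, Trans };

// Smallest legal leading dimension for a matrix stored as rows x cols.
// Column-major storage strides over rows, row-major storage over columns.
inline std::int64_t min_ld(Layout layout, std::int64_t rows, std::int64_t cols) {
    return std::max<std::int64_t>(1, layout == Layout::ColMajor ? rows : cols);
}

// Hypothetical helper: returns 0 when lda, ldb and ldc are all large enough,
// otherwise the cblas_dgemm position of the first offender
// (9 = lda, 11 = ldb, 14 = ldc).
inline int first_bad_ld(Layout layout, Op ta, Op tb,
                        std::int64_t m, std::int64_t n, std::int64_t k,
                        std::int64_t lda, std::int64_t ldb, std::int64_t ldc) {
    assert(m >= 0 && n >= 0 && k >= 0);
    // A is stored as m x k (NoTrans) or k x m (Trans).
    const std::int64_t a_rows = (ta == Op::NoTrans) ? m : k;
    const std::int64_t a_cols = (ta == Op::NoTrans) ? k : m;
    // B is stored as k x n (NoTrans) or n x k (Trans).
    const std::int64_t b_rows = (tb == Op::NoTrans) ? k : n;
    const std::int64_t b_cols = (tb == Op::NoTrans) ? n : k;
    if (lda < min_ld(layout, a_rows, a_cols)) return 9;
    if (ldb < min_ld(layout, b_rows, b_cols)) return 11;
    if (ldc < min_ld(layout, m, n)) return 14;  // C is always m x n
    return 0;
}
```

With made-up sizes lines_samples = 10000, targets = 5 and bands = 188, the arguments of the first gemm call (nontrans, trans, m = 10000, n = 5, k = 188, lda = 10000, ldb = 5, ldc = 10000) give 11 for Layout::RowMajor and 0 for Layout::ColMajor. That would be consistent with the "Parameter 11" message and with the column_major call accepting the same leading dimensions.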