Common Lisp 中的矩阵乘法
Matrix multiplication in Common Lisp
我正在用 CL(使用 SBCL 1.2.15)编写使用线性代数的程序。在执行的过程中,经常会把一个矩阵乘以一个向量。
Profiler 显示程序大部分时间 (80%) 都在做这个,矩阵乘以向量。它还显示此函数执行大量操作(80,000,000 次调用 100x100 矩阵约 100 次),这也会触发 GC。在 F# 中做了类似的事情(具有静态类型检查的优势,但通常不会优于 SBCL),F# 程序运行速度提高了 10 倍。
我做错了吗?
(defun matrix-mul (matrix vector dest)
"Multiply MATRIX by VECTOR putting the result into DEST.
Optimized for DOUBLE-FLOAT vectors and matrices"
(declare (type (array double-float (* *)) matrix)
(type (vector double-float *) vector dest)
(optimize (speed 3) (debug 0) (safety 0)))
(destructuring-bind (rows cols) (array-dimensions matrix)
(dotimes (i rows)
(setf (aref dest i)
(loop for j below cols sum (the double-float
(* (aref matrix i j) (aref vector j))))))))
PS。我也尝试过使用 DOTIMES 而不是内部循环 - 没有区别
PPS。下一个选项可以使用来自 CL 的 BLAS,但 (i) 我不期待让它在 Windows 上工作,(ii) 需要来回编组 matrices/vectors。
通常的问题是浮点运算,这里是双浮点数,(独立于周围的代码,如矩阵乘法)。
通常在这种情况下首先要处理 SBCL:
将代码放入文件并编译
考虑到编译速度,编译器会输出很多优化问题。然后你需要检查笔记,看看你能做什么。
例如这里的 LOOP
总和缺少类型信息。
实际上有一个LOOP
语法来声明和变量的类型。我不知道SBCL是否利用了这一点:
(loop repeat 10
sum 1.0d0 of-type double-float)
32 位 ARM 上的 SBCL 1.3.0 适用于您的代码:
* (compile-file "/tmp/test.lisp")
; compiling file "/tmp/test.lisp" (written 13 DEC 2015 11:34:26 AM):
; compiling (DEFUN MATRIX-MUL ...)
; file: /tmp/test.lisp
1)
; in: DEFUN MATRIX-MUL
; (SETF (AREF DEST I)
; (LOOP FOR J BELOW COLS
; SUM (THE DOUBLE-FLOAT (* # #))))
; --> LET* FUNCALL SB-C::%FUNCALL (SETF AREF)
; ==>
; (SB-KERNEL:HAIRY-DATA-VECTOR-SET ARRAY SB-INT:INDEX SB-C::NEW-VALUE)
;
; note: unable to
; avoid runtime dispatch on array element type
; due to type uncertainty:
; The first argument is a (VECTOR DOUBLE-FLOAT), not a SIMPLE-ARRAY.
2)
; (AREF MATRIX I J)
; --> LET*
; ==>
; (SB-KERNEL:HAIRY-DATA-VECTOR-REF ARRAY SB-INT:INDEX)
;
; note: unable to
; avoid runtime dispatch on array element type
; due to type uncertainty:
; The first argument is a (ARRAY DOUBLE-FLOAT (* *)), not a SIMPLE-ARRAY.
3)
; (AREF VECTOR J)
; ==>
; (SB-KERNEL:HAIRY-DATA-VECTOR-REF ARRAY SB-INT:INDEX)
;
; note: unable to
; avoid runtime dispatch on array element type
; due to type uncertainty:
; The first argument is a (VECTOR DOUBLE-FLOAT), not a SIMPLE-ARRAY.
4)
; (LOOP FOR J BELOW COLS
; SUM (THE DOUBLE-FLOAT (* (AREF MATRIX I J) (AREF VECTOR J))))
; --> BLOCK LET SB-LOOP::WITH-SUM-COUNT LET SB-LOOP::LOOP-BODY TAGBODY SETQ THE
; ==>
; (+ #:LOOP-SUM-8 (THE DOUBLE-FLOAT (* (AREF MATRIX I J) (AREF VECTOR J))))
;
; note: unable to
; optimize
; due to type uncertainty:
; The first argument is a NUMBER, not a (COMPLEX SINGLE-FLOAT).
;
; note: unable to
; optimize
; due to type uncertainty:
; The first argument is a NUMBER, not a (COMPLEX DOUBLE-FLOAT).
5)
; --> BLOCK LET SB-LOOP::WITH-SUM-COUNT LET SB-LOOP::LOOP-BODY TAGBODY WHEN IF
; --> >= OR LET IF OR THE = IF
; ==>
; (= SB-C::X SB-C::Y)
;
; note: unable to open code because: The operands might not be the same type.
6)
; (DOTIMES (I ROWS)
; (SETF (AREF DEST I)
; (LOOP FOR J BELOW COLS
; SUM (THE DOUBLE-FLOAT #))))
; --> DO BLOCK LET TAGBODY UNLESS IF >= IF
; ==>
; (< SB-C::X SB-C::Y)
;
; note: forced to do static-fun Two-arg-< (cost 53)
; unable to do inline fixnum comparison (cost 4) because:
; The second argument is a INTEGER, not a FIXNUM.
; unable to do inline (signed-byte 32) comparison (cost 6) because:
; The second argument is a INTEGER, not a (SIGNED-BYTE 32).
; etc.
7)
; (LOOP FOR J BELOW COLS
; SUM (THE DOUBLE-FLOAT (* (AREF MATRIX I J) (AREF VECTOR J))))
; --> BLOCK LET SB-LOOP::WITH-SUM-COUNT LET SB-LOOP::LOOP-BODY TAGBODY WHEN IF
; --> >= OR LET > IF
; ==>
; (> SB-C::X SB-C::Y)
;
; note: forced to do static-fun Two-arg-> (cost 53)
; unable to do inline fixnum comparison (cost 4) because:
; The second argument is a REAL, not a FIXNUM.
; unable to do inline (signed-byte 32) comparison (cost 6) because:
; The second argument is a REAL, not a (SIGNED-BYTE 32).
; etc.
8)
; --> BLOCK LET SB-LOOP::WITH-SUM-COUNT LET SB-LOOP::LOOP-BODY TAGBODY SETQ THE
; ==>
; (+ #:LOOP-SUM-8 (THE DOUBLE-FLOAT (* (AREF MATRIX I J) (AREF VECTOR J))))
;
; note: forced to do static-fun Two-arg-+ (cost 53)
; unable to do inline float arithmetic (cost 2) because:
; The first argument is a NUMBER, not a DOUBLE-FLOAT.
; The result is a (VALUES NUMBER &OPTIONAL), not a (VALUES DOUBLE-FLOAT
; &REST T).
;
; note: doing float to pointer coercion (cost 13), for:
; the second argument of static-fun Two-arg-+
;
; compilation unit finished
; printed 10 notes
对于当前的 sbcl (2.0.1),上面的代码仍然存在优化警告。
更改为:
double-float:
(defun matrix-mul (matrix vector dest)
"Multiply MATRIX by VECTOR putting the result into DEST.
Optimized for DOUBLE-FLOAT vectors and matrices"
(declare (type (simple-array double-float (* *)) matrix)
(type (simple-array double-float *) vector dest)
(optimize (speed 3) (debug 0) (safety 0)))
(destructuring-bind (rows cols) (array-dimensions matrix)
(declare (fixnum rows cols))
(dotimes (i rows)
(let ((sum 0.0d0))
(declare (double-float sum))
(setf (aref dest i)
(progn
(dotimes (j cols)
(incf sum (* (aref matrix i j) (aref vector j))))
sum)))))
dest)
single-float:
(defun matrix-mul (matrix vector dest)
"Multiply MATRIX by VECTOR putting the result into DEST.
Optimized for SINGLE-FLOAT vectors and matrices"
(declare (type (simple-array single-float (* *)) matrix)
(type (simple-array single-float *) vector dest)
(optimize (speed 3) (debug 0) (safety 0)))
(destructuring-bind (rows cols) (array-dimensions matrix)
(declare (fixnum rows cols))
(dotimes (i rows)
(let ((sum 0.0))
(declare (single-float sum))
(setf (aref dest i)
(progn
(dotimes (j cols)
(incf sum (* (aref matrix i j) (aref vector j))))
sum)))))
dest)
我正在用 CL(使用 SBCL 1.2.15)编写使用线性代数的程序。在执行的过程中,经常会把一个矩阵乘以一个向量。
Profiler 显示程序大部分时间 (80%) 都在做这个,矩阵乘以向量。它还显示此函数执行大量操作(80,000,000 次调用 100x100 矩阵约 100 次),这也会触发 GC。在 F# 中做了类似的事情(具有静态类型检查的优势,但通常不会优于 SBCL),F# 程序运行速度提高了 10 倍。
我做错了吗?
(defun matrix-mul (matrix vector dest)
"Multiply MATRIX by VECTOR putting the result into DEST.
Optimized for DOUBLE-FLOAT vectors and matrices"
(declare (type (array double-float (* *)) matrix)
(type (vector double-float *) vector dest)
(optimize (speed 3) (debug 0) (safety 0)))
(destructuring-bind (rows cols) (array-dimensions matrix)
(dotimes (i rows)
(setf (aref dest i)
(loop for j below cols sum (the double-float
(* (aref matrix i j) (aref vector j))))))))
PS。我也尝试过使用 DOTIMES 而不是内部循环 - 没有区别
PPS。下一个选项可以使用来自 CL 的 BLAS,但 (i) 我不期待让它在 Windows 上工作,(ii) 需要来回编组 matrices/vectors。
通常的问题是浮点运算,这里是双浮点数,(独立于周围的代码,如矩阵乘法)。
通常在这种情况下首先要处理 SBCL:
将代码放入文件并编译
考虑到编译速度,编译器会输出很多优化问题。然后你需要检查笔记,看看你能做什么。
例如这里的 LOOP
总和缺少类型信息。
实际上有一个LOOP
语法来声明和变量的类型。我不知道SBCL是否利用了这一点:
(loop repeat 10
sum 1.0d0 of-type double-float)
32 位 ARM 上的 SBCL 1.3.0 适用于您的代码:
* (compile-file "/tmp/test.lisp")
; compiling file "/tmp/test.lisp" (written 13 DEC 2015 11:34:26 AM):
; compiling (DEFUN MATRIX-MUL ...)
; file: /tmp/test.lisp
1)
; in: DEFUN MATRIX-MUL
; (SETF (AREF DEST I)
; (LOOP FOR J BELOW COLS
; SUM (THE DOUBLE-FLOAT (* # #))))
; --> LET* FUNCALL SB-C::%FUNCALL (SETF AREF)
; ==>
; (SB-KERNEL:HAIRY-DATA-VECTOR-SET ARRAY SB-INT:INDEX SB-C::NEW-VALUE)
;
; note: unable to
; avoid runtime dispatch on array element type
; due to type uncertainty:
; The first argument is a (VECTOR DOUBLE-FLOAT), not a SIMPLE-ARRAY.
2)
; (AREF MATRIX I J)
; --> LET*
; ==>
; (SB-KERNEL:HAIRY-DATA-VECTOR-REF ARRAY SB-INT:INDEX)
;
; note: unable to
; avoid runtime dispatch on array element type
; due to type uncertainty:
; The first argument is a (ARRAY DOUBLE-FLOAT (* *)), not a SIMPLE-ARRAY.
3)
; (AREF VECTOR J)
; ==>
; (SB-KERNEL:HAIRY-DATA-VECTOR-REF ARRAY SB-INT:INDEX)
;
; note: unable to
; avoid runtime dispatch on array element type
; due to type uncertainty:
; The first argument is a (VECTOR DOUBLE-FLOAT), not a SIMPLE-ARRAY.
4)
; (LOOP FOR J BELOW COLS
; SUM (THE DOUBLE-FLOAT (* (AREF MATRIX I J) (AREF VECTOR J))))
; --> BLOCK LET SB-LOOP::WITH-SUM-COUNT LET SB-LOOP::LOOP-BODY TAGBODY SETQ THE
; ==>
; (+ #:LOOP-SUM-8 (THE DOUBLE-FLOAT (* (AREF MATRIX I J) (AREF VECTOR J))))
;
; note: unable to
; optimize
; due to type uncertainty:
; The first argument is a NUMBER, not a (COMPLEX SINGLE-FLOAT).
;
; note: unable to
; optimize
; due to type uncertainty:
; The first argument is a NUMBER, not a (COMPLEX DOUBLE-FLOAT).
5)
; --> BLOCK LET SB-LOOP::WITH-SUM-COUNT LET SB-LOOP::LOOP-BODY TAGBODY WHEN IF
; --> >= OR LET IF OR THE = IF
; ==>
; (= SB-C::X SB-C::Y)
;
; note: unable to open code because: The operands might not be the same type.
6)
; (DOTIMES (I ROWS)
; (SETF (AREF DEST I)
; (LOOP FOR J BELOW COLS
; SUM (THE DOUBLE-FLOAT #))))
; --> DO BLOCK LET TAGBODY UNLESS IF >= IF
; ==>
; (< SB-C::X SB-C::Y)
;
; note: forced to do static-fun Two-arg-< (cost 53)
; unable to do inline fixnum comparison (cost 4) because:
; The second argument is a INTEGER, not a FIXNUM.
; unable to do inline (signed-byte 32) comparison (cost 6) because:
; The second argument is a INTEGER, not a (SIGNED-BYTE 32).
; etc.
7)
; (LOOP FOR J BELOW COLS
; SUM (THE DOUBLE-FLOAT (* (AREF MATRIX I J) (AREF VECTOR J))))
; --> BLOCK LET SB-LOOP::WITH-SUM-COUNT LET SB-LOOP::LOOP-BODY TAGBODY WHEN IF
; --> >= OR LET > IF
; ==>
; (> SB-C::X SB-C::Y)
;
; note: forced to do static-fun Two-arg-> (cost 53)
; unable to do inline fixnum comparison (cost 4) because:
; The second argument is a REAL, not a FIXNUM.
; unable to do inline (signed-byte 32) comparison (cost 6) because:
; The second argument is a REAL, not a (SIGNED-BYTE 32).
; etc.
8)
; --> BLOCK LET SB-LOOP::WITH-SUM-COUNT LET SB-LOOP::LOOP-BODY TAGBODY SETQ THE
; ==>
; (+ #:LOOP-SUM-8 (THE DOUBLE-FLOAT (* (AREF MATRIX I J) (AREF VECTOR J))))
;
; note: forced to do static-fun Two-arg-+ (cost 53)
; unable to do inline float arithmetic (cost 2) because:
; The first argument is a NUMBER, not a DOUBLE-FLOAT.
; The result is a (VALUES NUMBER &OPTIONAL), not a (VALUES DOUBLE-FLOAT
; &REST T).
;
; note: doing float to pointer coercion (cost 13), for:
; the second argument of static-fun Two-arg-+
;
; compilation unit finished
; printed 10 notes
对于当前的 sbcl (2.0.1),上面的代码仍然存在优化警告。
更改为:
double-float:
(defun matrix-mul (matrix vector dest)
"Multiply MATRIX by VECTOR putting the result into DEST.
Optimized for DOUBLE-FLOAT vectors and matrices"
(declare (type (simple-array double-float (* *)) matrix)
(type (simple-array double-float *) vector dest)
(optimize (speed 3) (debug 0) (safety 0)))
(destructuring-bind (rows cols) (array-dimensions matrix)
(declare (fixnum rows cols))
(dotimes (i rows)
(let ((sum 0.0d0))
(declare (double-float sum))
(setf (aref dest i)
(progn
(dotimes (j cols)
(incf sum (* (aref matrix i j) (aref vector j))))
sum)))))
dest)
single-float:
(defun matrix-mul (matrix vector dest)
"Multiply MATRIX by VECTOR putting the result into DEST.
Optimized for SINGLE-FLOAT vectors and matrices"
(declare (type (simple-array single-float (* *)) matrix)
(type (simple-array single-float *) vector dest)
(optimize (speed 3) (debug 0) (safety 0)))
(destructuring-bind (rows cols) (array-dimensions matrix)
(declare (fixnum rows cols))
(dotimes (i rows)
(let ((sum 0.0))
(declare (single-float sum))
(setf (aref dest i)
(progn
(dotimes (j cols)
(incf sum (* (aref matrix i j) (aref vector j))))
sum)))))
dest)