Lisp 中的数组与列表:为什么下面的代码中的列表要快得多?
Arrays vs. lists in Lisp: Why are lists so much faster in the code below?
我在求解 Problem 75 in Project Euler 时得到了意想不到的结果。我的代码确实找到了正确的解决方案,但它的行为很奇怪。
我的解决方案包括遍历一棵毕达哥拉斯树 (Barning's matrices) 直到达到周长限制,计算周长采用每个值的次数,最后计算只出现一次的周长.我公认的不整洁但功能正常的代码是:
(defparameter *barning-matrixes*
'(#(1 -2 2) #(2 -1 2) #(2 -2 3)
#(1 2 2) #(2 1 2) #(2 2 3)
#(-1 2 2) #(-2 1 2) #(-2 2 3)))
(defparameter *lengths* (make-array 1500001 :initial-element 0))
(defun expand-node (n)
"Takes a primitive Pythagorean triple in a vector and traverses subsequent nodes in the the tree of primitives until perimeter > 1,500,000"
(let ((perimeter (reduce #'+ n)))
(unless (> perimeter 1500000)
(let ((next-nodes (mapcar #'(lambda (x)
(reduce #'+ (map 'vector #'* n x))) *barning-matrixes*)))
(loop for i from perimeter to 1500000 by perimeter
do (incf (aref *lengths* i)))
(expand-node (subseq next-nodes 0 3))
(expand-node (subseq next-nodes 3 6))
(expand-node (subseq next-nodes 6 9))))))
(expand-node #(3 4 5)) ; Takes too darn long :-(
(count 1 *lengths*)
我预计树会在几毫秒内扩展到 运行,但是扩展节点函数花了 8.65 秒——比预期的要长很多——遍历一个不太大的树。
然而,当我调整代码以删除向量时,我感到很惊讶...
(defparameter *barning-matrixes*
'((1 -2 2) (2 -1 2) (2 -2 3)
(1 2 2) (2 1 2) (2 2 3)
(-1 2 2) (-2 1 2) (-2 2 3)))
(defparameter *lengths* (make-array 1500001 :initial-element 0))
(defun expand-node (n)
"Takes a primitive Pythagorean triple in a list and traverses subsequent nodes in the the tree of primitives until perimeter > 1,500,000"
(let ((perimeter (reduce #'+ n)))
(unless (> perimeter 1500000)
(let ((next-nodes (mapcar #'(lambda (x) (reduce #'+ (mapcar #'* n x))) *barning-matrixes*)))
(loop for i from perimeter to 1500000 by perimeter
do (incf (aref *lengths* i)))
(expand-node (subseq next-nodes 0 3))
(expand-node (subseq next-nodes 3 6))
(expand-node (subseq next-nodes 6 9))))))
(expand-node '(3 4 5)) ; Much faster, but why?!
(count 1 *lengths*)
...并且遍历速度大大加快,仅用了 35 毫秒。我对这种巨大的差异很感兴趣,希望有人能解释为什么会这样。
谢谢,
保罗
PS:我用的是 CCL。
你没有说你使用的是哪个实现。
你需要找出时间花在了哪里。
但对我来说,在您的 Common Lisp 中实现 MAP
列表和与新向量等长的向量似乎效率很低。
即使在使用一个有一些开销的新向量时,实现也会快得多。
尝试将向量运算实现为一个循环并比较:
(loop with v = (make-array (length n))
for n1 across n
for x1 across x
for i from 0
do (setf (aref v i) (* n1 x1))
finally (return v))
这个更快的版本也包含在内,但是用向量操作代替了列表操作:
(defparameter *barning-matrixes*
#(#(1 -2 2) #(2 -1 2) #(2 -2 3) #(1 2 2) #(2 1 2) #(2 2 3) #(-1 2 2) #(-2 1 2) #(-2 2 3)))
(defparameter *lengths* (make-array 1500001 :initial-element 0))
(defun expand-node (n)
"Takes a primitive Pythagorean triple in a vector and traverses subsequent nodes in the the tree of primitives until perimeter > 1,500,000"
(let ((perimeter (reduce #'+ n)))
(unless (> perimeter 1500000)
(let ((next-nodes
(loop with v = (make-array (length *barning-matrixes*))
for e across *barning-matrixes*
for i from 0
do (setf (aref v i)
(reduce #'+
(loop with v = (make-array (length n))
for n1 across n
for x1 across e
for i from 0
do (setf (aref v i) (* n1 x1))
finally (return v))))
finally (return v))))
(loop for i from perimeter to 1500000 by perimeter
do (incf (aref *lengths* i)))
(expand-node (subseq next-nodes 0 3))
(expand-node (subseq next-nodes 3 6))
(expand-node (subseq next-nodes 6 9))))))
(time (expand-node #(3 4 5)))
让我们看看你的代码:
(defun expand-node (n)
; here we don't know of which type N is. You call it from the toplevel
; with a vector, but recursive calls call it with a list
"Takes a primitive Pythagorean triple in a vector and traverses
subsequent nodes in the the tree of primitives until perimeter > 1,500,000"
(let ((perimeter (reduce #'+ n)))
(unless (> perimeter 1500000)
(let ((next-nodes (mapcar #'(lambda (x) ; this mapcar creates a list
(reduce #'+
(map 'vector
#'*
n ; <- list or vector
x))) ; <- vector
*barning-matrixes*)))
(loop for i from perimeter to 1500000 by perimeter
do (incf (aref *lengths* i)))
(expand-node (subseq next-nodes 0 3)) ; this subseq returns a list most of the times...
(expand-node (subseq next-nodes 3 6))
(expand-node (subseq next-nodes 6 9))))))
所以大多数时候你用一个列表和一个向量调用 MAP
。
结果向量的大小是多少? MAP 必须通过遍历列表或其他方式找出答案。结果向量长度是参数序列长度中最短的。然后它必须遍历列表和向量。如果 MAP 现在使用通用序列操作,则访问列表的元素总是遍历列表。显然,可以编写一个优化版本,它可以更快地完成所有操作,但是 Common Lisp 实现可能会选择仅提供 MAP 的通用实现...
在 SBCL 上检查向量 #(1 2 3) 给出:
Dimensions: (3)
Element type: T
Total size: 3
Adjustable: NIL
Fill pointer: NIL
Contents:
0: 1
1: 2
2: 3
您可以看到要存储的数据比列表中要多一点,尽管向量的确切内部表示因实现而异。对于像您的示例中那样不断复制的小向量,您最终可能会分配比列表更多的内存,这在下面的 bytes consed 行中可见。分配内存有助于 运行 时间。在我的测试中,请注意时间差异没有您的测试大。
;; VECTORS
(time (expand-node #(3 4 5)))
;; Evaluation took:
;; 2.060 seconds of real time
;; 2.062500 seconds of total run time (1.765625 user, 0.296875 system)
;; [ Run times consist of 0.186 seconds GC time, and 1.877 seconds non-GC time. ]
;; 100.10% CPU
;; 4,903,137,055 processor cycles
;; 202,276,032 bytes consed
;; LISTS
(time (expand-node* '(3 4 5)))
;; Evaluation took:
;; 0.610 seconds of real time
;; 0.609375 seconds of total run time (0.609375 user, 0.000000 system)
;; [ Run times consist of 0.016 seconds GC time, and 0.594 seconds non-GC time. ]
;; 99.84% CPU
;; 1,432,603,387 processor cycles
;; 80,902,560 bytes consed
欢迎来到复杂的 Common Lisp 优化!
首先要注意的是不同实现执行的不同程序优化策略:我在 SBCL 中尝试了您的示例,它们几乎在同一时间执行得非常高效,而在 CCL 中,向量版本的执行速度比列表版本。我不知道你尝试过哪种实现,但你可以尝试使用不同的实现来查看非常不同的执行时间。
从 CCL 中的一些测试来看,在我看来,主要问题来自于这种形式:
(map 'vector #'* n x)
执行速度比相应的列表版本慢得多:
(mapcar #'* n x)
用time
我看到矢量版conses很多
通过使用辅助向量将 map
简单地更改为 map-into
,这一第一印象得到了证实。实际上,CCL 中的以下版本比列表版本稍快:
(defun expand-node (n)
"Takes a primitive Pythagorean triple in a vector and traverses subsequent nodes in the the tree of primitives until perimeter > 1,500,000"
(let ((perimeter (reduce #'+ n))
(temp-vector (make-array 3 :initial-element 0)))
(unless (> perimeter 1500000)
(let ((next-nodes (mapcar #'(lambda (x)
(reduce #'+ (map-into temp-vector #'* n x))) *barning-matrixes*)))
(loop for i from perimeter to 1500000 by perimeter
do (incf (aref *lengths* i)))
(expand-node (subseq next-nodes 0 3))
(expand-node (subseq next-nodes 3 6))
(expand-node (subseq next-nodes 6 9))))))
我在优化代码的时候大家已经回答过了,所以我就把这个版本放在这里,不多做解释了。它应该 运行 相当快,至少在 SBCL 上是这样。
(declaim (optimize (speed 3) (safety 0) (debug 0)))
(declaim (type (simple-array (simple-array fixnum 1) 1) *barning-matrixes*))
(defparameter *barning-matrixes*
(map '(simple-array (simple-array fixnum 1) 1)
(lambda (list)
(make-array 3 :element-type 'fixnum
:initial-contents list))
'((1 -2 2) (2 -1 2) (2 -2 3)
(1 2 2) (2 1 2) (2 2 3)
(-1 2 2) (-2 1 2) (-2 2 3))))
(declaim (type (simple-array fixnum 1) *lengths*))
(defparameter *lengths* (make-array 1500001 :element-type 'fixnum
:initial-element 0))
(declaim (ftype (function ((simple-array fixnum 1))) expand-node))
(defun expand-node (n)
"Takes a primitive Pythagorean triple in a vector and traverses subsequent nodes in the the tree of primitives until perimeter > 1,500,000"
(loop with list-of-ns = (list n)
for n = (pop list-of-ns)
while n
do (let ((perimeter (let ((result 0))
(declare (type fixnum result))
(dotimes (i (length n) result)
(incf result (aref n i))))))
(declare (type fixnum perimeter))
(unless (> perimeter 1500000)
(let ((next-nodes
(let ((result (list)))
(dotimes (matrix 9 (nreverse result))
(let ((matrix (aref *barning-matrixes* matrix)))
(push (let ((result 0))
(declare (type fixnum result))
(dotimes (i 3 result)
(incf result
(the fixnum
(* (the fixnum (aref matrix i))
(the fixnum (aref n i)))))))
result))))))
(declare (type list next-nodes))
(loop for i from perimeter to 1500000 by perimeter
do (incf (aref *lengths* i)))
(dotimes (i 3)
(push (make-array 3 :element-type 'fixnum
:initial-contents (list (pop next-nodes)
(pop next-nodes)
(pop next-nodes)))
list-of-ns))))))
(values))
在我的慢笔记本电脑上,
CL-USER> (load (compile-file #P"e75.lisp"))
; ...compilation notes...
CL-USER> (time (expand-node (make-array 3 :element-type 'fixnum
:initial-contents '(3 4 5))))
Evaluation took:
0.274 seconds of real time
0.264000 seconds of total run time (0.264000 user, 0.000000 system)
96.35% CPU
382,768,596 processor cycles
35,413,600 bytes consed
; No values
CL-USER> (count 1 *lengths*)
161667 (18 bits, #x27783)
原始代码 运行 使用向量大约需要 1.8 秒,使用列表大约需要 0.8 秒。
我在求解 Problem 75 in Project Euler 时得到了意想不到的结果。我的代码确实找到了正确的解决方案,但它的行为很奇怪。
我的解决方案包括遍历一棵毕达哥拉斯树 (Barning's matrices) 直到达到周长限制,计算周长采用每个值的次数,最后计算只出现一次的周长.我公认的不整洁但功能正常的代码是:
(defparameter *barning-matrixes*
'(#(1 -2 2) #(2 -1 2) #(2 -2 3)
#(1 2 2) #(2 1 2) #(2 2 3)
#(-1 2 2) #(-2 1 2) #(-2 2 3)))
(defparameter *lengths* (make-array 1500001 :initial-element 0))
(defun expand-node (n)
"Takes a primitive Pythagorean triple in a vector and traverses subsequent nodes in the the tree of primitives until perimeter > 1,500,000"
(let ((perimeter (reduce #'+ n)))
(unless (> perimeter 1500000)
(let ((next-nodes (mapcar #'(lambda (x)
(reduce #'+ (map 'vector #'* n x))) *barning-matrixes*)))
(loop for i from perimeter to 1500000 by perimeter
do (incf (aref *lengths* i)))
(expand-node (subseq next-nodes 0 3))
(expand-node (subseq next-nodes 3 6))
(expand-node (subseq next-nodes 6 9))))))
(expand-node #(3 4 5)) ; Takes too darn long :-(
(count 1 *lengths*)
我预计树会在几毫秒内扩展到 运行,但是扩展节点函数花了 8.65 秒——比预期的要长很多——遍历一个不太大的树。
然而,当我调整代码以删除向量时,我感到很惊讶...
(defparameter *barning-matrixes*
'((1 -2 2) (2 -1 2) (2 -2 3)
(1 2 2) (2 1 2) (2 2 3)
(-1 2 2) (-2 1 2) (-2 2 3)))
(defparameter *lengths* (make-array 1500001 :initial-element 0))
(defun expand-node (n)
"Takes a primitive Pythagorean triple in a list and traverses subsequent nodes in the the tree of primitives until perimeter > 1,500,000"
(let ((perimeter (reduce #'+ n)))
(unless (> perimeter 1500000)
(let ((next-nodes (mapcar #'(lambda (x) (reduce #'+ (mapcar #'* n x))) *barning-matrixes*)))
(loop for i from perimeter to 1500000 by perimeter
do (incf (aref *lengths* i)))
(expand-node (subseq next-nodes 0 3))
(expand-node (subseq next-nodes 3 6))
(expand-node (subseq next-nodes 6 9))))))
(expand-node '(3 4 5)) ; Much faster, but why?!
(count 1 *lengths*)
...并且遍历速度大大加快,仅用了 35 毫秒。我对这种巨大的差异很感兴趣,希望有人能解释为什么会这样。
谢谢, 保罗
PS:我用的是 CCL。
你没有说你使用的是哪个实现。
你需要找出时间花在了哪里。
但对我来说,在您的 Common Lisp 中实现 MAP
列表和与新向量等长的向量似乎效率很低。
即使在使用一个有一些开销的新向量时,实现也会快得多。
尝试将向量运算实现为一个循环并比较:
(loop with v = (make-array (length n))
for n1 across n
for x1 across x
for i from 0
do (setf (aref v i) (* n1 x1))
finally (return v))
这个更快的版本也包含在内,但是用向量操作代替了列表操作:
(defparameter *barning-matrixes*
#(#(1 -2 2) #(2 -1 2) #(2 -2 3) #(1 2 2) #(2 1 2) #(2 2 3) #(-1 2 2) #(-2 1 2) #(-2 2 3)))
(defparameter *lengths* (make-array 1500001 :initial-element 0))
(defun expand-node (n)
"Takes a primitive Pythagorean triple in a vector and traverses subsequent nodes in the the tree of primitives until perimeter > 1,500,000"
(let ((perimeter (reduce #'+ n)))
(unless (> perimeter 1500000)
(let ((next-nodes
(loop with v = (make-array (length *barning-matrixes*))
for e across *barning-matrixes*
for i from 0
do (setf (aref v i)
(reduce #'+
(loop with v = (make-array (length n))
for n1 across n
for x1 across e
for i from 0
do (setf (aref v i) (* n1 x1))
finally (return v))))
finally (return v))))
(loop for i from perimeter to 1500000 by perimeter
do (incf (aref *lengths* i)))
(expand-node (subseq next-nodes 0 3))
(expand-node (subseq next-nodes 3 6))
(expand-node (subseq next-nodes 6 9))))))
(time (expand-node #(3 4 5)))
让我们看看你的代码:
(defun expand-node (n)
; here we don't know of which type N is. You call it from the toplevel
; with a vector, but recursive calls call it with a list
"Takes a primitive Pythagorean triple in a vector and traverses
subsequent nodes in the the tree of primitives until perimeter > 1,500,000"
(let ((perimeter (reduce #'+ n)))
(unless (> perimeter 1500000)
(let ((next-nodes (mapcar #'(lambda (x) ; this mapcar creates a list
(reduce #'+
(map 'vector
#'*
n ; <- list or vector
x))) ; <- vector
*barning-matrixes*)))
(loop for i from perimeter to 1500000 by perimeter
do (incf (aref *lengths* i)))
(expand-node (subseq next-nodes 0 3)) ; this subseq returns a list most of the times...
(expand-node (subseq next-nodes 3 6))
(expand-node (subseq next-nodes 6 9))))))
所以大多数时候你用一个列表和一个向量调用 MAP
。
结果向量的大小是多少? MAP 必须通过遍历列表或其他方式找出答案。结果向量长度是参数序列长度中最短的。然后它必须遍历列表和向量。如果 MAP 现在使用通用序列操作,则访问列表的元素总是遍历列表。显然,可以编写一个优化版本,它可以更快地完成所有操作,但是 Common Lisp 实现可能会选择仅提供 MAP 的通用实现...
在 SBCL 上检查向量 #(1 2 3) 给出:
Dimensions: (3)
Element type: T
Total size: 3
Adjustable: NIL
Fill pointer: NIL
Contents:
0: 1
1: 2
2: 3
您可以看到要存储的数据比列表中要多一点,尽管向量的确切内部表示因实现而异。对于像您的示例中那样不断复制的小向量,您最终可能会分配比列表更多的内存,这在下面的 bytes consed 行中可见。分配内存有助于 运行 时间。在我的测试中,请注意时间差异没有您的测试大。
;; VECTORS
(time (expand-node #(3 4 5)))
;; Evaluation took:
;; 2.060 seconds of real time
;; 2.062500 seconds of total run time (1.765625 user, 0.296875 system)
;; [ Run times consist of 0.186 seconds GC time, and 1.877 seconds non-GC time. ]
;; 100.10% CPU
;; 4,903,137,055 processor cycles
;; 202,276,032 bytes consed
;; LISTS
(time (expand-node* '(3 4 5)))
;; Evaluation took:
;; 0.610 seconds of real time
;; 0.609375 seconds of total run time (0.609375 user, 0.000000 system)
;; [ Run times consist of 0.016 seconds GC time, and 0.594 seconds non-GC time. ]
;; 99.84% CPU
;; 1,432,603,387 processor cycles
;; 80,902,560 bytes consed
欢迎来到复杂的 Common Lisp 优化! 首先要注意的是不同实现执行的不同程序优化策略:我在 SBCL 中尝试了您的示例,它们几乎在同一时间执行得非常高效,而在 CCL 中,向量版本的执行速度比列表版本。我不知道你尝试过哪种实现,但你可以尝试使用不同的实现来查看非常不同的执行时间。
从 CCL 中的一些测试来看,在我看来,主要问题来自于这种形式:
(map 'vector #'* n x)
执行速度比相应的列表版本慢得多:
(mapcar #'* n x)
用time
我看到矢量版conses很多
通过使用辅助向量将 map
简单地更改为 map-into
,这一第一印象得到了证实。实际上,CCL 中的以下版本比列表版本稍快:
(defun expand-node (n)
"Takes a primitive Pythagorean triple in a vector and traverses subsequent nodes in the the tree of primitives until perimeter > 1,500,000"
(let ((perimeter (reduce #'+ n))
(temp-vector (make-array 3 :initial-element 0)))
(unless (> perimeter 1500000)
(let ((next-nodes (mapcar #'(lambda (x)
(reduce #'+ (map-into temp-vector #'* n x))) *barning-matrixes*)))
(loop for i from perimeter to 1500000 by perimeter
do (incf (aref *lengths* i)))
(expand-node (subseq next-nodes 0 3))
(expand-node (subseq next-nodes 3 6))
(expand-node (subseq next-nodes 6 9))))))
我在优化代码的时候大家已经回答过了,所以我就把这个版本放在这里,不多做解释了。它应该 运行 相当快,至少在 SBCL 上是这样。
(declaim (optimize (speed 3) (safety 0) (debug 0)))
(declaim (type (simple-array (simple-array fixnum 1) 1) *barning-matrixes*))
(defparameter *barning-matrixes*
(map '(simple-array (simple-array fixnum 1) 1)
(lambda (list)
(make-array 3 :element-type 'fixnum
:initial-contents list))
'((1 -2 2) (2 -1 2) (2 -2 3)
(1 2 2) (2 1 2) (2 2 3)
(-1 2 2) (-2 1 2) (-2 2 3))))
(declaim (type (simple-array fixnum 1) *lengths*))
(defparameter *lengths* (make-array 1500001 :element-type 'fixnum
:initial-element 0))
(declaim (ftype (function ((simple-array fixnum 1))) expand-node))
(defun expand-node (n)
"Takes a primitive Pythagorean triple in a vector and traverses subsequent nodes in the the tree of primitives until perimeter > 1,500,000"
(loop with list-of-ns = (list n)
for n = (pop list-of-ns)
while n
do (let ((perimeter (let ((result 0))
(declare (type fixnum result))
(dotimes (i (length n) result)
(incf result (aref n i))))))
(declare (type fixnum perimeter))
(unless (> perimeter 1500000)
(let ((next-nodes
(let ((result (list)))
(dotimes (matrix 9 (nreverse result))
(let ((matrix (aref *barning-matrixes* matrix)))
(push (let ((result 0))
(declare (type fixnum result))
(dotimes (i 3 result)
(incf result
(the fixnum
(* (the fixnum (aref matrix i))
(the fixnum (aref n i)))))))
result))))))
(declare (type list next-nodes))
(loop for i from perimeter to 1500000 by perimeter
do (incf (aref *lengths* i)))
(dotimes (i 3)
(push (make-array 3 :element-type 'fixnum
:initial-contents (list (pop next-nodes)
(pop next-nodes)
(pop next-nodes)))
list-of-ns))))))
(values))
在我的慢笔记本电脑上,
CL-USER> (load (compile-file #P"e75.lisp"))
; ...compilation notes...
CL-USER> (time (expand-node (make-array 3 :element-type 'fixnum
:initial-contents '(3 4 5))))
Evaluation took:
0.274 seconds of real time
0.264000 seconds of total run time (0.264000 user, 0.000000 system)
96.35% CPU
382,768,596 processor cycles
35,413,600 bytes consed
; No values
CL-USER> (count 1 *lengths*)
161667 (18 bits, #x27783)
原始代码 运行 使用向量大约需要 1.8 秒,使用列表大约需要 0.8 秒。