Clojure: input-stream slower than reader
I am trying to read bytes from an input stream, and it is much slower than reading characters with a reader. I don't understand why. Here is my test:

    (defn r1
      [input]
      (loop []
        (when-not (= -1 (.read ^java.io.InputStream input))
          (recur))))

    (defn r2
      [input]
      (loop []
        (when-not (.read input)
          (recur))))

    (dotimes [_ 10]
      (time (with-open [is (clojure.java.io/input-stream "15mb.log")]
              (r1 is))))

    "Elapsed time: 111.608991 msecs"
    "Elapsed time: 95.45663 msecs"
    "Elapsed time: 148.789867 msecs"
    "Elapsed time: 97.580527 msecs"
    "Elapsed time: 113.093759 msecs"
    "Elapsed time: 108.306019 msecs"
    "Elapsed time: 107.71069 msecs"
    "Elapsed time: 104.833343 msecs"
    "Elapsed time: 174.701027 msecs"
    "Elapsed time: 141.969629 msecs"

    (dotimes [_ 10]
      (time (with-open [r (clojure.java.io/reader "15mb.log")]
              (r2 r))))

    "Elapsed time: 0.635769 msecs"
    "Elapsed time: 0.422315 msecs"
    "Elapsed time: 0.355953 msecs"
    "Elapsed time: 0.336128 msecs"
    "Elapsed time: 0.333523 msecs"
    "Elapsed time: 0.339613 msecs"
    "Elapsed time: 0.329693 msecs"
    "Elapsed time: 0.234213 msecs"
    "Elapsed time: 0.209742 msecs"
    "Elapsed time: 0.199334 msecs"

As far as I know, clojure.java.io/input-stream uses a BufferedInputStream and clojure.java.io/reader uses a BufferedReader, so there is no reason for such a significant difference in speed. Am I missing something?
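
Your test is flawed. Both BufferedReader and BufferedInputStream return -1 at the end of the stream, so your test for r2 should also be (when-not (= -1 (.read ... . As written, (.read input) returns an integer, and every integer is truthy in Clojure (only nil and false are falsey), so the r2 loop exits after a single read instead of consuming the file. That is why it appears to finish in well under a millisecond.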
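
To make the flaw concrete, here is a minimal sketch that counts how many times .read is called before each version of the loop stops. The helper name reads-until-stop and the use of a StringReader in place of the log file are assumptions made for illustration; they are not part of the original test.

```clojure
(import '(java.io Reader StringReader))

(defn reads-until-stop
  "Calls (.read rdr) repeatedly until (stop? result) is truthy.
  Returns the number of .read calls made. Hypothetical helper."
  [^Reader rdr stop?]
  (loop [n 1]
    (if (stop? (.read rdr))
      n
      (recur (inc n)))))

;; r2 as written stops as soon as (.read input) is truthy,
;; and every integer is truthy, so it stops after one call.
(println (reads-until-stop (StringReader. "hello") boolean))

;; The corrected condition keeps reading until -1 (end of stream):
;; five characters plus the final call that returns -1.
(println (reads-until-stop (StringReader. "hello") #(= -1 %)))
```

With the five-character input, the first call reports 1 read and the second reports 6, which is the difference between touching the first character and draining the whole stream.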
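The test method below is not precise down to fractions of a millisecond, but it is accurate enough for this comparison, and benchmarking with the excellent criterium library produces similar results. Here is the test again in a more compact form for easy copy/paste: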

    ;; Create the test file first:
    ;;   $ dd if=/dev/zero of=zerofile bs=1k count=1k
    (let [testfile "zerofile"]
      ;; dorun forces the lazy seq so the timings also run outside a REPL
      (dorun
        (map (fn [func label]
               (println label)
               (dotimes [_ 3]
                 (time (with-open [data (func testfile)]
                         (while (not= -1 (.read data)))))))
             [clojure.java.io/input-stream clojure.java.io/reader]
             ["Input Stream:" "\nReader:"])))

One result:

    Input Stream:
    "Elapsed time: 624.01494 msecs"
    "Elapsed time: 650.407183 msecs"
    "Elapsed time: 627.244097 msecs"

    Reader:
    "Elapsed time: 706.776733 msecs"
    "Elapsed time: 691.887275 msecs"
    "Elapsed time: 703.918226 msecs"
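
One caveat about the compact test: (.read data) is called without a type hint, so Clojure resolves the method by reflection on every call, which may account for much of the elapsed time in both cases. The following is a rough sketch of the same comparison with type-hinted read loops. The function names drain-stream and drain-reader are hypothetical, and an in-memory byte array of zeros stands in for the dd-generated file so the snippet is self-contained; timings will differ from machine to machine.

```clojure
(require '[clojure.java.io :as io])
(import '(java.io InputStream Reader))

(defn drain-stream
  "Reads an InputStream one byte at a time until EOF; returns the byte count."
  [^InputStream in]
  (loop [n 0]
    (if (= -1 (.read in))
      n
      (recur (inc n)))))

(defn drain-reader
  "Reads a Reader one char at a time until EOF; returns the char count."
  [^Reader rdr]
  (loop [n 0]
    (if (= -1 (.read rdr))
      n
      (recur (inc n)))))

;; 1 MiB of zero bytes, standing in for the "zerofile" created with dd.
(let [data (byte-array (* 1024 1024))]
  (doseq [[label open drain] [["Input Stream:" io/input-stream drain-stream]
                              ["\nReader:" io/reader drain-reader]]]
    (println label)
    (dotimes [_ 3]
      (time (with-open [src (open data)]
              (drain src))))))
```

Evaluating (set! *warn-on-reflection* true) at the REPL before defining a read loop is a quick way to check whether a given .read call is reflective.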