Stream.toList() 会比 Collectors.toList() 表现更好吗
Would Stream.toList() perform better than Collectors.toList()
JDK 正在引入 API Stream.toList()
with JDK-8180352。这是我试图将其性能与现有 Collectors.toList
:
进行比较的基准测试代码
@BenchmarkMode(Mode.All)
@Fork(1)
@State(Scope.Thread)
@Warmup(iterations = 20, time = 1, batchSize = 10000)
@Measurement(iterations = 20, time = 1, batchSize = 10000)
public class CollectorsVsStreamToList {
@Benchmark
public List<Integer> viaCollectors() {
return IntStream.range(1, 1000).boxed().collect(Collectors.toList());
}
@Benchmark
public List<Integer> viaStream() {
return IntStream.range(1, 1000).boxed().toList();
}
}
结果汇总如下:
Benchmark Mode Cnt Score Error Units
CollectorsVsStreamToList.viaCollectors thrpt 20 17.321 ± 0.583 ops/s
CollectorsVsStreamToList.viaStream thrpt 20 23.879 ± 1.682 ops/s
CollectorsVsStreamToList.viaCollectors avgt 20 0.057 ± 0.002 s/op
CollectorsVsStreamToList.viaStream avgt 20 0.040 ± 0.001 s/op
CollectorsVsStreamToList.viaCollectors sample 380 0.054 ± 0.001 s/op
CollectorsVsStreamToList.viaCollectors:viaCollectors·p0.00 sample 0.051 s/op
CollectorsVsStreamToList.viaCollectors:viaCollectors·p0.50 sample 0.054 s/op
CollectorsVsStreamToList.viaCollectors:viaCollectors·p0.90 sample 0.058 s/op
CollectorsVsStreamToList.viaCollectors:viaCollectors·p0.95 sample 0.058 s/op
CollectorsVsStreamToList.viaCollectors:viaCollectors·p0.99 sample 0.062 s/op
CollectorsVsStreamToList.viaCollectors:viaCollectors·p0.999 sample 0.068 s/op
CollectorsVsStreamToList.viaCollectors:viaCollectors·p0.9999 sample 0.068 s/op
CollectorsVsStreamToList.viaCollectors:viaCollectors·p1.00 sample 0.068 s/op
CollectorsVsStreamToList.viaStream sample 525 0.039 ± 0.001 s/op
CollectorsVsStreamToList.viaStream:viaStream·p0.00 sample 0.037 s/op
CollectorsVsStreamToList.viaStream:viaStream·p0.50 sample 0.038 s/op
CollectorsVsStreamToList.viaStream:viaStream·p0.90 sample 0.040 s/op
CollectorsVsStreamToList.viaStream:viaStream·p0.95 sample 0.042 s/op
CollectorsVsStreamToList.viaStream:viaStream·p0.99 sample 0.050 s/op
CollectorsVsStreamToList.viaStream:viaStream·p0.999 sample 0.051 s/op
CollectorsVsStreamToList.viaStream:viaStream·p0.9999 sample 0.051 s/op
CollectorsVsStreamToList.viaStream:viaStream·p1.00 sample 0.051 s/op
CollectorsVsStreamToList.viaCollectors ss 20 0.060 ± 0.007 s/op
CollectorsVsStreamToList.viaStream ss 20 0.043 ± 0.006 s/op
当然,领域专家的第一个问题是基准程序是否正确?测试 class 在 MacOS 上执行。如果需要任何进一步的详细信息,请告诉我。
后续,据我从读数中推断,Stream.toList
的平均时间、吞吐量和采样时间看起来比Collectors.toList
好。这样理解对吗?
Stream::toList
建立在 toArray
之上,而不是 collect
。 toArray
中有许多优化使其可能比收集更快,尽管这在很大程度上取决于细节。如果流管道(从源到最终的中间操作)是 SIZED
,则目标数组可以预先确定大小(而不是像 toList
收集器必须做的那样可能重新分配。)如果管道进一步 SUBSIZED
,那么并行执行不仅可以预先确定结果数组的大小,还可以计算精确的每个分片偏移量,因此每个子任务都可以将其结果放在正确的位置,从而无需将中间结果复制到最终结果中。
因此,根据详细信息,toList
可能比 collect
快得多。
JDK 正在引入 API Stream.toList()
with JDK-8180352。这是我试图将其性能与现有 Collectors.toList
:
@BenchmarkMode(Mode.All)
@Fork(1)
@State(Scope.Thread)
@Warmup(iterations = 20, time = 1, batchSize = 10000)
@Measurement(iterations = 20, time = 1, batchSize = 10000)
public class CollectorsVsStreamToList {
@Benchmark
public List<Integer> viaCollectors() {
return IntStream.range(1, 1000).boxed().collect(Collectors.toList());
}
@Benchmark
public List<Integer> viaStream() {
return IntStream.range(1, 1000).boxed().toList();
}
}
结果汇总如下:
Benchmark Mode Cnt Score Error Units
CollectorsVsStreamToList.viaCollectors thrpt 20 17.321 ± 0.583 ops/s
CollectorsVsStreamToList.viaStream thrpt 20 23.879 ± 1.682 ops/s
CollectorsVsStreamToList.viaCollectors avgt 20 0.057 ± 0.002 s/op
CollectorsVsStreamToList.viaStream avgt 20 0.040 ± 0.001 s/op
CollectorsVsStreamToList.viaCollectors sample 380 0.054 ± 0.001 s/op
CollectorsVsStreamToList.viaCollectors:viaCollectors·p0.00 sample 0.051 s/op
CollectorsVsStreamToList.viaCollectors:viaCollectors·p0.50 sample 0.054 s/op
CollectorsVsStreamToList.viaCollectors:viaCollectors·p0.90 sample 0.058 s/op
CollectorsVsStreamToList.viaCollectors:viaCollectors·p0.95 sample 0.058 s/op
CollectorsVsStreamToList.viaCollectors:viaCollectors·p0.99 sample 0.062 s/op
CollectorsVsStreamToList.viaCollectors:viaCollectors·p0.999 sample 0.068 s/op
CollectorsVsStreamToList.viaCollectors:viaCollectors·p0.9999 sample 0.068 s/op
CollectorsVsStreamToList.viaCollectors:viaCollectors·p1.00 sample 0.068 s/op
CollectorsVsStreamToList.viaStream sample 525 0.039 ± 0.001 s/op
CollectorsVsStreamToList.viaStream:viaStream·p0.00 sample 0.037 s/op
CollectorsVsStreamToList.viaStream:viaStream·p0.50 sample 0.038 s/op
CollectorsVsStreamToList.viaStream:viaStream·p0.90 sample 0.040 s/op
CollectorsVsStreamToList.viaStream:viaStream·p0.95 sample 0.042 s/op
CollectorsVsStreamToList.viaStream:viaStream·p0.99 sample 0.050 s/op
CollectorsVsStreamToList.viaStream:viaStream·p0.999 sample 0.051 s/op
CollectorsVsStreamToList.viaStream:viaStream·p0.9999 sample 0.051 s/op
CollectorsVsStreamToList.viaStream:viaStream·p1.00 sample 0.051 s/op
CollectorsVsStreamToList.viaCollectors ss 20 0.060 ± 0.007 s/op
CollectorsVsStreamToList.viaStream ss 20 0.043 ± 0.006 s/op
当然,领域专家的第一个问题是基准程序是否正确?测试 class 在 MacOS 上执行。如果需要任何进一步的详细信息,请告诉我。
后续,据我从读数中推断,Stream.toList
的平均时间、吞吐量和采样时间看起来比Collectors.toList
好。这样理解对吗?
Stream::toList
建立在 toArray
之上,而不是 collect
。 toArray
中有许多优化使其可能比收集更快,尽管这在很大程度上取决于细节。如果流管道(从源到最终的中间操作)是 SIZED
,则目标数组可以预先确定大小(而不是像 toList
收集器必须做的那样可能重新分配。)如果管道进一步 SUBSIZED
,那么并行执行不仅可以预先确定结果数组的大小,还可以计算精确的每个分片偏移量,因此每个子任务都可以将其结果放在正确的位置,从而无需将中间结果复制到最终结果中。
因此,根据详细信息,toList
可能比 collect
快得多。