How to ensure order of processing in Java 8 streams?

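I want to process lists inside an XML Java object. I have to ensure that all elements are processed in the same order in which I received them.

Should I therefore call sequential on each stream I use? list.stream().sequential().filter().forEach()

Or is it sufficient to just use the stream as long as I don't use parallelism? list.stream().filter().forEach()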

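You are asking the wrong question. You are asking about sequential vs. parallel, whereas you want to process items in order, so you have to ask about ordering. If you have an ordered stream and perform operations which guarantee to maintain the order, it doesn't matter whether the stream is processed in parallel or sequentially; the implementation will maintain the order.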

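The ordered property is distinct from parallel vs. sequential. E.g. if you call stream() on a HashSet the stream will be unordered, while calling stream() on a List returns an ordered stream. Note that you can call unordered() to release the ordering contract and potentially increase performance. Once the stream has no ordering, there is no way to reestablish the ordering. (The only way to turn an unordered stream into an ordered one is to call sorted, however, the resulting order is not necessarily the original order.)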
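If you want to see the ordered property for yourself, one way is to ask the stream's spliterator for the ORDERED characteristic. The following is only a small sketch; the class and method names (SourceOrderCheck, hasEncounterOrder) are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.Spliterator;

// Hypothetical helper class, not part of any library.
class SourceOrderCheck {

    // Reports whether a stream opened on the given collection has a defined encounter order.
    public static boolean hasEncounterOrder(Collection<?> source) {
        return source.stream().spliterator().hasCharacteristics(Spliterator.ORDERED);
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("a", "b", "c");
        Set<String> set = new HashSet<>(list);

        System.out.println("List stream ordered:    " + hasEncounterOrder(list)); // true
        System.out.println("HashSet stream ordered: " + hasEncounterOrder(set));  // false
    }
}
```

Note that neither stream here is parallel; the property depends only on the source.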

See also the “Ordering” section of the java.util.stream package documentation.

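In order to ensure maintenance of ordering throughout an entire stream operation, you have to study the documentation of the stream's source, all intermediate operations and the terminal operation for whether they maintain the order or not (or whether the source has an ordering in the first place).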

This can be very subtle, e.g. Stream.iterate(T,UnaryOperator) creates an ordered stream while Stream.generate(Supplier) creates an unordered stream. Note that you also made a common mistake in your question as forEach does not maintain the ordering. You have to use forEachOrdered if you want to process the stream's elements in a guaranteed order.
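The iterate/generate difference can be checked the same way, by looking at the ORDERED characteristic of each stream's spliterator. This is a rough sketch only; GeneratedStreamOrder and isOrdered are illustrative names, and the seed and supplier are arbitrary.

```java
import java.util.Spliterator;
import java.util.stream.Stream;

// Hypothetical helper class, not part of any library.
class GeneratedStreamOrder {

    // Obtaining the spliterator does not traverse the stream, so this is safe on infinite streams.
    public static boolean isOrdered(Stream<?> stream) {
        return stream.spliterator().hasCharacteristics(Spliterator.ORDERED);
    }

    public static void main(String[] args) {
        System.out.println("iterate:  " + isOrdered(Stream.iterate(1, x -> x * 2))); // true
        System.out.println("generate: " + isOrdered(Stream.generate(() -> 1)));      // false
    }
}
```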

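So if the list in your question is indeed a java.util.List, its stream() method will return an ordered stream and filter will not change the ordering. So if you call list.stream().filter().forEachOrdered(), all elements will be processed sequentially in order, whereas for list.parallelStream().filter().forEachOrdered() the elements might be processed in parallel (e.g. by the filter) but the terminal action will still be called in order (which obviously will reduce the benefit of parallel execution).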
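To make the parallelStream().filter().forEachOrdered() case concrete, here is a minimal sketch. The class name, the method name and the even-number predicate are assumptions chosen for the example, not anything from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical example class, not part of any library.
class OrderedTerminalAction {

    // The filter may run on several threads, but forEachOrdered hands the
    // surviving elements to the action in the encounter order of the list.
    public static List<Integer> evensInEncounterOrder(List<Integer> input) {
        // synchronizedList is a defensive choice for a list touched by worker threads.
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        input.parallelStream()
             .filter(n -> n % 2 == 0)
             .forEachOrdered(seen::add);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(evensInEncounterOrder(input)); // [2, 4, 6, 8, 10]
    }
}
```

Replacing forEachOrdered with forEach in this sketch would remove the ordering guarantee, so the printed order could then differ between runs.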

If you, for example, use an operation like
List<…> result=inputList.parallelStream().map(…).filter(…).collect(Collectors.toList());

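the entire operation might benefit from parallel execution, but the resulting list will always be in the right order, regardless of whether you use a parallel or a sequential stream.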
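A runnable variant of that pipeline might look like the sketch below. The concrete map and filter functions (squaring, keeping odd values) and the names OrderedCollect and squaresOfOdds are placeholders picked for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical example class, not part of any library.
class OrderedCollect {

    // Runs the same map/filter/collect pipeline either sequentially or in parallel.
    public static List<Integer> squaresOfOdds(List<Integer> input, boolean parallel) {
        Stream<Integer> stream = parallel ? input.parallelStream() : input.stream();
        return stream.map(n -> n * n)
                     .filter(n -> n % 2 != 0)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = IntStream.rangeClosed(1, 1000).boxed().collect(Collectors.toList());
        List<Integer> sequential = squaresOfOdds(input, false);
        List<Integer> parallel = squaresOfOdds(input, true);
        // Both lists hold the same elements in the same (encounter) order.
        System.out.println(sequential.equals(parallel)); // true
    }
}
```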

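In a nutshell:

Ordering depends on the source data structure and the intermediate stream operations. Assuming you are using a List, the processing should be ordered (since filter won't change the sequence here).

More details:

Sequential vs. parallel vs. unordered: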

Javadocs

S sequential()
Returns an equivalent stream that is sequential. May return itself, either because the stream was already sequential, or because the underlying stream state was modified to be sequential.
This is an intermediate operation.
S parallel()
Returns an equivalent stream that is parallel. May return itself, either because the stream was already parallel, or because the underlying stream state was modified to be parallel.
This is an intermediate operation.
S unordered()
Returns an equivalent stream that is unordered. May return itself, either because the stream was already unordered, or because the underlying stream state was modified to be unordered.
This is an intermediate operation.
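As a small illustration of sequential() and parallel(): both only switch the execution mode of the pipeline, which isParallel() reports, and the mode in effect when the terminal operation runs applies to the whole pipeline. The class and method names below are invented for this sketch.

```java
import java.util.Arrays;
import java.util.function.UnaryOperator;
import java.util.stream.Stream;

// Hypothetical example class, not part of any library.
class ExecutionModeFlag {

    // Applies the given pipeline steps to a fresh List stream and reports the resulting mode.
    public static boolean isParallelAfter(UnaryOperator<Stream<Integer>> steps) {
        return steps.apply(Arrays.asList(1, 2, 3).stream()).isParallel();
    }

    public static void main(String[] args) {
        System.out.println(isParallelAfter(s -> s));                            // false
        System.out.println(isParallelAfter(s -> s.parallel()));                 // true
        // The most recent mode setting wins for the entire pipeline.
        System.out.println(isParallelAfter(s -> s.parallel().sequential()));    // false
    }
}
```

None of these calls changes whether the stream is ordered; that is a separate property.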

Stream ordering:

Javadocs

Streams may or may not have a defined encounter order. Whether or not a stream has an encounter order depends on the source and the intermediate operations. Certain stream sources (such as List or arrays) are intrinsically ordered, whereas others (such as HashSet) are not. Some intermediate operations, such as sorted(), may impose an encounter order on an otherwise unordered stream, and others may render an ordered stream unordered, such as BaseStream.unordered(). Further, some terminal operations may ignore encounter order, such as forEach().

If a stream is ordered, most operations are constrained to operate on the elements in their encounter order; if the source of a stream is a List containing [1, 2, 3], then the result of executing map(x -> x*2) must be [2, 4, 6]. However, if the source has no defined encounter order, then any permutation of the values [2, 4, 6] would be a valid result.
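The [1, 2, 3] example from the quoted documentation can be tried out roughly like this. EncounterOrderMap and doubled are names made up for the sketch.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class, not part of any library.
class EncounterOrderMap {

    // Doubles every element; the order of the result follows the source's encounter order, if any.
    public static List<Integer> doubled(Collection<Integer> source) {
        return source.stream().map(x -> x * 2).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // List source: the result must be [2, 4, 6].
        System.out.println(doubled(Arrays.asList(1, 2, 3)));
        // HashSet source: any permutation of 2, 4, 6 would be a valid result.
        System.out.println(doubled(new HashSet<>(Arrays.asList(1, 2, 3))));
    }
}
```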

For sequential streams, the presence or absence of an encounter order does not affect performance, only determinism. If a stream is ordered, repeated execution of identical stream pipelines on an identical source will produce an identical result; if it is not ordered, repeated execution might produce different results.

For parallel streams, relaxing the ordering constraint can sometimes enable more efficient execution. Certain aggregate operations, such as filtering duplicates (distinct()) or grouped reductions (Collectors.groupingBy()) can be implemented more efficiently if ordering of elements is not relevant. Similarly, operations that are intrinsically tied to encounter order, such as limit(), may require buffering to ensure proper ordering, undermining the benefit of parallelism. In cases where the stream has an encounter order, but the user does not particularly care about that encounter order, explicitly de-ordering the stream with unordered() may improve parallel performance for some stateful or terminal operations. However, most stream pipelines, such as the "sum of weight of blocks" example above, still parallelize efficiently even under ordering constraints.
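Explicitly de-ordering before a stateful operation such as distinct() could look like the sketch below. Whether it actually runs faster depends on the data and the JDK, so treat the performance benefit as an assumption to measure; the names are illustrative only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class, not part of any library.
class UnorderedDistinct {

    // Drops the ordering constraint before distinct(); the result contains each
    // value once, but its order is no longer tied to the input order.
    public static List<Integer> distinctAnyOrder(List<Integer> input) {
        return input.parallelStream()
                    .unordered()
                    .distinct()
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Prints 1, 2 and 3 exactly once each, in no guaranteed order.
        System.out.println(distinctAnyOrder(Arrays.asList(3, 1, 3, 2, 1, 2)));
    }
}
```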