Why doesn't Collection<T> Implement Stream<T>?

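This is a question about API design. When extension methods were added to C#, IEnumerable gained all the methods that make it possible to use lambda expressions directly on every collection.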

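With the arrival of lambdas and default methods in Java, I expected Collection to implement Stream and provide default implementations for all of its methods. That way, we would not need to call stream() to use the functionality it offers.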

What was the library designers' reason for choosing the less convenient approach?

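My guess is that it was done to avoid breaking existing code that implements Collection. It would be hard to provide default implementations that work correctly with every existing implementation.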
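To make the idea in the question concrete, here is a minimal sketch of what "Collection with stream methods as defaults" might look like. `FilterableCollection`, `FilterableList`, and `nonEmpty` are hypothetical names invented for this illustration; nothing like them exists in the JDK, and the sketch forwards only a single operation.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DefaultMethodSketch {

    // Hypothetical interface, NOT part of the JDK: a default method forwards
    // a stream operation, so implementors do not have to write it themselves.
    interface FilterableCollection<T> extends Collection<T> {
        default Stream<T> filter(Predicate<? super T> predicate) {
            return stream().filter(predicate);
        }
    }

    // An existing implementation only needs to declare the extra interface.
    static class FilterableList<T> extends ArrayList<T>
            implements FilterableCollection<T> {
    }

    // Illustrative helper: calls filter directly, without an explicit stream().
    static List<String> nonEmpty(FilterableCollection<String> strings) {
        return strings.filter(s -> !s.isEmpty()).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        FilterableList<String> strings = new FilterableList<>();
        strings.add("a");
        strings.add("");
        strings.add("bc");
        System.out.println(nonEmpty(strings));
    }
}
```

Note that any pre-existing class that already declared its own `filter` method with an incompatible signature could not adopt such an interface unchanged, which is one way the compatibility concern shows up.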

From Maurice Naftalin's Lambda FAQ:

Why are Stream operations not defined directly on Collection?

Early drafts of the API exposed methods like filter, map, and reduce on Collection or Iterable. However, user experience with this design led to a more formal separation of the “stream” methods into their own abstraction. Reasons included:

  • Methods on Collection such as removeAll make in-place modifications, in contrast to the new methods which are more functional in nature. Mixing two different kinds of methods on the same abstraction forces the user to keep track of which are which. For example, given the declaration

    Collection<String> strings;
    

    the two very similar-looking method calls

    strings.removeAll(s -> s.length() == 0);
    strings.filter(s -> s.length() == 0);          // not supported in the current API
    

    would have surprisingly different results; the first would remove all empty String objects from the collection, whereas the second would return a stream containing all the non-empty Strings, while having no effect on the collection.

    Instead, the current design ensures that only an explicitly-obtained stream can be filtered:

    strings.stream().filter(s -> s.length() == 0)...;
    

    where the ellipsis represents further stream operations, ending with a terminating operation. This gives the reader a much clearer intuition about the action of filter;

  • With lazy methods added to Collection, users were confused by a perceived—but erroneous—need to reason about whether the collection was in “lazy mode” or “eager mode”. Rather than burdening Collection with new and different functionality, it is cleaner to provide a Stream view with the new functionality;

  • The more methods added to Collection, the greater the chance of name collisions with existing third-party implementations. By only adding a few methods (stream, parallel) the chance for conflict is greatly reduced;

  • A view transformation is still needed to access a parallel view; the asymmetry between the sequential and the parallel stream views was unnatural. Compare, for example

    coll.filter(...).map(...).reduce(...);
    

    with

    coll.parallel().filter(...).map(...).reduce(...);
    

    This asymmetry would be particularly obvious in the API documentation, where Collection would have many new methods to produce sequential streams, but only one to produce parallel streams, which would then have all the same methods as Collection. Factoring these into a separate interface, StreamOps say, would not help; that would still, counterintuitively, need to be implemented by both Stream and Collection;

  • A uniform treatment of views also leaves room for other additional views in the future.
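The first two points of the FAQ can be tried out against the released Java 8 API. The sketch below is only an illustration: it uses `Collection.removeIf` (the predicate-taking method that actually shipped) for the in-place case, and the class and helper names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class InPlaceVersusStream {

    // Functional style: builds a new list and leaves the source untouched.
    static List<String> nonEmptyCopy(List<String> source) {
        return source.stream()
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    // In-place style: mutates the list that was passed in.
    static void removeEmptyInPlace(List<String> target) {
        target.removeIf(String::isEmpty);
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>(Arrays.asList("a", "", "bc", ""));

        List<String> filtered = nonEmptyCopy(strings);
        System.out.println(filtered + " from " + strings.size() + " elements");

        // Intermediate operations are lazy: peek has not run yet.
        List<String> seen = new ArrayList<>();
        Stream<String> pipeline = strings.stream().peek(seen::add);
        System.out.println("seen before terminal operation: " + seen.size());
        pipeline.forEach(s -> { });
        System.out.println("seen after terminal operation: " + seen.size());

        // Only now is the original collection changed.
        removeEmptyInPlace(strings);
        System.out.println(strings);
    }
}
```

Because the stream is obtained explicitly, the reader can tell at the call site which of the two behaviours to expect.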

First of all, from the documentation of Stream:

Collections and streams, while bearing some superficial similarities, have different goals. Collections are primarily concerned with the efficient management of, and access to, their elements. By contrast, streams do not provide a means to directly access or manipulate their elements, and are instead concerned with declaratively describing their source and the computational operations which will be performed in aggregate on that source.

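So you want to keep the concepts of stream and collection apart. If Collection implemented Stream, every collection would be a stream, which conceptually it is not. The way it is done now, every collection can give you a stream that works on that collection, and if you think about it, that is something different.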
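One way to see the difference is that a stream is a single-use view, while the collection behind it can hand out as many streams as you like. The following is a small sketch with illustrative names; it relies on the JDK's stream implementation throwing IllegalStateException when it detects that a stream is being reused.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class StreamIsAView {

    // Returns true if consuming the same stream a second time is rejected.
    static boolean secondUseFails(List<String> source) {
        Stream<String> stream = source.stream();
        stream.forEach(s -> { });          // first terminal operation consumes it
        try {
            stream.forEach(s -> { });      // second use of the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Counts the elements via a fresh stream each time it is called.
    static long countViaStream(List<String> source) {
        return source.stream().count();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("x", "y", "z");

        // The collection can be asked for a new stream as often as needed.
        System.out.println(countViaStream(names));
        System.out.println(countViaStream(names));

        // A single stream, however, cannot be traversed twice.
        System.out.println(secondUseFails(names));
    }
}
```

If Collection were itself a Stream, it is hard to say what "already consumed" would mean for a list you can still iterate over.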

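Another factor that comes to mind is cohesion/coupling, as well as encapsulation. If every class that implements Collection also had to implement the operations of Stream, it would serve two (kind of) different purposes and might become too long.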

  1. Collection is an object model
  2. Stream is a subject model

Collection definition in the docs:

A collection represents a group of objects, known as its elements.

Stream definition in the docs:

A sequence of elements supporting sequential and parallel aggregate operations

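Seen this way, a stream is a specific kind of collection, and not the other way around. Therefore Collection should not implement Stream, regardless of backward compatibility.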

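So why doesn't Stream<T> implement Collection<T>? Because it is another way of looking at a bunch of objects: not as a group of elements, but in terms of the operations you can perform on them. That is why I say Collection is an object model while Stream is a subject model.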