Java stream - find most frequent element based on a specific field

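I have a list of Person objects, and I want to find the most common name in the list together with its frequency, using only Java streams. (In case of a tie, return any of the tied results.)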

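Currently, my solution uses groupingBy with counting, and then finds the max element in the resulting map. It therefore makes two passes over the data (first the list, then the map). Is it possible to make this more efficient and more readable?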

Person p1 = Person.builder().id("p1").name("Alice").age(1).build();
Person p2 = Person.builder().id("p2").name("Bob").age(2).build();
Person p3 = Person.builder().id("p3").name("Charlie").age(3).build();
Person p4 = Person.builder().id("p4").name("Alice").age(4).build();
List<Person> people = ImmutableList.of(p1, p2, p3, p4);

Map.Entry<String, Long> mostCommonName = people
        .stream()
        .collect(collectingAndThen(groupingBy(Person::getName, counting()),
                map -> map.entrySet().stream().max(Map.Entry.comparingByValue()).orElse(null)
        ));

System.out.println(mostCommonName); // Alice=2
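
For reference, the two passes can be seen in isolation in the following self-contained sketch. It makes some assumptions that are not in the snippet above: a Java 16+ record stands in for the Lombok-style builder, `List.of` replaces Guava's `ImmutableList`, the class name `TwoPassMode` is made up, and the result is an `Optional` rather than `null` for an empty list.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class TwoPassMode {
    // Hypothetical stand-in for the builder-based Person used in the question.
    record Person(String id, String name, int age) {}

    // Pass 1 builds a name -> count map; pass 2 scans the map entries for the largest count.
    static Optional<Map.Entry<String, Long>> mostCommonName(List<Person> people) {
        return people.stream()
                .collect(Collectors.groupingBy(Person::name, Collectors.counting()))
                .entrySet().stream()
                .max(Map.Entry.comparingByValue());
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person("p1", "Alice", 1),
                new Person("p2", "Bob", 2),
                new Person("p3", "Charlie", 3),
                new Person("p4", "Alice", 4));
        System.out.println(mostCommonName(people)); // Optional[Alice=2]
    }
}
```

The second pass only visits distinct names, so it is usually much shorter than the first.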

With a plain loop you can use Map::merge, which immediately returns the updated frequency value, to compress the two passes into one:

String mostCommonName = null;
int maxFreq = 0;
Map<String, Integer> freq = new HashMap<>();

for (Person p : people) {
    if (freq.merge(p.getName(), 1, Integer::sum) > maxFreq) {
        maxFreq = freq.get(p.getName());
        mostCommonName = p.getName();
    }
}

System.out.printf("Most common name '%s' occurred %d times.%n", mostCommonName, maxFreq);
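
The same loop could be wrapped in a reusable helper, roughly as sketched below. The class name `SinglePassMode` and the method `mostFrequent` are hypothetical, and the helper works on any element type rather than on Person specifically. It keeps the value returned by `merge` in a local variable, so no second map lookup is needed.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class SinglePassMode {
    // Returns the most frequent item with its count, or an empty Optional for empty input.
    static <T> Optional<Map.Entry<T, Integer>> mostFrequent(Iterable<T> items) {
        Map<T, Integer> freq = new HashMap<>();
        T best = null;
        int bestCount = 0;
        for (T item : items) {
            // merge stores 1 for a new key, otherwise adds 1, and returns the updated count.
            int count = freq.merge(item, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = item;
            }
        }
        if (bestCount == 0) {
            return Optional.empty();
        }
        Map.Entry<T, Integer> entry = new SimpleImmutableEntry<>(best, bestCount);
        return Optional.of(entry);
    }

    public static void main(String[] args) {
        List<String> names = List.of("Alice", "Bob", "Charlie", "Alice");
        System.out.println(mostFrequent(names)); // Optional[Alice=2]
    }
}
```

Counts only ever grow, and every update is compared against the running maximum, so the maximum seen at the end of the loop is the overall maximum.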

If you insist on using only streams, your best option is probably a custom collector that holds the information needed to aggregate in a single pass:

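import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

// A wildcard is not allowed as a type argument of a supertype, so the accumulator type is named explicitly.
class MaxNameFinder implements Collector<Person, MaxNameFinder.Accumulator, String> {
    // Static, so an Accumulator does not need an enclosing MaxNameFinder instance.
    public static class Accumulator {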
        private final Map<String,Integer> nameFrequency = new HashMap<>();
        private int modeFrequency = 0;
        private String modeName = null;

        public String getModeName() {
            return modeName;
        }

        public void accept(Person person) {
            int currentFrequency = nameFrequency.merge(person.getName(), 1, Integer::sum);
            if (currentFrequency > modeFrequency) {
                modeName = person.getName();
                modeFrequency = currentFrequency;
            }
        }

        public Accumulator combine(Accumulator other) {
            // Any name whose count grows during the merge may become the new mode,
            // not just the two previous modes, so every merged entry is checked.
            other.nameFrequency.forEach((name, count) -> {
                int merged = nameFrequency.merge(name, count, Integer::sum);
                if (merged > modeFrequency) {
                    modeName = name;
                    modeFrequency = merged;
                }
            });
            return this;
        }

    }

    public BiConsumer<Accumulator, Person> accumulator() {
        return Accumulator::accept;
    }

    public Set<Collector.Characteristics> characteristics() {
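        // Not CONCURRENT: the accumulator wraps a plain HashMap, which is not thread-safe.
        // UNORDERED is safe because any of the tied names may be returned.
        return Set.of(Collector.Characteristics.UNORDERED);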
    }

    public BinaryOperator<Accumulator> combiner() {
        return Accumulator::combine;
    }

    public Function<Accumulator,String> finisher() {
        return Accumulator::getModeName;
    }

    public Supplier<Accumulator> supplier() {
        return Accumulator::new;
    }
}

Usage:

people.stream().collect(new MaxNameFinder())

This returns a String holding the most common name (or null if the stream is empty).
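
As a more compact variant, the same idea could be expressed with the JDK's `Collector.of` factory instead of a full `Collector` implementation. This is only a sketch under assumptions: the names `ModeCollector`, `mode`, and `State` are made up for illustration, the collector is generic over a key-extracting function rather than tied to Person, and a one-field record stands in for Person in the demo.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collector;

public class ModeCollector {
    // Hypothetical minimal Person, used only by the demo in main.
    record Person(String name) {}

    // Mutable partial result: per-key counts plus the running mode.
    private static final class State<K> {
        final Map<K, Integer> counts = new HashMap<>();
        K mode = null;
        int modeCount = 0;

        void add(K key) {
            consider(key, counts.merge(key, 1, Integer::sum));
        }

        State<K> merge(State<K> other) {
            // Every key whose count grows is re-checked against the running maximum.
            other.counts.forEach((k, c) -> consider(k, counts.merge(k, c, Integer::sum)));
            return this;
        }

        private void consider(K key, int count) {
            if (count > modeCount) {
                mode = key;
                modeCount = count;
            }
        }
    }

    // Collects the most frequent key, or null when the stream is empty.
    static <T, K> Collector<T, ?, K> mode(Function<? super T, ? extends K> classifier) {
        return Collector.<T, State<K>, K>of(
                State::new,
                (s, t) -> s.add(classifier.apply(t)),
                State::merge,
                s -> s.mode,
                Collector.Characteristics.UNORDERED);
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person("Alice"), new Person("Bob"),
                new Person("Charlie"), new Person("Alice"));
        String name = people.parallelStream().collect(mode(Person::name));
        System.out.println(name); // Alice
    }
}
```

Each partial result has its own HashMap, so the collector works with parallel streams without being declared CONCURRENT. The combiner is only invoked when the stream is actually split.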