How to preserve all subgroups while applying a nested groupingBy collector

Question

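I am trying to group a list of employees by gender and department.

How can I make sure that all departments are included, in sorted order, for each gender, even when the count for that gender is zero?

Currently, I have the following code and output: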

employeeRepository.findAll().stream()
            .collect(Collectors.groupingBy(Employee::getGender, 
                        Collectors.groupingBy(Employee::getDepartment, 
                                              Collectors.counting())));

//output
//{MALE={HR=1, IT=1}, FEMALE={MGMT=1}}

The preferred output is:

{MALE={HR=1, IT=1, MGMT=0}, FEMALE={HR=0, IT=0, MGMT=1}}

Answer 1

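To achieve this, you first have to group by department, and only then by gender, not the other way around.

The first collector, groupingBy(Employee::getDepartment, downstream), splits the data set into groups by department. Its downstream collector, partitioningBy(employee -> employee.getGender() == Employee.Gender.MALE, downstream), then divides the data mapped to each department into two partitions by employee gender. Unlike groupingBy, partitioningBy always produces entries for both true and false, so a gender with no employees in a department still gets a count of 0. Finally, Collectors.counting(), applied as the innermost downstream, yields the number of employees per gender for each department.

So the intermediate map produced by the collect() operation has the type Map<String, Map<Boolean, Long>>: the employee count by gender (Boolean) for each department (for simplicity, a department is a plain string).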
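The first step can be sketched in isolation as follows. This is a minimal illustration, not the answer's full solution: the Emp record and the countByDepartment method are hypothetical stand-ins for the Employee class, with gender reduced to a boolean flag.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionDemo {
    // Hypothetical stand-in for Employee: a department plus a "male" flag.
    record Emp(String department, boolean male) {}

    // Department -> (isMale -> employee count).
    static Map<String, Map<Boolean, Long>> countByDepartment(List<Emp> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(Emp::department,
                        Collectors.partitioningBy(Emp::male, Collectors.counting())));
    }

    public static void main(String[] args) {
        Map<String, Map<Boolean, Long>> counts = countByDepartment(List.of(
                new Emp("IT", true), new Emp("HR", true), new Emp("MGMT", false)));
        // Every inner map holds both keys, so the absent gender appears with a count of 0.
        System.out.println(counts.get("MGMT"));
    }
}
```

Because each inner map contains both keys, the zero counts already exist before the data is regrouped by gender.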

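The next step is to transform this map into Map<Employee.Gender, Map<String, Long>>: the employee count by department for each gender.

My approach is to create a stream over the entry set and replace each entry with a new one whose key is the gender. To preserve the department information, its value is in turn an entry with the department as its key and the count for that department as its value.

Then collect the stream of entries with groupingBy on the entry key. Apply mapping as a downstream collector to extract the nested entry, and then apply Collectors.toMap() to collect the entries of type Map.Entry<String, Long> into a map.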

all departments are included in a sorted order

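To guarantee the order in the nested map (count by department), a NavigableMap should be used.

To do that, you need the flavor of toMap() that expects a mapFactory. It also requires a mergeFunction, which is not really useful for this task because there will be no duplicate keys, but it has to be provided as well.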
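As a small illustration of that four-argument toMap() overload, the sketch below collects a few department/count entries into a TreeMap. The class and method names (SortedToMapDemo, toSortedMap) are made up for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.stream.Collectors;

class SortedToMapDemo {
    // Collects entries into a map whose keys are kept in natural (alphabetical) order.
    static NavigableMap<String, Long> toSortedMap(List<Map.Entry<String, Long>> entries) {
        return entries.stream()
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        (v1, v2) -> v1,   // merge function: only invoked for duplicate keys
                        TreeMap::new));   // map factory: supplies the sorted map
    }

    public static void main(String[] args) {
        System.out.println(toSortedMap(List.of(
                Map.entry("MGMT", 1L), Map.entry("HR", 0L), Map.entry("IT", 0L))));
    }
}
```

The entries arrive in arbitrary order, and the TreeMap supplied by the factory is what sorts the departments.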

public static void main(String[] args) {
    List<Employee> employeeRepository = 
            List.of(new Employee("IT", Employee.Gender.MALE),
                    new Employee("HR", Employee.Gender.MALE),
                    new Employee("MGMT", Employee.Gender.FEMALE));

    Map<Employee.Gender, NavigableMap<String, Long>> departmentCountByGender = employeeRepository
            .stream()
            .collect(Collectors.groupingBy(Employee::getDepartment, // Map<String, Map<Boolean, Long>> - department to *employee count* by gender
                        Collectors.partitioningBy(employee -> employee.getGender() == Employee.Gender.MALE,
                                                  Collectors.counting())))
            .entrySet().stream()
            .flatMap(entryDep -> entryDep.getValue().entrySet().stream()
                    .map(entryGen -> Map.entry(entryGen.getKey() ? Employee.Gender.MALE : Employee.Gender.FEMALE,
                                               Map.entry(entryDep.getKey(), entryGen.getValue()))))
            .collect(Collectors.groupingBy(Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue,
                                Collectors.toMap(Map.Entry::getKey,
                                                 Map.Entry::getValue,
                                                 (v1, v2) -> v1,
                                                 TreeMap::new))));

    System.out.println(departmentCountByGender);
}

A dummy Employee class used for demo purposes:

class Employee {
    enum Gender {FEMALE, MALE};

    private String department;
    private Gender gender;
    // etc.
    
    // constructor, getters
}

Output:

{FEMALE={HR=0, IT=0, MGMT=1}, MALE={HR=1, IT=1, MGMT=0}}

Answer 2

You can continue processing the result of your code:

// distinct() keeps each department once, even when several employees share it
List<String> deptList = employees.stream().map(Employee::getDepartment).distinct().sorted().toList();

Map<Gender, Map<String, Long>> tmpResult = employees.stream()
        .collect(Collectors.groupingBy(Employee::getGender, Collectors.groupingBy(Employee::getDepartment, Collectors.counting())));

Map<Gender, Map<String, Long>> finalResult = new HashMap<>();

for (Map.Entry<Gender, Map<String, Long>> entry : tmpResult.entrySet()) {
    Map<String, Long> val = new LinkedHashMap<>();
    for (String dept : deptList) {
        val.put(dept, entry.getValue().getOrDefault(dept, 0L));
    }

    finalResult.put(entry.getKey(), val);
}

System.out.print(finalResult);
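For reference, the same post-processing idea can be written as a self-contained sketch. The names here (ZeroFillDemo, Emp, countWithZeros) are hypothetical, and an EnumMap is swapped in for the HashMap above purely so that the genders print in a predictable order.

```java
import java.util.EnumMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ZeroFillDemo {
    enum Gender {FEMALE, MALE}

    // Hypothetical stand-in for the Employee class.
    record Emp(String department, Gender gender) {}

    static Map<Gender, Map<String, Long>> countWithZeros(List<Emp> employees) {
        // Every department that occurs anywhere, once each, in sorted order.
        List<String> departments = employees.stream()
                .map(Emp::department).distinct().sorted().toList();

        // The nested grouping from the question: gender -> department -> count.
        Map<Gender, Map<String, Long>> counts = employees.stream()
                .collect(Collectors.groupingBy(Emp::gender,
                        Collectors.groupingBy(Emp::department, Collectors.counting())));

        // Rebuild each inner map over the full department list, defaulting to 0.
        Map<Gender, Map<String, Long>> result = new EnumMap<>(Gender.class);
        counts.forEach((gender, byDept) -> {
            Map<String, Long> filled = new LinkedHashMap<>();
            for (String dept : departments) {
                filled.put(dept, byDept.getOrDefault(dept, 0L));
            }
            result.put(gender, filled);
        });
        return result;
    }
}
```

As with the loop above, a gender that has no employees at all does not appear in the result, because only the keys of the grouped map are visited.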

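Trying to achieve this in a single line of code would probably hurt both readability and maintainability.

However, if you don't mind using a third-party library, there is another option: abacus-common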

Map<Gender, Map<String, Integer>> result = Stream.of(employees)
        .groupByToEntry(Employee::getGender, MoreCollectors.countingIntBy(Employee::getDepartment)) // step 1) group by gender 
        .mapValue(it -> Maps.newMap(deptList, Fn.identity(), dept -> it.getOrDefault(dept, 0), IntFunctions.ofLinkedHashMap())) // step 2) process the value.
        .toMap();

Disclaimer: I am the developer of abacus-common.
