Java: get duplicate elements from more than 2 lists

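I have 9 lists, and I want to compare all of them and get the duplicate elements.

I tried the retainAll() method, but it removes the elements that are not duplicated.

For example, when I compare only SUPPORT_NAMES and PROJECT_AND_SUPPORT_NAMES, I get the duplicate values.

But when I compare the lists as shown below, the result is empty.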

        MANAGEMENT_NAMES.retainAll(SUPPORT_NAMES);
        MANAGEMENT_NAMES.retainAll(PROJECT_AND_SUPPORT_NAMES);
        MANAGEMENT_NAMES.retainAll(SALES_NAMES);
        MANAGEMENT_NAMES.retainAll(MARKETING_NAMES);
        MANAGEMENT_NAMES.retainAll(ACADEMY_NAMES);
        MANAGEMENT_NAMES.retainAll(DEVELOPMENT_NAMES);
        MANAGEMENT_NAMES.retainAll(HR_AND_ADMINISTRATION_NAMES);
        MANAGEMENT_NAMES.retainAll(WAREHOUSE_NAMES);

        SUPPORT_NAMES.retainAll(PROJECT_AND_SUPPORT_NAMES);
        SUPPORT_NAMES.retainAll(SALES_NAMES);
        SUPPORT_NAMES.retainAll(MARKETING_NAMES);
        SUPPORT_NAMES.retainAll(ACADEMY_NAMES);
        SUPPORT_NAMES.retainAll(DEVELOPMENT_NAMES);
        SUPPORT_NAMES.retainAll(HR_AND_ADMINISTRATION_NAMES);
        SUPPORT_NAMES.retainAll(WAREHOUSE_NAMES);

        PROJECT_AND_SUPPORT_NAMES.retainAll(SALES_NAMES);
        PROJECT_AND_SUPPORT_NAMES.retainAll(MARKETING_NAMES);
        PROJECT_AND_SUPPORT_NAMES.retainAll(ACADEMY_NAMES);
        PROJECT_AND_SUPPORT_NAMES.retainAll(DEVELOPMENT_NAMES);
        PROJECT_AND_SUPPORT_NAMES.retainAll(HR_AND_ADMINISTRATION_NAMES);
        PROJECT_AND_SUPPORT_NAMES.retainAll(WAREHOUSE_NAMES);
.
.
.

        HR_AND_ADMINISTRATION_NAMES.retainAll(WAREHOUSE_NAMES);
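
For context, retainAll() keeps only the elements that are also contained in its argument and modifies the list it is called on, so chaining it computes the intersection of all the lists. The sketch below illustrates this with made-up sample data; the class name RetainAllDemo, the helper commonToAll, and the sample names are assumptions for illustration, not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RetainAllDemo {
    // Intersects copies of the lists, so the original lists are not modified.
    static List<String> commonToAll(List<List<String>> lists) {
        if (lists.isEmpty()) {
            return new ArrayList<>();
        }
        List<String> result = new ArrayList<>(lists.get(0));
        for (List<String> other : lists.subList(1, lists.size())) {
            result.retainAll(other); // keeps only elements also present in 'other'
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample data
        List<String> a = Arrays.asList("Jerry", "Stephen");
        List<String> b = Arrays.asList("Chris", "Jerry");
        List<String> c = Arrays.asList("Chris", "Earl");

        System.out.println(commonToAll(Arrays.asList(a, b)));    // [Jerry]
        System.out.println(commonToAll(Arrays.asList(a, b, c))); // []
    }
}
```

A name survives only if it is present in every list, which is why a name shared by just two or three departments disappears once all lists are chained.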

I would like to get a result like this:

Department A - Jerry

Department B - Chris

Department C - Jerry

Department D - Chris

Department E - Chris

Department F - Jerry

You said that this is what you mean by a duplicate:

"elements that appear more than once in any one of the lists"

Here is a simple solution.

Set<String> duplicates = new HashSet<>();

for (List<String> list: lists) {
    Set<String> all = new HashSet<>();
    for (String s: list) {
        if (!all.add(s)) {
            duplicates.add(s);
        }
    }
}

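The logic is that a string is added to duplicates only if it is already in all. (The add method returns true if the element was not already in the set. Check the javadocs.)

Since you want elements that are duplicated within a given list, we need to reset all for each list.

This could be coded in other ways (for example, using Java 8+ streams), but in this case the most important thing is that you understand the code.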
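
As one possible take on the streams variant mentioned above, the following sketch counts the occurrences within each list separately and keeps the names that occur more than once in at least one list. The class name PerListDuplicates, the method name duplicatesWithinLists, and the sample data are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

public class PerListDuplicates {
    // Returns the elements that occur more than once within at least one of the lists.
    static Set<String> duplicatesWithinLists(List<List<String>> lists) {
        return lists.stream()
                // counts are computed per list, which mirrors resetting 'all' in the loop version
                .flatMap(list -> list.stream()
                        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                        .entrySet().stream()
                        .filter(entry -> entry.getValue() > 1)
                        .map(Map.Entry::getKey))
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // Hypothetical sample data: "Jerry" is repeated inside the first list,
        // "Chris" appears once in each list and is therefore not reported.
        List<List<String>> lists = Arrays.asList(
                Arrays.asList("Jerry", "Chris", "Jerry"),
                Arrays.asList("Chris", "Earl"));

        System.out.println(duplicatesWithinLists(lists)); // [Jerry]
    }
}
```

The loop version is arguably easier to read; the stream version mainly avoids the mutable helper set.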

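If I understood correctly, a duplicate is a name that appears more than once across all departments, and you want the duplicated names for each department separately.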

I want to get result like :

Department A - Jerry Department B - Chris

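So my idea is to first create a Set of all duplicates across all departments, and then use that Set to create a list of duplicates for each department separately.

This solution has linear time complexity: each name is processed a constant number of times, and hash-based lookups take constant time on average.

The method getDuplicatesForAllDepartments() iterates over all names in all departments and counts the occurrences of each name. It keeps only the names that occur more than once and saves them to a Set.

The method getDuplicatesForOneDepartment() determines which names in the given department are contained in the Set of duplicates.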

    public static void main(String[] args) {
        List<List<String>> departments = List.of(SUPPORT_NAMES, PROJECT_AND_SUPPORT_NAMES, SALES_NAMES, ... etc);
        // List.of() works with Java 9 onwards.
        // For Java 8 you can add the departments using Collections.addAll():
        // List<List<String>> departments = new ArrayList<>();
        // Collections.addAll(departments, SUPPORT_NAMES, PROJECT_AND_SUPPORT_NAMES, SALES_NAMES, ... etc);

        Set<String> allDuplicates = getDuplicatesForAllDepartments(departments);
        List<String> duplicates1 = getDuplicatesForOneDepartment(SUPPORT_NAMES, allDuplicates);
        List<String> duplicatesProjectAndSupport = getDuplicatesForOneDepartment(PROJECT_AND_SUPPORT_NAMES, allDuplicates);
        // ... etc
    }

    public static Set<String> getDuplicatesForAllDepartments(List<List<String>> departments) {
        return departments.stream()
                .flatMap(List::stream)
                .collect(Collectors.groupingBy(Function.identity(),
                                               Collectors.counting()))
                .entrySet().stream()
                .filter(entry -> entry.getValue() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    }
    
    public static List<String> getDuplicatesForOneDepartment(List<String> department, Set<String> allDuplicates) {
        return department.stream()
                .filter(allDuplicates::contains)
                .collect(Collectors.toList());
    }

The imperative implementations of the methods getDuplicatesForAllDepartments() and getDuplicatesForOneDepartment() look like this:

    public static Set<String> getDuplicatesForAllDepartments(List<List<String>> departments) {
        Map<String, Integer> nameToCount = new HashMap<>();
        for (List<String> department: departments) {
            for (String name: department) {
                nameToCount.merge(name, 1, Integer::sum);
            }
        }

        Set<String> duplicates = new HashSet<>();
        for (Map.Entry<String, Integer> entry: nameToCount.entrySet()) {
            if (entry.getValue() > 1) {
                duplicates.add(entry.getKey());
            }
        }
        return duplicates;
    }

    public static List<String> getDuplicatesForOneDepartment(List<String> department, Set<String> allDuplicates) {
        List<String> duplicates = new ArrayList<>();
        for (String name: department) {
            if (allDuplicates.contains(name)) {
                duplicates.add(name);
            }
        }
        return duplicates;
    }

I used your example to demonstrate that both implementations work.

        List<String> departmentA = List.of("Jerry", "Stephen", "Daniel");
        // for Java 8 Arrays.asList("Jerry", "Stephen", "Daniel");
        List<String> departmentB = List.of("Chris", "Earl", "Ryan");
        List<String> departmentC = List.of("Jerry", "Brown", "Micheal");

        List<List<String>> departments = new ArrayList<>();
        Collections.addAll(departments, departmentA, departmentB, departmentC);
        System.out.println(getDuplicatesForAllDepartments(departments));

Output

[Jerry]
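
To get output in the "Department A - Jerry" shape asked for in the question, the two steps can be combined roughly as follows. This is only a sketch: the class name DepartmentDuplicates, the helpers duplicatesAcross and report, and the use of a LinkedHashMap keyed by a department label are assumptions, since the original lists do not carry their department names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DepartmentDuplicates {
    // Names that occur more than once across all departments combined.
    static Set<String> duplicatesAcross(Collection<List<String>> departments) {
        Map<String, Integer> nameToCount = new HashMap<>();
        for (List<String> department : departments) {
            for (String name : department) {
                nameToCount.merge(name, 1, Integer::sum);
            }
        }
        Set<String> duplicates = new HashSet<>();
        for (Map.Entry<String, Integer> entry : nameToCount.entrySet()) {
            if (entry.getValue() > 1) {
                duplicates.add(entry.getKey());
            }
        }
        return duplicates;
    }

    // Builds one "<department> - <name>" line per duplicated name, in insertion order.
    static List<String> report(Map<String, List<String>> byDepartment) {
        Set<String> duplicates = duplicatesAcross(byDepartment.values());
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : byDepartment.entrySet()) {
            for (String name : entry.getValue()) {
                if (duplicates.contains(name)) {
                    lines.add(entry.getKey() + " - " + name);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the departments in the order they were added.
        Map<String, List<String>> byDepartment = new LinkedHashMap<>();
        byDepartment.put("Department A", Arrays.asList("Jerry", "Stephen", "Daniel"));
        byDepartment.put("Department B", Arrays.asList("Chris", "Earl", "Ryan"));
        byDepartment.put("Department C", Arrays.asList("Jerry", "Brown", "Micheal"));

        report(byDepartment).forEach(System.out::println);
        // Department A - Jerry
        // Department C - Jerry
    }
}
```

As with the methods above, a name that is listed twice within a single department also counts as a duplicate here.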