Java: get duplicated elements across more than two lists
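I have 9 lists, and I want to compare all of them and get the duplicated elements.
I tried the retainAll() method, but it removes the elements that are not duplicated from the list it is called on.
For example, when I compare only SUPPORT_NAMES and PROJECT_AND_SUPPORT_NAMES, I get the duplicated values.
But when I compare all the lists as shown below, the result is empty.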
MANAGEMENT_NAMES.retainAll(SUPPORT_NAMES);
MANAGEMENT_NAMES.retainAll(PROJECT_AND_SUPPORT_NAMES);
MANAGEMENT_NAMES.retainAll(SALES_NAMES);
MANAGEMENT_NAMES.retainAll(MARKETING_NAMES);
MANAGEMENT_NAMES.retainAll(ACADEMY_NAMES);
MANAGEMENT_NAMES.retainAll(DEVELOPMENT_NAMES);
MANAGEMENT_NAMES.retainAll(HR_AND_ADMINISTRATION_NAMES);
MANAGEMENT_NAMES.retainAll(WAREHOUSE_NAMES);
SUPPORT_NAMES.retainAll(PROJECT_AND_SUPPORT_NAMES);
SUPPORT_NAMES.retainAll(SALES_NAMES);
SUPPORT_NAMES.retainAll(MARKETING_NAMES);
SUPPORT_NAMES.retainAll(ACADEMY_NAMES);
SUPPORT_NAMES.retainAll(DEVELOPMENT_NAMES);
SUPPORT_NAMES.retainAll(HR_AND_ADMINISTRATION_NAMES);
SUPPORT_NAMES.retainAll(WAREHOUSE_NAMES);
PROJECT_AND_SUPPORT_NAMES.retainAll(SALES_NAMES);
PROJECT_AND_SUPPORT_NAMES.retainAll(MARKETING_NAMES);
PROJECT_AND_SUPPORT_NAMES.retainAll(ACADEMY_NAMES);
PROJECT_AND_SUPPORT_NAMES.retainAll(DEVELOPMENT_NAMES);
PROJECT_AND_SUPPORT_NAMES.retainAll(HR_AND_ADMINISTRATION_NAMES);
PROJECT_AND_SUPPORT_NAMES.retainAll(WAREHOUSE_NAMES);
.
.
.
HR_AND_ADMINISTRATION_NAMES.retainAll(WAREHOUSE_NAMES);
I want to get a result like this:
Department A - Jerry
Department B - Chris
Department C - Jerry
Department D - Chris
Department E - Chris
Department F - Jerry
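A likely reason the chained calls come back empty: retainAll() modifies the list it is called on and keeps only the elements that are also in the argument, so repeating it against every other list leaves only the names present in all of them. The sketch below illustrates this on small sample lists; the class name RetainAllDemo and the helper intersectAll are made up for the illustration and are not part of the question's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class RetainAllDemo {

    // Applies retainAll with each of `others` in turn, the way the question does.
    // Works on a copy, so the caller's list is left untouched.
    public static List<String> intersectAll(List<String> first, List<List<String>> others) {
        List<String> result = new ArrayList<>(first);
        for (List<String> other : others) {
            result.retainAll(other); // keeps only elements that are also in `other`
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("Jerry", "Stephen", "Daniel");
        List<String> b = Arrays.asList("Chris", "Earl", "Ryan");
        List<String> c = Arrays.asList("Jerry", "Brown", "Micheal");

        // Two lists that share a name: the shared name survives.
        System.out.println(intersectAll(a, Collections.singletonList(c))); // [Jerry]

        // Add one list that lacks "Jerry": nothing survives.
        System.out.println(intersectAll(a, Arrays.asList(c, b)));          // []
    }
}
```

With nine lists, a name would have to appear in every single department to survive, which is probably not what "duplicate" means here.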
You said that this is what you mean by "duplicate":
"elements that appear more than once in any one of the lists"
Here is a simple solution.
Set<String> duplicates = new HashSet<>();
for (List<String> list : lists) {
    Set<String> all = new HashSet<>();
    for (String s : list) {
        if (!all.add(s)) {
            duplicates.add(s);
        }
    }
}
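The logic is that a string is added to duplicates only if it is already in all. (The add method returns true if the element was not already in the set, and false otherwise. Check the Javadoc for Set.add.)
Since you want elements that are repeated within a given list, all has to be reset for each list.
This could be coded in other ways (for example, using Java 8+ streams), but in this situation the most important thing is that you understand the code.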
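For comparison, here is one way the same per-list check might look with Java 8+ streams. This is only a sketch: the class name WithinListDuplicates and the method findDuplicates are invented for the example, and it counts occurrences per list with groupingBy rather than using the add() trick from the loop above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WithinListDuplicates {

    // Collects every element that occurs more than once inside any single list.
    public static Set<String> findDuplicates(List<List<String>> lists) {
        return lists.stream()
                // Count occurrences separately for each list ...
                .flatMap(list -> list.stream()
                        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                        .entrySet().stream()
                        // ... and keep only the elements repeated within that list.
                        .filter(entry -> entry.getValue() > 1)
                        .map(Map.Entry::getKey))
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<List<String>> lists = Arrays.asList(
                Arrays.asList("Jerry", "Chris", "Jerry"),
                Arrays.asList("Chris", "Earl"));

        // "Chris" occurs in both lists, but only once in each, so it is not reported.
        System.out.println(findDuplicates(lists)); // [Jerry]
    }
}
```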
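If I understood correctly, a duplicate is a name that appears more than once across all departments, and you want the duplicated names for each department separately.
I want to get result like :
Department A - Jerry
Department B - Chris
So my idea is to first build a Set of all duplicates across all departments, and then create a list of duplicates for each department separately, based on that Set.
This solution has linear time complexity.
The method getDuplicatesForAllDepartments() iterates over all names in all departments and counts the occurrences of each name. It keeps only the names that occur more than once and stores them in a Set.
The method getDuplicatesForOneDepartment() determines which names of the given department are contained in the Set of duplicates.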
// Required imports:
// import java.util.*;
// import java.util.function.UnaryOperator;
// import java.util.stream.Collectors;

public static void main(String[] args) {
    List<List<String>> departments = List.of(SUPPORT_NAMES, PROJECT_AND_SUPPORT_NAMES, SALES_NAMES, ... etc);
    // * List.of() - works with Java 9 onwards
    // For Java 8 you can add the departments one by one using Collections.addAll()
    // List<List<String>> departments = new ArrayList<>();
    // Collections.addAll(departments, SUPPORT_NAMES, PROJECT_AND_SUPPORT_NAMES, SALES_NAMES, ... etc);

    Set<String> allDuplicates = getDuplicatesForAllDepartments(departments);
    List<String> duplicates1 = getDuplicatesForOneDepartment(SUPPORT_NAMES, allDuplicates);
    List<String> duplicatesProjectAndSupport = getDuplicatesForOneDepartment(PROJECT_AND_SUPPORT_NAMES, allDuplicates);
    // ... etc
}

// Counts every name across all departments and keeps those seen more than once.
public static Set<String> getDuplicatesForAllDepartments(List<List<String>> departments) {
    return departments.stream()
            .flatMap(List::stream)
            .collect(Collectors.groupingBy(UnaryOperator.identity(),
                                           Collectors.counting()))
            .entrySet().stream()
            .filter(entry -> entry.getValue() > 1)
            .map(Map.Entry::getKey)
            .collect(Collectors.toSet());
}

// Keeps the names of one department that are in the set of duplicates.
public static List<String> getDuplicatesForOneDepartment(List<String> department, Set<String> allDuplicates) {
    return department.stream()
            .filter(allDuplicates::contains)
            .collect(Collectors.toList());
}
An imperative implementation of the methods getDuplicatesForAllDepartments() and getDuplicatesForOneDepartment() looks like this:
public static Set<String> getDuplicatesForAllDepartments(List<List<String>> departments) {
    Map<String, Integer> nameToCount = new HashMap<>();
    for (List<String> department : departments) {
        for (String name : department) {
            nameToCount.merge(name, 1, Integer::sum);
        }
    }
    Set<String> duplicates = new HashSet<>();
    for (Map.Entry<String, Integer> entry : nameToCount.entrySet()) {
        if (entry.getValue() > 1) {
            duplicates.add(entry.getKey());
        }
    }
    return duplicates;
}

public static List<String> getDuplicatesForOneDepartment(List<String> department, Set<String> allDuplicates) {
    List<String> duplicates = new ArrayList<>();
    for (String name : department) {
        if (allDuplicates.contains(name)) {
            duplicates.add(name);
        }
    }
    return duplicates;
}
I used your example to demonstrate that both implementations work.
List<String> departmentA = List.of("Jerry", "Stephen", "Daniel");
// For Java 8, use Arrays.asList("Jerry", "Stephen", "Daniel") instead
List<String> departmentB = List.of("Chris", "Earl", "Ryan");
List<String> departmentC = List.of("Jerry", "Brown", "Micheal");
List<List<String>> departments = new ArrayList<>();
Collections.addAll(departments, departmentA, departmentB, departmentC);
System.out.println(getDuplicatesForAllDepartments(departments));
Output:
[Jerry]
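To get from that set to the "Department A - Jerry" style of output the question asks for, the two steps can be combined roughly as follows. This is a sketch under assumptions: the department labels are kept as keys of a map (the question does not show how the names are stored), the class DepartmentDuplicateReport and its methods are invented for the example, and "Department D" with its names is sample data added only to make "Chris" a duplicate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DepartmentDuplicateReport {

    // Names that occur more than once across all departments combined.
    public static Set<String> duplicatesAcross(Collection<List<String>> departments) {
        Map<String, Integer> nameToCount = new HashMap<>();
        for (List<String> department : departments) {
            for (String name : department) {
                nameToCount.merge(name, 1, Integer::sum);
            }
        }
        Set<String> duplicates = new HashSet<>();
        for (Map.Entry<String, Integer> entry : nameToCount.entrySet()) {
            if (entry.getValue() > 1) {
                duplicates.add(entry.getKey());
            }
        }
        return duplicates;
    }

    // One "<department> - <name>" line per duplicated name, in the map's iteration order.
    public static List<String> report(Map<String, List<String>> namesByDepartment) {
        Set<String> duplicates = duplicatesAcross(namesByDepartment.values());
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : namesByDepartment.entrySet()) {
            for (String name : entry.getValue()) {
                if (duplicates.contains(name)) {
                    lines.add(entry.getKey() + " - " + name);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the departments in insertion order.
        Map<String, List<String>> namesByDepartment = new LinkedHashMap<>();
        namesByDepartment.put("Department A", Arrays.asList("Jerry", "Stephen", "Daniel"));
        namesByDepartment.put("Department B", Arrays.asList("Chris", "Earl", "Ryan"));
        namesByDepartment.put("Department C", Arrays.asList("Jerry", "Brown", "Micheal"));
        namesByDepartment.put("Department D", Arrays.asList("Chris", "Anna")); // hypothetical sample data

        report(namesByDepartment).forEach(System.out::println);
        // Department A - Jerry
        // Department B - Chris
        // Department C - Jerry
        // Department D - Chris
    }
}
```

Note that, as in the answer's counting approach, a name repeated inside a single department also reaches a count above 1 and would be reported.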