Java 8: filtering a list of objects by unique name while keeping only the highest ID?
Let's say we have a Person class with the following fields:
class Person {
    private String name;
    private Integer id; // this one is unique
}
Then we have a List<Person> people like this:
['Jerry', 993]
['Tom', 3]
['Neal', 443]
['Jerry', 112]
['Shannon', 259]
['Shannon', 533]
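How can I build a new List<Person> uniqueNames that contains each name only once, keeping the entry with the highest ID for that name?
The final list would then look like this: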
['Jerry', 993]
['Tom', 3]
['Neal', 443]
['Shannon', 533]
Collectors.groupingBy + Collectors.maxBy should do the trick: build a map of persons grouped by name, then select the maximum of each group:
List<Person> persons = Arrays.asList(
    new Person("Jerry", 123),
    new Person("Tom", 234),
    new Person("Jerry", 456),
    new Person("Jake", 789)
);
List<Person> maxById = persons
    .stream()
    .collect(Collectors.groupingBy(
        Person::getName,
        Collectors.maxBy(Comparator.comparingInt(Person::getID))
    ))
    .values() // Collection<Optional<Person>>
    .stream() // Stream<Optional<Person>>
    .map(opt -> opt.orElse(null))
    .collect(Collectors.toList());
System.out.println(maxById);
Output:
[789: Jake, 234: Tom, 456: Jerry]
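The snippet above relies on a Person class with getName() and getID() accessors, and the printed output suggests a toString() of the form "id: name". None of that is shown in the thread, so the following is only a minimal sketch under those assumptions. The MaxByIdDemo wrapper class and the maxById helper method are hypothetical names, and LinkedHashMap::new is swapped in so that the result order is predictable (the plain two-argument groupingBy gives no ordering guarantee).

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class MaxByIdDemo {

    // Assumed shape of Person: only getName(), getID() and toString() are needed.
    static class Person {
        private final String name;
        private final Integer id;

        Person(String name, Integer id) {
            this.name = name;
            this.id = id;
        }

        String getName() { return name; }
        Integer getID() { return id; }

        @Override
        public String toString() { return id + ": " + name; }
    }

    // Keeps, for every name, the person with the highest ID.
    static List<Person> maxById(List<Person> persons) {
        return persons.stream()
            .collect(Collectors.groupingBy(
                Person::getName,
                LinkedHashMap::new, // assumption: first-seen name order is wanted
                Collectors.maxBy(Comparator.comparingInt(Person::getID))))
            .values().stream()
            .map(Optional::get) // every group holds at least one person
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
            new Person("Jerry", 123),
            new Person("Tom", 234),
            new Person("Jerry", 456),
            new Person("Jake", 789));
        System.out.println(maxById(persons)); // [456: Jerry, 234: Tom, 789: Jake]
    }
}
```

Optional::get is safe here because groupingBy never creates an empty group, so maxBy always finds at least one element.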
Update
is there a way to get a separate list of the Person object who were deleted because they were duplicates within this stream()?
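It is better to collect the grouped items into a list and then convert each group into a wrapper class that holds both the person with the maximum ID and the list of persons removed as duplicates: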
class PersonList {
    private final Person max;
    private final List<Person> deduped;

    public PersonList(List<Person> group) {
        this.max = Collections.max(group, Comparator.comparingInt(Person::getID));
        this.deduped = new ArrayList<>(group);
        // Objects.equals instead of ==: if getID() returns Integer, == compares references
        this.deduped.removeIf(p -> Objects.equals(p.getID(), max.getID()));
    }

    @Override
    public String toString() {
        return "{max: " + max + "; deduped: " + deduped + "}";
    }
}
Then the persons should be collected like this:
List<PersonList> maxByIdDetails = new ArrayList<>(persons
    .stream()
    .collect(Collectors.groupingBy(
        Person::getName,
        LinkedHashMap::new,
        Collectors.collectingAndThen(
            Collectors.toList(), PersonList::new
        )
    ))
    .values()); // Collection<PersonList>
maxByIdDetails.forEach(System.out::println);
Output:
{max: 456: Jerry; deduped: [123: Jerry]}
{max: 234: Tom; deduped: []}
{max: 789: Jake; deduped: []}
Update 2
Getting the list of duplicate persons:
List<Person> duplicates = persons
    .stream()
    .collect(Collectors.groupingBy(Person::getName))
    .values() // Collection<List<Person>>
    .stream() // Stream<List<Person>>
    .map(MyClass::removeMax)
    .flatMap(List::stream) // Stream<Person>
    .collect(Collectors.toList()); // List<Person>
System.out.println(duplicates);
Output:
[123: Jerry]
where removeMax may be implemented like this:
private static List<Person> removeMax(List<Person> group) {
    List<Person> dupes = new ArrayList<>();
    Person max = null;
    for (Person p : group) {
        Person duped = null;
        if (null == max) {
            max = p;
        } else if (p.getID() > max.getID()) {
            duped = max;
            max = p;
        } else {
            duped = p;
        }
        if (null != duped) {
            dupes.add(duped);
        }
    }
    return dupes;
}
Alternatively, provided that hashCode and equals are properly implemented in class Person, the difference between the two lists can be computed with removeAll:
List<Person> duplicates2 = new ArrayList<>(persons);
duplicates2.removeAll(maxById);
System.out.println(duplicates2);
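For removeAll to work this way, Person needs value-based equals and hashCode. The thread does not show them, so here is a minimal sketch that assumes both name and id take part in equality. The RemoveAllDemo class and the difference helper are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class RemoveAllDemo {

    // Assumed Person with value-based equality over both fields.
    static final class Person {
        private final String name;
        private final Integer id;

        Person(String name, Integer id) {
            this.name = name;
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
            Person other = (Person) o;
            return Objects.equals(name, other.name) && Objects.equals(id, other.id);
        }

        @Override
        public int hashCode() { return Objects.hash(name, id); }

        @Override
        public String toString() { return id + ": " + name; }
    }

    // Returns the elements of 'all' that are not equal to any element of 'kept'.
    static List<Person> difference(List<Person> all, List<Person> kept) {
        List<Person> result = new ArrayList<>(all); // copy, so 'all' stays untouched
        result.removeAll(kept);                     // relies on Person.equals
        return result;
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
            new Person("Jerry", 123),
            new Person("Tom", 234),
            new Person("Jerry", 456),
            new Person("Jake", 789));
        // Fresh but equal instances, to show that equals (not identity) decides.
        List<Person> maxById = Arrays.asList(
            new Person("Jerry", 456),
            new Person("Tom", 234),
            new Person("Jake", 789));
        System.out.println(difference(persons, maxById)); // [123: Jerry]
    }
}
```

Without the equals override, removeAll would fall back to identity comparison and would only remove entries that are the very same objects as those in maxById.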
You can try:
import static java.util.Comparator.comparingInt;
import static java.util.stream.Collectors.*;

persons.stream()
    .collect(
        groupingBy(
            Person::getName,
            collectingAndThen(
                maxBy(comparingInt(Person::getId)),
                Optional::get
            )
        )
    )
    .values();
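- You group by name.
- Then you ask for the max of each group (one person per name).
- Then you unwrap the value: maxBy produces an Optional<Person>, so groupingBy alone would return a Map<String, Optional<Person>>; collectingAndThen calls Optional::get on each result.
Note that this lists each unique name once; it does not list the duplicates that were dropped.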
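Since a collector like this only yields the winners, one way to also recover the dropped entries is a second pass that partitions the original list by whether each person is the winner for its name. This is only a sketch of that idea, not part of the answer above: the SplitDemo class, the split helper, and the Person shape (getName(), getId(), toString()) are assumptions, and it uses Collectors.toMap with a merge function in place of groupingBy + maxBy.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class SplitDemo {

    // Assumed shape of Person for this sketch.
    static class Person {
        private final String name;
        private final int id;

        Person(String name, int id) {
            this.name = name;
            this.id = id;
        }

        String getName() { return name; }
        int getId() { return id; }

        @Override
        public String toString() { return id + ": " + name; }
    }

    // Key true -> persons kept (highest ID per name); key false -> persons dropped.
    static Map<Boolean, List<Person>> split(List<Person> persons) {
        // First pass: the winner for each name.
        Map<String, Person> best = persons.stream()
            .collect(Collectors.toMap(
                Person::getName,
                Function.identity(),
                (a, b) -> a.getId() >= b.getId() ? a : b));
        // Second pass: identity check is enough, 'best' holds the same instances.
        return persons.stream()
            .collect(Collectors.partitioningBy(p -> best.get(p.getName()) == p));
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
            new Person("Jerry", 123),
            new Person("Tom", 234),
            new Person("Jerry", 456),
            new Person("Jake", 789));
        Map<Boolean, List<Person>> parts = split(persons);
        System.out.println(parts.get(true));  // [234: Tom, 456: Jerry, 789: Jake]
        System.out.println(parts.get(false)); // [123: Jerry]
    }
}
```

Both lists keep the encounter order of the input, and partitioningBy always provides both keys, so parts.get(false) is an empty list rather than null when nothing was dropped.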
You can use Collectors#toMap like this:
record Person(String name, Integer id) {}

public static void main(String[] args) {
    List<Person> input = List.of(
        new Person("Jerry", 993),
        new Person("Tom", 3),
        new Person("Neal", 443),
        new Person("Jerry", 112),
        new Person("Shannon", 259),
        new Person("Shannon", 533));
    List<Person> output = input.stream()
        .collect(Collectors.toMap(Person::name, Function.identity(),
            (a, b) -> a.id() > b.id() ? a : b, LinkedHashMap::new))
        .values().stream().toList();
    for (Person e : output)
        System.out.println(e);
}
Output:
Person[name=Jerry, id=993]
Person[name=Tom, id=3]
Person[name=Neal, id=443]
Person[name=Shannon, id=533]
If you don't care about the order, you can omit the LinkedHashMap::new argument.
is there a way to get a separate list of the Person object who were
deleted because they were duplicates within this stream()?
private static final Map<String, Person> highestIds = new HashMap<>();
private static final List<Person> duplicates = new ArrayList<>();

public static void main(String[] args) {
    // people is the List<Person> from the question; name and id are accessed as fields
    for (Person person : people) {
        Person result = highestIds.get(person.name);
        if (isPresent(result) && person.id > result.id) {
            // new highest ID for this name: the previous holder becomes a duplicate
            duplicates.add(result);
            highestIds.put(person.name, person);
        } else if (result == null) {
            // first person seen with this name
            highestIds.put(person.name, person);
        } else {
            duplicates.add(person);
        }
    }
    System.out.println("Highest ids:");
    highestIds.values().forEach(System.out::println);
    System.out.println("Duplicates:");
    duplicates.forEach(System.out::println);
}

private static boolean isPresent(Person result) {
    return result != null;
}