Java 8 streams: group by and count multiple properties
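I have an object Process that contains a date and a boolean error indicator. I would like to get the total number of processes and the number of processes with errors for each date. So, for example, Jun 01 would have counts 2, 1; Jun 02 would have 1, 0; and Jun 03 would have 1, 1. The only way I have been able to do this is to stream twice to get the counts. I have tried implementing a custom collector but was not successful. Is there an elegant solution instead of my kludgy method?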
final SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
final List<Process> processes = new ArrayList<>();
processes.add(new Process(sdf.parse("2016-06-01"), false));
processes.add(new Process(sdf.parse("2016-06-01"), true));
processes.add(new Process(sdf.parse("2016-06-02"), false));
processes.add(new Process(sdf.parse("2016-06-03"), true));
// First pass: total number of processes per date.
System.out.println(processes.stream()
        .collect(Collectors.groupingBy(Process::getDate, Collectors.counting())));
// Second pass: number of processes with errors per date.
System.out.println(processes.stream().filter(Process::isHasError)
        .collect(Collectors.groupingBy(Process::getDate, Collectors.counting())));
private class Process {
    private Date date;
    private boolean hasError;
    public Process(Date date, boolean hasError) {
        this.date = date;
        this.hasError = hasError;
    }
    public Date getDate() {
        return date;
    }
    public boolean isHasError() {
        return hasError;
    }
}
Code after the solution by @glee8e and the hints by @Holger:
Collector<Process, Result, Result> processCollector = Collector.of(
        Result::new,                      // supplier: a fresh Result per group
        (r, p) -> {                       // accumulator: index 0 = total, index 1 = errors
            r.increment(0);
            if (p.isHasError()) {
                r.increment(1);
            }
        }, (r1, r2) -> {                  // combiner: merge two partial results
            r1.add(0, r2.get(0));
            r1.add(1, r2.get(1));
            return r1;
        });
Map<Date, Result> results = processes.stream().collect(groupingBy(Process::getDate, processCollector));
results.entrySet().stream().sorted(Comparator.comparing(Entry::getKey)).forEach(entry -> System.out
        .println(String.format("date = %s, %s", sdf.format(entry.getKey()), entry.getValue())));
private class Result {
    private AtomicIntegerArray array = new AtomicIntegerArray(2);
    public int get(int index) {
        return array.get(index);
    }
    public void increment(int index) {
        array.getAndIncrement(index);
    }
    public void add(int index, int delta) {
        array.addAndGet(index, delta);
    }
    @Override
    public String toString() {
        return String.format("totalProcesses = %d, totalErrors = %d", array.get(0), array.get(1));
    }
}
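We had better add a POJO to store the result; otherwise the combiner function may look a bit obscure. I declared the POJO as public, but you can change that if you think hiding it is better.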
public class Result {
    // total: all processes for a date; error: those with hasError set
    public int total, error;
}
Main code:
// Add this somewhere in the same class.
private static final Set<Collector.Characteristics> CH_ID =
        Collections.unmodifiableSet(EnumSet.of(Collector.Characteristics.IDENTITY_FINISH));
//...
// This is the main processing code.
Map<Date, Result> results = processes.stream().collect(groupingBy(Process::getDate, new Collector<Process, Result, Result>() {
    @Override
    public Supplier<Result> supplier() {
        return Result::new;
    }
    @Override
    public BiConsumer<Result, Process> accumulator() {
        // The result container comes first, the stream element second.
        return (r, p) -> {
            r.total++;
            if (p.isHasError())
                r.error++;
        };
    }
    @Override
    public BinaryOperator<Result> combiner() {
        return (r1, r2) -> {
            r1.total += r2.total;
            r1.error += r2.error;
            return r1;
        };
    }
    @Override
    public Function<Result, Result> finisher() {
        return Function.identity();
    }
    @Override
    public Set<Characteristics> characteristics() {
        return CH_ID;
    }
}));
PS: I assume you have import static java.util.stream.Collectors.*;
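For anyone who wants to try this end to end, here is a minimal self-contained sketch that combines the plain POJO above with the shorter Collector.of form from the question's edit. The class name ProcessCounts, the method name countByDate, and the use of a TreeMap to keep the dates sorted are my own assumptions for illustration, not part of the original code.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Date;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class ProcessCounts { // hypothetical wrapper class for this sketch

    static final class Process {
        private final Date date;
        private final boolean hasError;

        Process(Date date, boolean hasError) {
            this.date = date;
            this.hasError = hasError;
        }

        Date getDate() { return date; }

        boolean isHasError() { return hasError; }
    }

    // Mutable result container: one instance per date.
    static final class Result {
        int total;
        int error;

        @Override
        public String toString() {
            return "totalProcesses = " + total + ", totalErrors = " + error;
        }
    }

    // Single pass over the list: counts all processes and failing processes per date.
    static Map<Date, Result> countByDate(List<Process> processes) {
        Collector<Process, Result, Result> counter = Collector.of(
                Result::new,
                (r, p) -> {
                    r.total++;
                    if (p.isHasError()) {
                        r.error++;
                    }
                },
                (r1, r2) -> {
                    r1.total += r2.total;
                    r1.error += r2.error;
                    return r1;
                });
        // TreeMap is an assumption here; it only keeps the keys in date order.
        return processes.stream()
                .collect(Collectors.groupingBy(Process::getDate, TreeMap::new, counter));
    }

    public static void main(String[] args) throws ParseException {
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
        List<Process> processes = Arrays.asList(
                new Process(sdf.parse("2016-06-01"), false),
                new Process(sdf.parse("2016-06-01"), true),
                new Process(sdf.parse("2016-06-02"), false),
                new Process(sdf.parse("2016-06-03"), true));

        countByDate(processes).forEach((date, result) ->
                System.out.println("date = " + sdf.format(date) + ", " + result));
    }
}
```

With the sample data from the question this should report 2 and 1 for Jun 01, 1 and 0 for Jun 02, and 1 and 1 for Jun 03. Because Collector.of returns a collector whose result type equals its accumulation type, no explicit finisher or characteristics set is needed in this form.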