How should I read a CSV by grouping rows that share the same column value?

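My task is to read a CSV file, apply some logic, and then create JSON from the result.

I am stuck on the logic needed before creating the JSON: for each SK, I need to set the maximum PR value as the PR of every row with that same SK.

My requirement:

CSV: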

SK,VR,ST,PR
1000,1000-Q1,10,187
1000,1000-Q2,20,925  // MAX PR against SK
1001,1001-Q1,10,112
1001,1001-Q2,30,120  // MAX PR against SK

Note: the maximum PR for an SK will always be in the last row for that SK.

I have to read the CSV here and write JSON data as follows:

[
   {
      "SK": "1000",
      "VR": "1000-Q1",
      "ST": "10",
      "PR": "925"
   },
   {
      "SK": "1000",
      "VR": "1000-Q2",
      "ST": "20",
      "PR": "925"
   },
   {
      "SK": "1001",
      "VR": "1001-Q1",
      "ST": "10",
      "PR": "120"
   },
   {
      "SK": "1001",
      "VR": "1001-Q2",
      "ST": "30",
      "PR": "120"
   }
]

Edit:

Code:

       File input = new File("input.csv");
       File output = new File("output.json");
       CsvSchema csvSchema = CsvSchema.builder().setUseHeader(true).build();
       CsvMapper csvMapper = new CsvMapper();

       // Read data from CSV file
       List<Object> readAll = csvMapper.readerFor(Map.class).with(csvSchema).readValues(input).readAll();

       ObjectMapper mapper = new ObjectMapper();

       // Write JSON-formatted data to the output.json file
       mapper.writerWithDefaultPrettyPrinter().writeValue(output, readAll);

       // Write JSON-formatted data to stdout
       System.out.println(mapper.writerWithDefaultPrettyPrinter().writeValueAsString(readAll));

One approach is to first group your CSV records by SK:

        String[] HEADERS = { "SK","VR","ST","PR"};

        Reader in = new FileReader("mycsvfile.csv");
        Iterable<CSVRecord> records = CSVFormat.DEFAULT
          .withHeader(HEADERS)
          .withFirstRecordAsHeader()
          .parse(in);

        // Group the records by SK
        Map<String, List<CSVRecord>> recordListBySK = StreamSupport
            .stream(records.spliterator(), false)
            .collect(Collectors.groupingBy(record -> record.get("SK")));
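One thing to keep in mind: the single-argument `Collectors.groupingBy` makes no guarantee about the type of map it returns, so the SK groups may not come back in file order. If the JSON has to follow the CSV order, a map factory can be passed in. The following is a minimal sketch using only the JDK; it assumes the rows have already been split into `String[]` arrays with SK in column 0, and the class and method names (`OrderedGrouping`, `groupBySk`) are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class OrderedGrouping {

    // Groups rows by their first column (SK). LinkedHashMap keeps the SKs
    // in the order in which they first appear in the input.
    static Map<String, List<String[]>> groupBySk(List<String[]> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                row -> row[0],
                LinkedHashMap::new,
                Collectors.toList()));
    }

    public static void main(String[] args) {
        // Hypothetical pre-split rows taken from the sample CSV
        List<String[]> rows = Arrays.asList(
                new String[] {"1000", "1000-Q1", "10", "187"},
                new String[] {"1000", "1000-Q2", "20", "925"},
                new String[] {"1001", "1001-Q1", "10", "112"},
                new String[] {"1001", "1001-Q2", "30", "120"});
        System.out.println(groupBySk(rows).keySet()); // prints [1000, 1001]
    }
}
```

The same three-argument form should work with `CSVRecord` in place of `String[]`, with `record.get("SK")` as the classifier.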

Then you need one more mapping that holds the maximum PR for each SK:

 Map<String, Integer> skMaxMap =  recordListBySK
    .entrySet()
    .stream()
    .collect(Collectors
                .toMap( e -> e.getKey(),
                        e -> e.getValue()
                              .stream()
                              .mapToInt(v -> Integer.parseInt(v.get("PR")))
                              .max()
                              .getAsInt() 
                      )
            );
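If the grouped lists are not needed for anything else, the per-SK maximum can probably be computed in one pass with `Collectors.toMap` and a merge function. This is only a sketch under the same assumption as before (rows already split into `String[]`, SK in column 0, PR in column 3); `MaxPrPerSk` and `maxPrBySk` are illustrative names, not part of the original answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MaxPrPerSk {

    // Maps each SK (column 0) to the largest PR (column 3) seen for it.
    // When two rows share an SK, Integer::max keeps the larger PR.
    static Map<String, Integer> maxPrBySk(List<String[]> rows) {
        return rows.stream().collect(Collectors.toMap(
                row -> row[0],
                row -> Integer.parseInt(row[3]),
                Integer::max));
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"1000", "1000-Q1", "10", "187"},
                new String[] {"1000", "1000-Q2", "20", "925"},
                new String[] {"1001", "1001-Q1", "10", "112"},
                new String[] {"1001", "1001-Q2", "30", "120"});
        Map<String, Integer> max = maxPrBySk(rows);
        System.out.println(max.get("1000")); // prints 925
        System.out.println(max.get("1001")); // prints 120
    }
}
```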

Now you just need to build the list of SK objects for the JSON like this:

 // Building the new sk (JSON ) objects
 List<NewSk> newSkList = new ArrayList<>();
 recordListBySK
    .values()
    .stream()
    .flatMap(v -> v.stream())
    .forEach(csvRecord -> {
         NewSk newSk = new NewSk(csvRecord.get("SK"),
                                csvRecord.get("VR"),
                                csvRecord.get("ST"),
                                // NewSk expects pr as a String, so convert the Integer max
                                String.valueOf(skMaxMap.get(csvRecord.get("SK")))
                                );
         newSkList.add(newSk);
    });

If you try to print them out:

newSkList.forEach(sk -> {
         System.out.print(" "+sk.getSk());
         System.out.print(" "+sk.getVr());
         System.out.print(" "+sk.getSt());
         System.out.print(" "+sk.getPr());
         System.out.println(" ");
     });

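You will get this (the SK groups can come out in a different order than in the file, because the map returned by the single-argument Collectors.groupingBy does not guarantee any ordering):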

 1001 1001-Q1 10 120 
 1001 1001-Q2 30 120 
 1000 1000-Q1 10 925 
 1000 1000-Q2 20 925

Now you can use a JSON object mapper to write the list to a JSON file. Hope this helps.

Edit:

public class NewSk {

    private String sk;
    private String vr;
    private String st;
    private String pr;

    public NewSk(String sk, String vr, String st, String pr) {
        this.sk = sk;
        this.vr = vr;
        this.st = st;
        this.pr = pr;
    }

    public String getSk() {
        return sk;
    }

    public void setSk(String sk) {
        this.sk = sk;
    }

    public String getVr() {
        return vr;
    }

    public void setVr(String vr) {
        this.vr = vr;
    }

    public String getSt() {
        return st;
    }

    public void setSt(String st) {
        this.st = st;
    }

    public String getPr() {
        return pr;
    }

    public void setPr(String pr) {
        this.pr = pr;
    }

}
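Since the question states that the maximum PR is always in the last row for its SK, the explicit max computation could arguably be skipped. The sketch below walks the rows once and copies the PR of the last row of each SK block onto the other rows of that block. It assumes that rows with the same SK are adjacent in the file (true for the sample, but not stated in the question) and that fields contain no quoted commas; `LastRowMax` and `fillWithLastPr` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class LastRowMax {

    // Replaces the PR column (index 3) of every row with the PR of the last
    // row in the same contiguous SK block (SK is index 0). Input rows are
    // left untouched; new arrays are returned in the original order.
    static List<String[]> fillWithLastPr(List<String[]> rows) {
        List<String[]> result = new ArrayList<>();
        int start = 0; // index of the first row of the current SK block
        for (int i = 0; i < rows.size(); i++) {
            boolean lastOfBlock = i == rows.size() - 1
                    || !rows.get(i)[0].equals(rows.get(i + 1)[0]);
            if (lastOfBlock) {
                String maxPr = rows.get(i)[3];
                for (int j = start; j <= i; j++) {
                    String[] r = rows.get(j);
                    result.add(new String[] {r[0], r[1], r[2], maxPr});
                }
                start = i + 1;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String csv = "SK,VR,ST,PR\n"
                + "1000,1000-Q1,10,187\n"
                + "1000,1000-Q2,20,925\n"
                + "1001,1001-Q1,10,112\n"
                + "1001,1001-Q2,30,120\n";
        // Naive parsing: skip the header line and split on commas
        List<String[]> rows = Arrays.stream(csv.split("\n"))
                .skip(1)
                .map(line -> line.split(","))
                .collect(Collectors.toList());
        for (String[] row : fillWithLastPr(rows)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

This keeps the rows in file order without any map, at the cost of relying on the ordering guarantee from the question; if that guarantee ever breaks, the grouping approach above is the safer choice.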

First, I created a POJO to make the data easy to manipulate:

public class Model {
    private final int sk;
    private final String vr;
    private final int st;
    private final int pr;

    public Model(int sk, String vr, int st, int pr) {
        this.sk = sk;
        this.vr = vr;
        this.st = st;
        this.pr = pr;
    }

    public int getSk() { return sk; }
    public String getVr() { return vr; }
    public int getSt() { return st; }
    public int getPr() { return pr; }

    @Override
    public String toString() {
        return "Model{" +
                "sk=" + sk +
                ", vr='" + vr + '\'' +
                ", st=" + st +
                ", pr=" + pr +
                '}';
    }
}

After that, using Java 8 streams, you can do whatever you want:

List<Model> models = new ArrayList<>();
models.add(new Model(1000, "1000-Q1", 10, 187));
models.add(new Model(1000, "1000-Q2", 10, 925));
models.add(new Model(1001, "1001-Q1", 10, 112));
models.add(new Model(1001, "1001-Q2", 30, 120));

List<Model> collect = models.stream()
    .collect(Collectors.groupingBy(Model::getSk))
    .entrySet()
    .stream()
    .flatMap(
        v -> {
            int max = v.getValue()
                .stream()
                .map(m -> m.getPr())
                .reduce((a, b) -> Math.max(a, b))
                .get();
            return v.getValue().stream()
                .map(m -> new Model(m.getSk(), m.getVr(), m.getSt(), max));
        }
    )
    .collect(Collectors.toList());

System.out.println(collect);

Result:

[
    {sk=1000, vr='1000-Q1', st=10, pr=925}, 
    {sk=1000, vr='1000-Q2', st=10, pr=925}, 
    {sk=1001, vr='1001-Q1', st=10, pr=120}, 
    {sk=1001, vr='1001-Q2', st=30, pr=120}
]
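The `models` list above is hard-coded. To feed the stream pipeline from the actual file contents, the lines could be parsed into objects first. The following is a rough JDK-only sketch, assuming a header line followed by simple `SK,VR,ST,PR` lines with no quoted fields; the nested `Model` here is a compact stand-in for the POJO above so the example compiles on its own, and `CsvToModel` and `parse` are illustrative names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class CsvToModel {

    // Compact stand-in for the Model POJO, with the same four fields
    static final class Model {
        final int sk;
        final String vr;
        final int st;
        final int pr;

        Model(int sk, String vr, int st, int pr) {
            this.sk = sk;
            this.vr = vr;
            this.st = st;
            this.pr = pr;
        }
    }

    // Parses CSV text into Model objects, skipping the header line
    // and any blank lines.
    static List<Model> parse(String csv) {
        return Arrays.stream(csv.split("\\R"))
                .skip(1)
                .filter(line -> !line.trim().isEmpty())
                .map(line -> line.split(","))
                .map(f -> new Model(
                        Integer.parseInt(f[0].trim()),
                        f[1].trim(),
                        Integer.parseInt(f[2].trim()),
                        Integer.parseInt(f[3].trim())))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String csv = "SK,VR,ST,PR\n"
                + "1000,1000-Q1,10,187\n"
                + "1000,1000-Q2,20,925\n";
        List<Model> models = parse(csv);
        System.out.println(models.size());    // prints 2
        System.out.println(models.get(1).pr); // prints 925
    }
}
```

For real files with quoting or embedded commas, a CSV library such as the ones used earlier in this thread is the better tool; the manual split is only meant to show the shape of the data flow.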