How should I read a CSV by grouping rows that share the same column value?
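My task is to read a CSV file, apply some logic, and then create a JSON file from the result.
Before creating the JSON I am stuck on the required logic: for each SK, I need to set the maximum PR value as the PR of every row with that SK.
My requirement: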
CSV:
SK,VR,ST,PR
1000,1000-Q1,10,187
1000,1000-Q2,20,925 // MAX PR against SK
1001,1001-Q1,10,112
1001,1001-Q2,30,120 // MAX PR against SK
Note: the maximum PR for an SK will always be in the last row of that SK.
I have to read the CSV and write the JSON data as follows:
[
{
"SK": "1000",
"VR": "1000-Q1",
"ST": "10",
"PR": "925"
},
{
"SK": "1000",
"VR": "1000-Q2",
"ST": "20",
"PR": "925"
},
{
"SK": "1001",
"VR": "1001-Q1",
"ST": "10",
"PR": "120"
},
{
"SK": "1001",
"VR": "1001-Q2",
"ST": "30",
"PR": "120"
}
]
Edit:
Code:
File input = new File("input.csv");
File output = new File("output.json");
CsvSchema csvSchema = CsvSchema.builder().setUseHeader(true).build();
CsvMapper csvMapper = new CsvMapper();
// Read data from CSV file
List<Object> readAll = csvMapper.readerFor(Map.class).with(csvSchema).readValues(input).readAll();
ObjectMapper mapper = new ObjectMapper();
// Write JSON formatted data to output.json file
mapper.writerWithDefaultPrettyPrinter().writeValue(output, readAll);
// Write JSON formatted data to stdout
System.out.println(mapper.writerWithDefaultPrettyPrinter().writeValueAsString(readAll));
One approach is to first group your CSV records by SK:
String[] HEADERS = { "SK","VR","ST","PR"};
Reader in = new FileReader("mycsvfile.csv");
Iterable<CSVRecord> records = CSVFormat.DEFAULT
.withHeader(HEADERS)
.withFirstRecordAsHeader()
.parse(in);
// Group the records by SK
Map<String, List<CSVRecord>> recordListBySK = StreamSupport
.stream(records.spliterator(), false).
collect(Collectors.groupingBy(record -> record.get("SK")));
Then you need one more mapping that holds the maximum PR for each SK:
Map<String, Integer> skMaxMap = recordListBySK
.entrySet()
.stream()
.collect(Collectors
.toMap( e -> e.getKey(),
e -> e.getValue()
.stream()
.mapToInt(v -> Integer.parseInt(v.get("PR")))
.max()
.getAsInt()
)
);
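The per-SK maximum can also be collected directly from the rows, without grouping them into lists first. The following is only a sketch under assumptions: the class name MaxPrBySk and the representation of each row as a String[] of {SK, VR, ST, PR} are made up for illustration and are not part of the code above. It relies on the three-argument Collectors.toMap, whose merge function is applied whenever two rows share a key.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MaxPrBySk {

    // Each row is assumed to be {SK, VR, ST, PR}, mirroring the CSV columns.
    static Map<String, Integer> maxPrBySk(List<String[]> rows) {
        return rows.stream()
                .collect(Collectors.toMap(
                        row -> row[0],                   // key: SK
                        row -> Integer.parseInt(row[3]), // value: PR as int
                        Integer::max));                  // on duplicate SK keep the larger PR
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"1000", "1000-Q1", "10", "187"},
                new String[] {"1000", "1000-Q2", "20", "925"},
                new String[] {"1001", "1001-Q1", "10", "112"},
                new String[] {"1001", "1001-Q2", "30", "120"});
        // Prints a map with 1000 -> 925 and 1001 -> 120 (map order is not guaranteed).
        System.out.println(maxPrBySk(rows));
    }
}
```

With CSVRecord the same idea should apply by using record.get("SK") and record.get("PR") in place of the array indexes.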
Now you just have to build the list of JSON Sk objects like this:
// Building the new sk (JSON ) objects
List<NewSk> newSkList = new ArrayList<>();
recordListBySK
.values()
.stream()
.flatMap(v -> v.stream())
.forEach(csvRecord -> {
NewSk newSk = new NewSk(csvRecord.get("SK"),
csvRecord.get("VR"),
csvRecord.get("ST"),
// NewSk expects pr as a String, so convert the Integer maximum
String.valueOf(skMaxMap.get(csvRecord.get("SK")))
);
newSkList.add(newSk);
});
If you print them out:
newSkList.forEach(sk -> {
System.out.print(" "+sk.getSk());
System.out.print(" "+sk.getVr());
System.out.print(" "+sk.getSt());
System.out.print(" "+sk.getPr());
System.out.println(" ");
});
you will get this:
1001 1001-Q1 10 120
1001 1001-Q2 30 120
1000 1000-Q1 10 925
1000 1000-Q2 20 925
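In the output above, SK 1001 comes before SK 1000 because Collectors.groupingBy returns a HashMap by default, which does not keep insertion order. If the JSON has to follow the CSV order, one option is to pass LinkedHashMap::new as the map factory. This is a minimal sketch, assuming each row is a plain String[] of {SK, VR, ST, PR}; the class name OrderedGrouping and the method withMaxPr are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class OrderedGrouping {

    // Returns copies of the rows, in input order, with PR replaced by the group maximum.
    static List<String[]> withMaxPr(List<String[]> rows) {
        // LinkedHashMap keeps the groups in the order their SK first appears.
        Map<String, List<String[]>> bySk = rows.stream()
                .collect(Collectors.groupingBy(
                        row -> row[0], LinkedHashMap::new, Collectors.toList()));

        List<String[]> result = new ArrayList<>();
        for (List<String[]> group : bySk.values()) {
            // A group is never empty, so getAsInt() is safe here.
            int max = group.stream()
                    .mapToInt(r -> Integer.parseInt(r[3]))
                    .max()
                    .getAsInt();
            for (String[] r : group) {
                result.add(new String[] {r[0], r[1], r[2], String.valueOf(max)});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"1000", "1000-Q1", "10", "187"},
                new String[] {"1000", "1000-Q2", "20", "925"},
                new String[] {"1001", "1001-Q1", "10", "112"},
                new String[] {"1001", "1001-Q2", "30", "120"});
        // Rows for SK 1000 are printed before rows for SK 1001.
        withMaxPr(rows).forEach(r -> System.out.println(String.join(" ", r)));
    }
}
```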
Now you can use a JSON object mapper to write the list to a JSON file.
Hope this helps.
Edit:
public class NewSk {
private String sk;
private String vr;
private String st;
private String pr;
public NewSk(String sk, String vr, String st, String pr) {
this.sk = sk;
this.vr = vr;
this.st = st;
this.pr = pr;
}
public String getSk() {
return sk;
}
public void setSk(String sk) {
this.sk = sk;
}
public String getVr() {
return vr;
}
public void setVr(String vr) {
this.vr = vr;
}
public String getSt() {
return st;
}
public void setSt(String st) {
this.st = st;
}
public String getPr() {
return pr;
}
public void setPr(String pr) {
this.pr = pr;
}
}
First, I made a POJO to make the data easy to manipulate:
public class Model {
private final int sk;
private final String vr;
private final int st;
private final int pr;
public Model(int sk, String vr, int st, int pr) {
this.sk = sk;
this.vr = vr;
this.st = st;
this.pr = pr;
}
public int getSk() { return sk; }
public String getVr() { return vr; }
public int getSt() { return st; }
public int getPr() { return pr; }
@Override
public String toString() {
return "Model{" +
"sk=" + sk +
", vr='" + vr + '\'' +
", st=" + st +
", pr=" + pr +
'}';
}
}
After that, with Java 8 streams, you can do what you want:
List<Model> models = new ArrayList<>();
models.add(new Model(1000, "1000-Q1", 10, 187));
models.add(new Model(1000, "1000-Q2", 10, 925));
models.add(new Model(1001, "1001-Q1", 10, 112));
models.add(new Model(1001, "1001-Q2", 30, 120));
List<Model> collect = models.stream()
.collect(Collectors.groupingBy(Model::getSk))
.entrySet()
.stream()
.flatMap(
v -> {
int max = v.getValue()
.stream()
.map(m -> m.getPr())
.reduce((a, b) -> Math.max(a, b))
.get();
return v.getValue().stream()
.map(m -> new Model(m.getSk(), m.getVr(), m.getSt(), max));
}
)
.collect(Collectors.toList());
System.out.println(collect);
Result:
[
{sk=1000, vr='1000-Q1', st=10, pr=925},
{sk=1000, vr='1000-Q2', st=10, pr=925},
{sk=1001, vr='1001-Q1', st=10, pr=120},
{sk=1001, vr='1001-Q2', st=30, pr=120}
]
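The stream example above hard-codes its models. To feed it from the actual file, the rows have to be parsed first. The following is a rough sketch using only the JDK, assuming a simple file with a header line, four columns, and no quoted or embedded commas; the CsvRows class and its nested Row holder are hypothetical stand-ins for the Model class. For real-world CSV a proper parser is the safer choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CsvRows {

    // Hypothetical minimal holder with the same fields as the Model POJO.
    static final class Row {
        final int sk;
        final String vr;
        final int st;
        final int pr;

        Row(int sk, String vr, int st, int pr) {
            this.sk = sk;
            this.vr = vr;
            this.st = st;
            this.pr = pr;
        }
    }

    // Parses lines of the form "SK,VR,ST,PR"; the first line is treated as the header.
    static List<Row> parse(List<String> lines) {
        List<Row> rows = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {   // start at 1 to skip the header
            String line = lines.get(i).trim();
            if (line.isEmpty()) {
                continue;                          // ignore blank lines
            }
            String[] f = line.split(",");
            rows.add(new Row(
                    Integer.parseInt(f[0].trim()),
                    f[1].trim(),
                    Integer.parseInt(f[2].trim()),
                    Integer.parseInt(f[3].trim())));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "SK,VR,ST,PR",
                "1000,1000-Q1,10,187",
                "1000,1000-Q2,20,925");
        for (Row r : parse(lines)) {
            System.out.println(r.sk + " " + r.vr + " " + r.st + " " + r.pr);
        }
    }
}
```

The lines themselves could come from Files.readAllLines(Paths.get("input.csv")), after which the grouping shown above can run on the parsed rows.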