Text qualifier - invalid char between encapsulated token and delimiter
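If a field contains a comma but the whole field is enclosed in quotes, that comma should not be treated as a column separator. How can I do that?
Example: aaaa, "bb,bb", cccc
I get: aaaa | bb | bb | cccc
How can I get aaaa | "bb,bb" | cccc instead?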
public List<CSVRecord> collectAllEntries(Path path) throws IOException {
    logger.info("Parsing the input file" + path);
    List<CSVRecord> store = new ArrayList<>();
    try (
        Reader reader = Files.newBufferedReader(path, Charset.forName("ISO-8859-2"));
        CSVParser csvParser = new CSVParser(reader, CSVFormat.EXCEL.withQuote(';'))
    ) {
        for (CSVRecord csvRecord : csvParser) {
            store.add(csvRecord);
        }
    } catch (IOException e) {
        e.printStackTrace();
        throw e;
    }
    return store;
}
private void csvToXlsx(Path csvFilePath, Path excelFilePath) throws Exception {
    logger.info("Converting CSV to XLSX" + excelFilePath);
    List<CSVRecord> records = collectAllEntries(csvFilePath);
    XSSFWorkbook myWorkBook = new XSSFWorkbook();
    FileOutputStream writer = new FileOutputStream(new File(excelFilePath.toString()));
    XSSFSheet mySheet = myWorkBook.createSheet();
    IntStream.range(0, records.size())
        .forEach(rowNum -> {
            XSSFRow myRow = mySheet.createRow(rowNum);
            CSVRecord record = records.get(rowNum);
            for (int i = 0; i < record.size(); i++) {
                XSSFCell myCell = myRow.createCell(i);
                myCell.setCellValue(record.get(i));
            }
        });
    myWorkBook.write(writer);
    writer.close();
}
With the latest version, commons-csv 1.8, the following works for me:
Reader in = new StringReader("aaaa,\"bb,bb\",cccc");
Iterable<CSVRecord> records = CSVFormat.DEFAULT.withDelimiter(',').withQuote('"').parse(in);
for (CSVRecord record : records) {
    for (int i = 0; i < record.size(); i++) {
        System.out.println("At " + i + ": " + record.get(i));
    }
}
It also works with the predefined EXCEL format:
Iterable<CSVRecord> records = CSVFormat.EXCEL.parse(in);
private void processOrderSet(HashMap<String, List<CSVRecord>> entries, FileWriter out, List<String> headers) throws IOException {
try (CSVPrinter printer = new CSVPrinter(out, CSVFormat.EXCEL.withHeader(headers.toArray(new String[0])).withQuote('"').withDelimiter(';')))
.....
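To make the role of the quote character concrete, here is a minimal sketch of a quote-aware splitter written against the JDK only. It is not how Commons CSV is implemented; `QuoteAwareSplit` and `split` are hypothetical names made up for this illustration, and the trimming of surrounding whitespace is a simplification of my own. It suggests why passing ';' as the quote character (as `withQuote(';')` does in the question) would leave the commas inside "bb,bb" acting as separators, assuming the file is comma-delimited.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Apache Commons CSV.
class QuoteAwareSplit {

    // Splits one line on `delimiter`, skipping delimiters that occur inside a
    // quoted field. Simplifications: every field is trimmed (including quoted
    // ones), and a doubled quote inside a quoted field stands for one quote.
    static List<String> split(String line, char delimiter, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == quote) {
                    if (i + 1 < line.length() && line.charAt(i + 1) == quote) {
                        current.append(quote); // escaped quote
                        i++;
                    } else {
                        inQuotes = false;      // closing quote
                    }
                } else {
                    current.append(c);         // delimiter is plain data here
                }
            } else if (c == quote) {
                inQuotes = true;               // opening quote
            } else if (c == delimiter) {
                fields.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString().trim());
        return fields;
    }

    public static void main(String[] args) {
        // Quote character is '"': the quoted comma stays inside one field (3 fields).
        System.out.println(split("aaaa, \"bb,bb\", cccc", ',', '"'));
        // Quote character is ';': nothing is treated as quoted (4 fields).
        System.out.println(split("aaaa,\"bb,bb\",cccc", ',', ';'));
    }
}
```

If the input really is semicolon-delimited, the semicolon belongs in the delimiter setting and the double quote in the quote setting, which is the combination the `CSVPrinter` snippet above uses.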