Reading an existing CSV and writing it back stores an inch mark (a double quote) in a cell as "" instead of \"
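I have a CSV that is generated by building a StringBuilder and writing it out with a PrintWriter. I then read that CSV back and append a value to each row, but this mangles the cells that contain a double quote used as an inch mark.
The double quote is written twice: 15""
One of the values added to the StringBuilder is:
Code 1.1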
String title = "Poly Nuclear 15\" Laptop Series Notebook Intel Windows10+ 7.6V Battery 8GB Memory";
Text t1 = new Text();
t1.setContent(title);
if (title.contains("\"")) {
    // Store the title with a literal backslash in front of the quote: 15\"
    t1.setContent("Poly Nuclear 15\\\" Laptop Series Notebook Intel Windows10+ 7.6V Battery 8GB Memory");
}
I write the comma-separated string built with the StringBuilder using a PrintWriter, and the first output looks like this:
OutputStreamWriter outputStreamWriter = new OutputStreamWriter(new FileOutputStream(filename, true), StandardCharsets.UTF_8);
PrintWriter printWriter = new PrintWriter(outputStreamWriter);
printWriter.println(stringBuilder.toString());
key,date,ms_id,title,alertId
190-2,2022-02-20 12:35:09,107193,Poly Nuclear 15" Laptop Series Notebook Intel Windows10+ 7.6V Battery 8GB Memory,
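Code 1.2
Now I add the value of the last column, alertId, to the end of each row. I read the file, set the value on each row, and write everything back to the CSV as follows: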
import com.opencsv.CSVReader;
import com.opencsv.CSVWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// This method is called as writeBack("1222") with a fixed value.
public void writeBack(String value) {
    String filePath = "/dir1/dir2/test.csv";
    String key = "alertId"; // name of the column whose value is filled in
    try {
        String[] header;
        List<String[]> feedData;
        // Read the whole file first, so it is closed before being overwritten.
        try (CSVReader reader = new CSVReader(new InputStreamReader(
                new FileInputStream(filePath), StandardCharsets.UTF_8))) {
            header = reader.readNext();
            feedData = reader.readAll();
        }
        int columnNum = Arrays.asList(header).indexOf(key);
        for (String[] row : feedData) {
            row[columnNum] = value;
        }
        // The default CSVWriter quotes every field and doubles embedded quotes.
        try (CSVWriter writer = new CSVWriter(new OutputStreamWriter(
                new FileOutputStream(filePath), StandardCharsets.UTF_8))) {
            writer.writeNext(header);
            writer.writeAll(feedData);
        }
    } catch (Exception e) {
        writeLog("ERROR", e);
    }
}
My final output looks like this. Everything is correct except that the string value has a doubled double quote, 15"":
"key","date","ms_id","title","alertId"
"190-2","2022-02-20 12:35:09","107193","Poly Nuclear 15"" Laptop Series Notebook Intel Windows10+ 7.6V Battery 8GB Memory","1222"
How can I avoid the doubled double quote in the cell that holds the inch mark in the final output?
Expected output
"key","date","ms_id","title","alertId"
"190-2","2022-02-20 12:35:09","107193","Poly Nuclear 15\" Laptop Series Notebook Intel Windows10+ 7.6V Battery 8GB Memory","1222"
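Thanks to @Mark Rotteveel for the pointer. It led me to look into the different separators and escape characters.
I realized that CSVReader and CSVWriter also use different escape characters.
I finally solved it as follows: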
import org.apache.commons.lang.StringEscapeUtils;
...
StringEscapeUtils.escapeCsv("Poly Nuclear 15\" Laptop Series, Notebook \\ Intel Windows10+ 7.6V Battery 8GB Memory");
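For readers without Commons Lang at hand, here is a rough sketch of the rule that escapeCsv is documented to apply: a value containing a comma, a double quote, or a line break is wrapped in double quotes, and each embedded double quote is doubled. The class name CsvEscapeSketch and the method escapeCsvLike are made up for illustration; this is not the library's implementation.

```java
class CsvEscapeSketch {

    // Hypothetical stand-in for the documented escapeCsv rule (not library code).
    static String escapeCsvLike(String value) {
        boolean needsQuoting = value.indexOf(',') >= 0
                || value.indexOf('"') >= 0
                || value.indexOf('\n') >= 0
                || value.indexOf('\r') >= 0;
        if (!needsQuoting) {
            return value; // nothing special in the value, so it is returned unchanged
        }
        // Wrap in quotes and double every embedded quote.
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(escapeCsvLike("plain"));                  // plain
        System.out.println(escapeCsvLike("15\" Laptop, Notebook"));  // "15"" Laptop, Notebook"
    }
}
```

Doubling the quote is the convention described in RFC 4180, so a reader that follows that convention turns 15"" back into 15".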
And I use this when writing:
import com.opencsv.ICSVWriter;
...
char escapeChar = '\\'; // a single backslash, escaped in the Java source
CSVWriter writer = new CSVWriter(outputStreamWriter, ICSVWriter.DEFAULT_SEPARATOR, ICSVWriter.DEFAULT_QUOTE_CHARACTER, escapeChar, ICSVWriter.DEFAULT_LINE_END);
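To see what changing the escape character does, here is a minimal plain-Java sketch of a field writer with a configurable escape character. It is an illustration under my own assumptions, not opencsv's code: the class CsvEscapeCharSketch and the method quoteField are hypothetical names.

```java
class CsvEscapeCharSketch {

    // Hypothetical helper: quote a field and put escapeChar in front of every
    // embedded quote character and every embedded escape character.
    static String quoteField(String value, char quoteChar, char escapeChar) {
        StringBuilder sb = new StringBuilder();
        sb.append(quoteChar);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == quoteChar || c == escapeChar) {
                sb.append(escapeChar);
            }
            sb.append(c);
        }
        sb.append(quoteChar);
        return sb.toString();
    }

    public static void main(String[] args) {
        String title = "Poly Nuclear 15\" Laptop";
        // Escape character equal to the quote character: the quote is doubled.
        System.out.println(quoteField(title, '"', '"'));   // "Poly Nuclear 15"" Laptop"
        // Backslash as the escape character: the quote is written as \"
        System.out.println(quoteField(title, '"', '\\'));  // "Poly Nuclear 15\" Laptop"
    }
}
```

With the quote character doubling as the escape character you get the 15"" from the question, and with a backslash you get the expected 15\". Whatever reads the file afterwards has to be configured with the same escape character, because tools that only follow the doubling convention of RFC 4180 will not treat \" as an escaped quote.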