How to replace quotes inside a quoted field of a non-standard CSV file using a one-liner bash command?

Question

I have a file like this:

col1×col2×col3
12×"Some field with "quotes" inside it"×"Some field without quotes inside but with new lines \n"

I would like to replace the inner double quotes with single quotes, so that the result looks like this:

col1×col2×col3
12×"Some field with 'quotes' inside it"×"Some field without quotes inside but with new lines \n"

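I suppose this can be done with sed, awk, or ex, but I haven't been able to come up with a clean and quick way to do it. The real CSV file is on the order of millions of lines.

The preferred solution would be a one-liner using one of the programs above.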

Answer 1

Given your field separator ×, a simple approach using sed might be:

 sed -E "s/([^×])\"([^×])/\1'\2/g" file

This replaces every " that has a character other than × both before and after it with '.

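Note that sed does not support lookahead, so we have to capture the neighboring characters in groups and re-insert them in the replacement.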
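As a quick check of the idea, here is a minimal sketch that runs the substitution on the sample row from the question, with the captured neighbors put back through the back-references \1 and \2. The function name fix_inner_quotes is hypothetical, and the sketch assumes a sed that accepts -E (GNU and BSD sed both do).

```shell
# Hypothetical helper: swap inner double quotes for single quotes.
# A quote counts as "inner" when the characters on both sides are not ×.
fix_inner_quotes() {
  # \1 and \2 re-insert the captured characters around the new quote.
  sed -E "s/([^×])\"([^×])/\1'\2/g" "$@"
}

# Sample data row from the question (the \n here is a literal backslash + n).
sample='12×"Some field with "quotes" inside it"×"Some field without quotes inside but with new lines \n"'

result=$(printf '%s\n' "$sample" | fix_inner_quotes)
printf '%s\n' "$result"
```

On a real file it would be called as `fix_inner_quotes file`. Quotes at the very start or end of a line are left alone because they have no character on one side. One caveat: matches cannot overlap, so two inner quotes separated by a single character (for example "a") are only partly converted in one pass, and running the substitution a second time may be needed for such input.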
