Sed/awk for string-to-integer conversion of a CSV column in shell
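I need to convert column 7 of a CSV file from a float to an integer. It's a huge file, and I don't want to use a while read loop for the conversion. Is there a shortcut with awk?
Input: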
"xx","x","xxxxxx","xxx","xx","xx"," 00000001.0000"
"xx","x","xxxxxx","xxx","xx","xx"," 00000002.0000"
"xx","x","xxxxxx","xxx","xx","xx"," 00000005.0000"
"xx","x","xxxxxx","xxx","xx","xx"," 00000011.0000"
Output:
"xx","x","xxxxxx","xxx","xx","xx","1"
"xx","x","xxxxxx","xxx","xx","xx","2"
"xx","x","xxxxxx","xxx","xx","xx","5"
"xx","x","xxxxxx","xxx","xx","xx","11"
I tried these and they worked, but is there anything simpler?
awk 'BEGIN {FS=OFS="\",\""} {$7 = sprintf("%.0f", $7)} 1' $test > $test1
awk '{printf("%s\"\n", $0)}' $test1
awk 'BEGIN{FS=OFS=","} {gsub(/"/, "", $7); $7="\"" $7+0 "\""; print}' file
Output:
"xx","x","xxxxxx","xxx","xx","xx","1"
"xx","x","xxxxxx","xxx","xx","xx","2"
"xx","x","xxxxxx","xxx","xx","xx","5"
"xx","x","xxxxxx","xxx","xx","xx","11"
gsub(/"/, "", $7): removes all " characters from $7
$7+0: reduces the number in $7 to its minimal representation
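To make the two steps concrete, here is a minimal sketch that prints field 7 of one sample row at each stage. The variable names (line, stages, raw, stripped) are only illustrative and are not part of the answer above.

```shell
# One sample row from the question, fed to awk on stdin.
line='"xx","x","xxxxxx","xxx","xx","xx"," 00000001.0000"'

# Show field 7 as read, after gsub, and after the numeric conversion.
stages=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "," } {
    raw = $7                 # field as read, quotes and padding included
    gsub(/"/, "", $7)        # strip the double quotes
    stripped = $7
    print "raw=[" raw "]"
    print "stripped=[" stripped "]"
    print "numeric=[" (stripped + 0) "]"   # adding 0 forces a numeric value
}')
printf '%s\n' "$stages"
```

The last line printed should be numeric=[1]: the leading blank, the zero padding and the .0000 suffix all disappear once awk treats the string as a number.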
With your shown samples, please try the following awk program.
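awk -v s1="\"" -F',"[[:space:]]+' -v OFS="," '{$NF = s1 ($NF + 0) s1} 1' Input_file
Explanation: -F splits each line at the ," and whitespace that precede the number, and OFS is set to , so the line is rejoined with a comma. In each line, adding 0 to the last field keeps only its numeric value, s1 wraps that value in ", and the trailing 1 prints every line, edited or not. This relies on the whitespace before the number, as in the shown samples.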
Another simple awk solution:
awk 'BEGIN {FS=OFS="\",\""} {$NF = $NF+0 "\""} 1' file
"xx","x","xxxxxx","xxx","xx","xx","1"
"xx","x","xxxxxx","xxx","xx","xx","2"
"xx","x","xxxxxx","xxx","xx","xx","5"
"xx","x","xxxxxx","xxx","xx","xx","11"
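Since the title also mentions sed, a rough sed equivalent might look like the sketch below. It assumes a sed that supports -E (GNU and BSD sed do), that the last field is a non-negative number, and that truncating the fractional part is acceptable, unlike sprintf("%.0f", ...), which rounds. The function name to_int_last_field is hypothetical.

```shell
# Rewrite the last quoted field, dropping blank padding, leading zeros
# and any fractional part. Reads CSV rows on stdin, writes them to stdout.
to_int_last_field() {
    sed -E 's/," *0*([0-9]+)(\.[0-9]*)?"$/,"\1"/'
}

# Usage on two rows shaped like the samples in the question.
printf '%s\n' \
    '"xx","x","xxxxxx","xxx","xx","xx"," 00000005.0000"' \
    '"xx","x","xxxxxx","xxx","xx","xx"," 00000011.0000"' |
    to_int_last_field
```

Because the pattern is anchored at the end of the line, the other fields are left alone, and rows whose last field is not numeric pass through unchanged.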