How to strip date in csv output using shell script?

Question

I have some csv extracts in which I am trying to fix the date. They look like this:

"Time Stamp","DBUID"
2016-11-25T08:28:33.000-8:00,"5tSSMImFjIkT0FpiO16LuA"

The first column is always "Time Stamp". I want to convert it so that only the date "2016-11-25" is kept and "T08:28:33.000-8:00" is removed.

The end result would be:

"Time Stamp","DBUID"
2016-11-25,"5tSSMImFjIkT0FpiO16LuA"

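There are many files with different dates.

Is there a way to do this in ksh? Some kind of for-each loop that goes through all the files, replaces the long timestamp, and leaves just the date?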

Answer 1

Here is a solution using standard AIX utilities:

awk -F, -v OFS=, 'NR>1{sub(/T.*$/,"",$1)}1' file > file.cln && mv file.cln file

Output

"Time Stamp","DBUID"
2016-11-25,"5tSSMImFjIkT0FpiO16LuA"

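(I no longer have access to an AIX environment, so this was only tested with my local awk.)

NR>1 skips the header line, and sub() is restricted to the first field (everything up to the first comma). The trailing 1 is awk shorthand for {print $0}.

If your data layout changes and extra commas show up in your data, this may need fixing.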
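The question also asks for a loop over many files. A minimal sketch of that, assuming the files sit in the current directory and match *.csv; the function name strip_time_in_files is made up for illustration and is not part of the original answer:

```shell
#!/bin/sh
# strip_time_in_files: hypothetical helper, not from the original answer.
# Rewrites every CSV passed as an argument so that the first column keeps
# only the date part of the timestamp.
strip_time_in_files() {
    for f in "$@"; do
        # Skip anything that is not a regular file (e.g. an unmatched glob).
        [ -f "$f" ] || continue
        # Same awk program as above: on every line except the header,
        # delete everything from "T" to the end of field 1.
        awk -F, -v OFS=, 'NR>1{sub(/T.*$/,"",$1)}1' "$f" > "$f.cln" &&
            mv "$f.cln" "$f"
    done
}
```

It would be called as strip_time_in_files *.csv. Because the original is only replaced when awk succeeds, a failed run leaves the input file untouched.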

IHTH

Answer 2

Using sed:

sed -i "s/\([0-9]\{4\}\)-\([0-9]\{2\}\)-\([0-9]\{2\}\).*,/\1-\2-\3,/" file.csv

Output:

"Time Stamp","DBUID"
2016-11-25,"5tSSMImFjIkT0FpiO16LuA"

-i edits the file in place (a GNU/BSD extension, not part of POSIX sed)

s substitutes
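Since -i is not specified by POSIX, it may be missing from the stock sed on some systems. A rough sketch of the same substitution through a temporary file follows; the function name strip_time_sed and the .tmp suffix are assumptions for illustration, and the pattern is a slightly tightened variant that anchors at the start of the line and stops at the first comma:

```shell
#!/bin/sh
# strip_time_sed: hypothetical helper, not from the original answer.
# Writes the edited copy to "<file>.tmp" and replaces the original only
# if sed succeeded, so no -i option is needed.
strip_time_sed() {
    # \(...\) captures the leading YYYY-MM-DD; T[^,]* matches the time
    # part up to (not including) the first comma; \1 puts the date back.
    sed 's/^\([0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}\)T[^,]*,/\1,/' "$1" > "$1.tmp" &&
        mv "$1.tmp" "$1"
}
```

The header line does not start with a date, so the pattern leaves it alone without needing a line address.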

Answer 3

Using sed:

$ sed '2,$s/T[^,]*//' file
"Time Stamp","DBUID"
2016-11-25,"5tSSMImFjIkT0FpiO16LuA"

How it works:

2,$           # Skip the header (first line); removing this address applies
              # the replacement to the first line as well.
   s/T[^,]*// # Replace everything between T (inclusive) and , (exclusive)
              # `[^,]*' Matches everything but `,' zero or more times
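To see why the 2,$ address matters for this particular header, here is a small demonstration using the sample header line from the question. "Time Stamp" itself contains a T, so the unaddressed substitution eats part of the header:

```shell
#!/bin/sh
header='"Time Stamp","DBUID"'

# Without an address, the T in "Time Stamp" matches and the text up to
# the first comma is deleted.
printf '%s\n' "$header" | sed 's/T[^,]*//'
# prints: ","DBUID"

# With 2,$ the substitution is not applied to line 1.
printf '%s\n' "$header" | sed '2,$s/T[^,]*//'
# prints: "Time Stamp","DBUID"
```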

Answer 4

This is a perfect job for awk, but unlike the previous answers, I suggest using the substring function.

awk -F, -v OFS=, 'NR > 1{ $1 = substr($1,1,10)} {print $0}' file.txt

Explanation

-F,: The -F flag sets the field separator, in this case a comma

-v OFS=,: Sets the output field separator to a comma, so the row is still comma-separated after the first field is reassigned

NR > 1: Ignore the first row

$1: Refers to the first field

$1 = substr($1,1,10): Sets the first field to the first 10 characters of the field. In the example, this is the date portion

print $0: This will print the entire row
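One detail worth checking with this approach: assigning to a field makes awk rebuild the whole record using OFS, which defaults to a single space. A short sketch of the difference, using the sample data row from the question:

```shell
#!/bin/sh
line='2016-11-25T08:28:33.000-8:00,"5tSSMImFjIkT0FpiO16LuA"'

# Default OFS (a single space): the comma is lost once $1 is assigned.
printf '%s\n' "$line" | awk -F, '{ $1 = substr($1,1,10) } 1'
# prints: 2016-11-25 "5tSSMImFjIkT0FpiO16LuA"

# OFS set to a comma: the rebuilt record is still comma-separated.
printf '%s\n' "$line" | awk -F, -v OFS=, '{ $1 = substr($1,1,10) } 1'
# prints: 2016-11-25,"5tSSMImFjIkT0FpiO16LuA"
```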
