Extract the date from a datetime and use the result as a condition in the same awk script
Here is the input data, saved as input.csv:
aNumber|bNumber|startDate|timeZone|duration|currencyType|cost|dicatedAccused|balanceAfter|trafficCase|teleServiceCode|location|dataVolume|numberOfEvents|fafIndicator|netWorkID|serviceProvideID|serviceClass|nAno|nBno|bNumberZnCode|fileNamedID|Destination|Operator|unknown3|MainAmount|ReAnalyse|DEDICATEDACCBALBEF|DEDICATEDACCBALAFT|ACCOUNTGROUPID|SERVICEOFFERINGS|SELECTEDCOMMUNITYID|BALANCEBEFORE
22677512549|778|2014-07-02 10:16:35.000|NULL|NULL|localCurrency|0.00|NULL|11.50|0|4|22676020076|NULL|NULL|NULL|NULL|NULL|34|77512549|778|NULL|1131257|OTHER|Short Code|126244088|0.0000|0|NULL|NULL|NULL|NULL|NULL|11.5000
22675557361|76457227|2014-07-02 10:16:38.000|NULL|NULL|localCurrency|10.00|NULL|1009.10|0|4|22676613028|NULL|NULL|1|NULL|NULL|35|75557361|76457227|NULL|1131257|Airtel|Airtel|4132206314|10.0000|0|NULL|NULL|NULL|NULL|NULL|1019.1000
22677521277|778|2014-07-02 10:16:42.000|NULL|NULL|localCurrency|0.00|NULL|0.00|0|4|22676020078|NULL|NULL|NULL|NULL|NULL|34|77521277|778|NULL|1131257|OTHER|Short Code|130071591|0.0000|0|NULL|NULL|NULL|NULL|NULL|0.0000
22676099496|77250331|2014-07-02 10:16:42.000|NULL|NULL|localCurrency|1.00|9|0.50|0|4|22676613028|NULL|NULL|NULL|NULL|NULL|35|76099496|77250331|NULL|1131257|Airtel|Airtel|4132218551|0.0000|0|4.0000|3.0000|NULL|NULL|NULL|0.5000
22667222160|22667262389|2014-07-02 10:16:43.000|NULL|NULL|localCurrency|10.00|NULL|16070.00|0|4|22676613028|NULL|NULL|NULL|NULL|NULL|35|67222160|67262389|NULL|1131257|Airtel|Airtel|4132222628|10.0000|0|NULL|NULL|NULL|NULL|NULL|16080.0000
22665799922|70110055|2014-07-02 10:16:45.000|NULL|NULL|localCurrency|20.00|6|0.50|0|4|22676020076|NULL|NULL|NULL|NULL|NULL|35|65799922|70110055|NULL|1131257|Telmob|Telmob|126260244|20.0000|0|44.0000|24.0000|NULL|NULL|NULL|0.5000
22676239633|433|2014-07-02 10:16:48.000|NULL|NULL|localCurrency|0.00|NULL|0.20|0|4|22676020027|NULL|NULL|NULL|NULL|NULL|35|76239633|433|NULL|1131257|Airtel_TollFree|Short Code|397224944|0.0000|0|NULL|NULL|NULL|NULL|NULL|0.2000
I have to group by date, dicatedAccused, trafficCase and teleServiceCode, and then, for each group, sum duration, cost, balanceAfter, MainAmount and BALANCEBEFORE. This is the awk script I am using (saved as test.awk):
BEGIN {FS=";"}
NR == 1 {next}
{key=sprintf("%10s %10s %12s %12s",$3,$8,$10,$11); duration[key] += $5; cost[key] += $7; bAfter[key] += $9; main[key] += $26; dedAccbBefore[key] += $27; dedAccbAfter[key] += $28; bBefore[key] += $NF}
END {
printf "%-10s\t\t %10s %12s %12s %10s %10s %10s %10s %12s %12s %10s\n", "date","dAccused","TrafficCase","ServiceCode","Duration","Cost","BalanceAfter","MainAmount","DAcBlBefore","DAcBlAfter","BalanceBefore"
for (i in duration) {
printf "%-47s %10s %10s %10s %10s %10s %10s\t %10s\n", i,duration[i],cost[i],bAfter[i],main[i],dedAccbBefore[i],dedAccbAfter[i],bBefore[i] }}
When I run the awk script:
$ awk -f test.awk input.csv
My output is:
date                     dAccused  TrafficCase  ServiceCode  Duration  Cost  BalanceAfter  MainAmount  DAcBlBefore  DAcBlAfter  BalanceBefore
2014-07-02 10:16:45.000  6         0            4            0         20    0.5           20          0            44          0.5
2014-07-02 10:16:42.000  NULL      0            4            0         0     0             0           0            0           0
2014-07-02 10:16:38.000  NULL      0            4            0         10    1009.1        10          0            0           1019.1
2014-07-02 10:16:42.000  9         0            4            0         1     0.5           0           0            4           0.5
2014-07-02 10:16:35.000  NULL      0            4            0         0     11.5          0           0            0           11.5
2014-07-02 10:16:43.000  NULL      0            4            0         10    16070         10          0            0           16080
2014-07-02 10:16:48.000  NULL      0            4            0         0     0.2           0           0            0           0.2
My problem is that I only want to display the date, not the time, because the time part breaks my whole script.
I want the output to be:
date        dAccused  TrafficCase  ServiceCode  Duration  Cost  BalanceAfter  MainAmount  DAcBlBefore  DAcBlAfter  BalanceBefore
2014-07-02  6         0            4            0         20    0.5           20          0            44          0.5
2014-07-02  NULL      0            4            0         10    17090.8       20          0            0           17110.8
2014-07-02  9         0            4            0         1     0.5           0           0            4           0.5
Any ideas would be great.
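- Change FS=";" to FS="|", because the input is pipe-delimited.
- Change ..}{key=sprintf("... to ..}{sub(/ .*/,"",$3);key=sprintf(... , that is, add sub(/ .*/,"",$3); which deletes everything from the first space onward in field 3 (startDate), so only the date remains.
With these two changes the date should come out in the required format.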
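The two changes can be tried on a small sample. The following is a minimal sketch, not the full script: it assumes the field positions implied by the header (startDate is field 3, cost field 7, dicatedAccused field 8, balanceAfter field 9, trafficCase field 10, teleServiceCode field 11), keeps only three of the sums for brevity, and uses the hypothetical file names sample.csv and summary.txt. The rows are copied from the input above.

```shell
#!/bin/sh
# Hypothetical sample file: the header plus three rows taken from the question.
cat > sample.csv <<'EOF'
aNumber|bNumber|startDate|timeZone|duration|currencyType|cost|dicatedAccused|balanceAfter|trafficCase|teleServiceCode|location|dataVolume|numberOfEvents|fafIndicator|netWorkID|serviceProvideID|serviceClass|nAno|nBno|bNumberZnCode|fileNamedID|Destination|Operator|unknown3|MainAmount|ReAnalyse|DEDICATEDACCBALBEF|DEDICATEDACCBALAFT|ACCOUNTGROUPID|SERVICEOFFERINGS|SELECTEDCOMMUNITYID|BALANCEBEFORE
22675557361|76457227|2014-07-02 10:16:38.000|NULL|NULL|localCurrency|10.00|NULL|1009.10|0|4|22676613028|NULL|NULL|1|NULL|NULL|35|75557361|76457227|NULL|1131257|Airtel|Airtel|4132206314|10.0000|0|NULL|NULL|NULL|NULL|NULL|1019.1000
22667222160|22667262389|2014-07-02 10:16:43.000|NULL|NULL|localCurrency|10.00|NULL|16070.00|0|4|22676613028|NULL|NULL|NULL|NULL|NULL|35|67222160|67262389|NULL|1131257|Airtel|Airtel|4132222628|10.0000|0|NULL|NULL|NULL|NULL|NULL|16080.0000
22665799922|70110055|2014-07-02 10:16:45.000|NULL|NULL|localCurrency|20.00|6|0.50|0|4|22676020076|NULL|NULL|NULL|NULL|NULL|35|65799922|70110055|NULL|1131257|Telmob|Telmob|126260244|20.0000|0|44.0000|24.0000|NULL|NULL|NULL|0.5000
EOF

awk '
BEGIN { FS = "|" }             # the data is pipe-delimited, not ";"
NR == 1 { next }               # skip the header row
{
    sub(/ .*/, "", $3)         # cut field 3 at the first space, leaving the date only
    key = $3 " " $8 " " $10 " " $11
    cost[key]    += $7         # "NULL" counts as 0 in a numeric context
    bAfter[key]  += $9
    bBefore[key] += $NF
}
END {
    for (k in cost)
        print k, cost[k], bAfter[k], bBefore[k]
}' sample.csv | sort > summary.txt   # sort, because for-in order is unspecified

cat summary.txt
# 2014-07-02 6 0 4 20 0.5 0.5
# 2014-07-02 NULL 0 4 20 17079.1 17099.1
```

Because the time is stripped before the key is built, the two NULL rows with different timestamps fall into the same group, which is the behaviour the question asks for.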