Shell script to prepare unformatted data

I have a text file, TEST.txt, which contains the following unformatted data:

0411 14:30:00 INF[baag.reporting.main.Logss.ExecuteLogsRunnable] Executing cron report Freigabe 14:30 for cron job Freigabe 14:30 for TRE_ClientServiceGroup@TEST.fs, Businesspartner@TEST.fs
0411 14:30:02 INF[baag.reporting.main.Logss.ExecuteLogsRunnable] Freigaben had no results
0411 14:30:02 INF[baag.reporting.main.Logss.ExecuteLogsRunnable] Freigabe 14:30 NOT sent to TRE_ClientServiceGroup@TEST.fs, Businesspartner@TEST.fs since all reports were empty and empty reports should not be send
0411 17:03:14 INF[baag.reporting.db.DataSourceMapFactory] Datasource [itraderdbint] has been added to datasource map
0411 17:03:14 INF[baag.reporting.db.DataSourceMapFactory] Datasource [otc_sv2599] has been added to datasource map
0411 17:03:14 INF[baag.reporting.db.DataSourceMapFactory] Datasource [qlp_devp] has been added to datasource map
0411 17:03:15 INF[baag.reporting.main.Logss.QuarzLogsManager] Added Trigger for QUARTZ that fires next on Tue Apr 13 08:00:00 CEST 2021 for Logs Compliance MAR Crossingprüfung/Frontrunning DI-FR
0411 17:03:15 INF[baag.reporting.main.Logss.QuarzLogsManager] Added Trigger for QUARTZ that fires next on Tue Apr 13 08:20:00 CEST 2021 for Logs Compliance OR Umsatzstatistik DI-FR
0411 17:03:15 INF[baag.reporting.main.Logss.QuarzLogsManager] Added Trigger for QUARTZ that fires next on Mon Apr 12 08:20:00 CEST 2021 for Logs Compliance OR Umsatzstatistik MO

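Now I want to create a shell script that converts this unformatted data into the format below and writes it to a file, for example PreparedFile.txt. I want to separate the fields with a pipe character. The first part is the date and time, so I want it kept as one complete string. The second part always starts with INF[ and ends with ]; put differently, it is the whole run of characters starting at INF[ that contains no spaces. This will be my second pipe-separated field. The third part is whatever remains on the line, and it will be my third field. I also want to add a header so that it is clear what each field represents: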

DATE_FORMAT|ROW_EXECUTE|ROW_VALUE
0411 14:30:00|INF[baag.reporting.main.Logss.ExecuteLogsRunnable]|Executing cron report Freigabe 14:30 for cron job Freigabe 14:30 for TRE_ClientServiceGroup@TEST.fs, Businesspartner@TEST.fs
0411 14:30:02|INF[baag.reporting.main.Logss.ExecuteLogsRunnable]|Freigaben had no results
0411 14:30:02|INF[baag.reporting.main.Logss.ExecuteLogsRunnable]|Freigabe 14:30 NOT sent to TRE_ClientServiceGroup@TEST.fs, Businesspartner@TEST.fs since all reports were empty and empty reports should not be send
0411 17:03:14|INF[baag.reporting.db.DataSourceMapFactory]|Datasource [itraderdbint] has been added to datasource map
0411 17:03:14|INF[baag.reporting.db.DataSourceMapFactory]|Datasource [otc_sv2599] has been added to datasource map
0411 17:03:14|INF[baag.reporting.db.DataSourceMapFactory]|Datasource [qlp_devp] has been added to datasource map
0411 17:03:15|INF[baag.reporting.main.Logss.QuarzLogsManager]|Added Trigger for QUARTZ that fires next on Tue Apr 13 08:00:00 CEST 2021 for Logs Compliance MAR Crossingprüfung/Frontrunning DI-FR
0411 17:03:15|INF[baag.reporting.main.Logss.QuarzLogsManager]|Added Trigger for QUARTZ that fires next on Tue Apr 13 08:20:00 CEST 2021 for Logs Compliance OR Umsatzstatistik DI-FR
0411 17:03:15|INF[baag.reporting.main.Logss.QuarzLogsManager]|Added Trigger for QUARTZ that fires next on Mon Apr 12 08:20:00 CEST 2021 for Logs Compliance OR Umsatzstatistik MO

I am new to shell scripting and don't know whether this can be done with a shell script.

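You can try the shell command below and see whether it helps. It chains two sed commands with pipes: the first replaces the second space on each line with |, and the second replaces the first occurrence of a closing square bracket followed by a space with ]|.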

cat TEST.txt | sed 's/ /|/2' | sed 's/] /]|/1' > PreparedFile.txt
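To make the two substitutions easier to follow, here is a minimal sketch that applies them one at a time to a single line copied from TEST.txt. The variable names line, step1, and step2 are only illustrative and are not part of the original answer.

```shell
#!/bin/bash
# One sample line copied from TEST.txt in the question.
line='0411 14:30:02 INF[baag.reporting.main.Logss.ExecuteLogsRunnable] Freigaben had no results'

# Step 1: replace only the 2nd space on the line with a pipe.
# The 1st space (between date and time) is left untouched.
step1=$(printf '%s\n' "$line" | sed 's/ /|/2')
printf '%s\n' "$step1"

# Step 2: replace the first "] " with "]|", which closes the INF[...] field.
step2=$(printf '%s\n' "$step1" | sed 's/] /]|/1')
printf '%s\n' "$step2"
```

The first printed line still has a space after the closing bracket; the second has all three fields separated by |.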

@Symonds

This reply addresses your comment asking for a header and for further explanation.

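To add the header, use echo to create PreparedFile.txt first, then use the >> operator to append the converted lines to that file. You can copy the complete code into a file named Script.sh and run it with bash Script.sh:
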
#!/bin/bash
echo "DATE_FORMAT|ROW_EXECUTE|ROW_VALUE" >  PreparedFile.txt
cat TEST.txt | sed 's/ /|/2' | sed 's/] /]|/1' >> PreparedFile.txt
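
As a possible variation (not the script from the answer above), the same result can probably be had with a single sed call that reads the file directly, with the header and the converted lines grouped so the output file is opened only once. The file names sample_input.txt and sample_output.txt are hypothetical and only keep the sketch self-contained; in practice you would use TEST.txt and PreparedFile.txt.

```shell
#!/bin/bash
# Hypothetical file names for this sketch; substitute TEST.txt and PreparedFile.txt.
input=sample_input.txt
output=sample_output.txt

# Two sample lines from the question, so the sketch runs on its own.
cat > "$input" <<'EOF'
0411 14:30:02 INF[baag.reporting.main.Logss.ExecuteLogsRunnable] Freigaben had no results
0411 17:03:14 INF[baag.reporting.db.DataSourceMapFactory] Datasource [itraderdbint] has been added to datasource map
EOF

# Group the header and the converted lines; the redirection applies to both.
{
  echo "DATE_FORMAT|ROW_EXECUTE|ROW_VALUE"
  # Two expressions in one sed call, applied in order to every line.
  sed -e 's/ /|/2' -e 's/] /]|/1' "$input"
} > "$output"

cat "$output"
```

Note that the second sample line contains another "[...] " later in the text; because only the first "] " is replaced, that later bracket stays inside the third field.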

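As for the explanation you asked for: the pipe symbol | chains commands, so that the output of one command becomes the input of the next. The sed command lets you replace text that matches a given regular expression with a replacement string. In the first sed after the cat command I use s/ /|/2, which means: replace the second occurrence of a space on each line with |. The second sed, s/] /]|/1, does the same for the first occurrence of "] ". You can read more about using the sed command here.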
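
The effect of the number after the last slash can be seen on a short string. This is only an illustrative sketch; the sample text 'a b c d' and the variable names are made up for the example.

```shell
#!/bin/bash
text='a b c d'   # hypothetical sample string

# 1 -> replace the first space only
first=$(printf '%s\n' "$text" | sed 's/ /|/1')
# 2 -> replace the second space only
second=$(printf '%s\n' "$text" | sed 's/ /|/2')
# g -> replace every space on the line
all=$(printf '%s\n' "$text" | sed 's/ /|/g')

printf '%s\n' "$first" "$second" "$all"
# prints:
# a|b c d
# a b|c d
# a|b|c|d
```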