Shell script to prepare unformatted data
I have a text file, TEST.txt, which contains the following unformatted data:
0411 14:30:00 INF[baag.reporting.main.Logss.ExecuteLogsRunnable] Executing cron report Freigabe 14:30 for cron job Freigabe 14:30 for TRE_ClientServiceGroup@TEST.fs, Businesspartner@TEST.fs
0411 14:30:02 INF[baag.reporting.main.Logss.ExecuteLogsRunnable] Freigaben had no results
0411 14:30:02 INF[baag.reporting.main.Logss.ExecuteLogsRunnable] Freigabe 14:30 NOT sent to TRE_ClientServiceGroup@TEST.fs, Businesspartner@TEST.fs since all reports were empty and empty reports should not be send
0411 17:03:14 INF[baag.reporting.db.DataSourceMapFactory] Datasource [itraderdbint] has been added to datasource map
0411 17:03:14 INF[baag.reporting.db.DataSourceMapFactory] Datasource [otc_sv2599] has been added to datasource map
0411 17:03:14 INF[baag.reporting.db.DataSourceMapFactory] Datasource [qlp_devp] has been added to datasource map
0411 17:03:15 INF[baag.reporting.main.Logss.QuarzLogsManager] Added Trigger for QUARTZ that fires next on Tue Apr 13 08:00:00 CEST 2021 for Logs Compliance MAR Crossingprüfung/Frontrunning DI-FR
0411 17:03:15 INF[baag.reporting.main.Logss.QuarzLogsManager] Added Trigger for QUARTZ that fires next on Tue Apr 13 08:20:00 CEST 2021 for Logs Compliance OR Umsatzstatistik DI-FR
0411 17:03:15 INF[baag.reporting.main.Logss.QuarzLogsManager] Added Trigger for QUARTZ that fires next on Mon Apr 12 08:20:00 CEST 2021 for Logs Compliance OR Umsatzstatistik MO
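Now I want to create a shell script that converts this unformatted data into the format below and writes it to a file, for example PreparedFile.txt. I want to separate the fields with the pipe character. The first part is the date and time, so I want it kept as one complete string. The second part always starts with INF[ and ends with ], or we can simply take the whole run without spaces that starts at INF[; this will be my second pipe-separated string. The third part is whatever remains on the line, and it will be my third string. I also want to add a header so that it is clear what each field represents: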
DATE_FORMAT|ROW_EXECUTE|ROW_VALUE
0411 14:30:00|INF[baag.reporting.main.Logss.ExecuteLogsRunnable]|Executing cron report Freigabe 14:30 for cron job Freigabe 14:30 for TRE_ClientServiceGroup@TEST.fs, Businesspartner@TEST.fs
0411 14:30:02|INF[baag.reporting.main.Logss.ExecuteLogsRunnable]|Freigaben had no results
0411 14:30:02|INF[baag.reporting.main.Logss.ExecuteLogsRunnable]|Freigabe 14:30 NOT sent to TRE_ClientServiceGroup@TEST.fs, Businesspartner@TEST.fs since all reports were empty and empty reports should not be send
0411 17:03:14|INF[baag.reporting.db.DataSourceMapFactory]|Datasource [itraderdbint] has been added to datasource map
0411 17:03:14|INF[baag.reporting.db.DataSourceMapFactory]|Datasource [otc_sv2599] has been added to datasource map
0411 17:03:14|INF[baag.reporting.db.DataSourceMapFactory]|Datasource [qlp_devp] has been added to datasource map
0411 17:03:15|INF[baag.reporting.main.Logss.QuarzLogsManager]|Added Trigger for QUARTZ that fires next on Tue Apr 13 08:00:00 CEST 2021 for Logs Compliance MAR Crossingprüfung/Frontrunning DI-FR
0411 17:03:15|INF[baag.reporting.main.Logss.QuarzLogsManager]|Added Trigger for QUARTZ that fires next on Tue Apr 13 08:20:00 CEST 2021 for Logs Compliance OR Umsatzstatistik DI-FR
0411 17:03:15|INF[baag.reporting.main.Logss.QuarzLogsManager]|Added Trigger for QUARTZ that fires next on Mon Apr 12 08:20:00 CEST 2021 for Logs Compliance OR Umsatzstatistik MO
I am new to shell scripting and do not know whether this can be done with a shell script.
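You can use the shell command below and see if it helps. It combines sed commands with pipes: the first sed replaces the second occurrence of a space with |, and the second sed replaces the space after the first closing square bracket with |.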
cat TEST.txt | sed 's/ /|/2' | sed 's/] /]|/1' > PreparedFile.txt
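The pipeline above starts one cat and two sed processes. As a minimal sketch of an equivalent form, both substitutions can be passed to a single sed call with -e. The function name format_log is a hypothetical helper introduced only for this illustration.

```shell
#!/bin/bash
# Sketch: the same two substitutions in a single sed process.
# format_log is a hypothetical helper name, not part of the original answer.
format_log() {
    # 1st expression: turn the 2nd space on each line into "|"
    # 2nd expression: turn the first "] " on each line into "]|"
    sed -e 's/ /|/2' -e 's/] /]|/'
}

# Demo on one sample line taken from TEST.txt
printf '%s\n' '0411 14:30:02 INF[baag.reporting.main.Logss.ExecuteLogsRunnable] Freigaben had no results' | format_log

# For the real file: format_log < TEST.txt > PreparedFile.txt
```

sed applies the expressions in order to every input line, so the result should be the same as with the two chained sed commands.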
@Symonds
This reply addresses your comment asking for a header line and a further explanation.
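To add the header, you can use echo to create PreparedFile.txt first, and then use the >> operator to append to that file. You can copy the complete code into a file named Script.sh and then run it with bash Script.sh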
#!/bin/bash
echo "DATE_FORMAT|ROW_EXECUTE|ROW_VALUE" > PreparedFile.txt
cat TEST.txt | sed 's/ /|/2' | sed 's/] /]|/1' >> PreparedFile.txt
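Counting spaces works for the sample data, but a line with a different layout would get its separators in the wrong place. A stricter sketch is shown here, assuming every log line has the shape "4 digits, HH:MM:SS, a tag without spaces, message" and that your sed supports -E (GNU and BSD sed do). The function name prepare_file and its two arguments are hypothetical and only serve the illustration.

```shell
#!/bin/bash
# Sketch under assumptions: every log line looks like
#   "<4 digits> <HH:MM:SS> <tag-without-spaces> <message>"
# prepare_file and its arguments are hypothetical names for illustration.
prepare_file() {
    local in_file=$1 out_file=$2
    {
        # Header line first
        echo "DATE_FORMAT|ROW_EXECUTE|ROW_VALUE"
        # \1 = date and time, \2 = INF[...] tag; the rest of the line is kept as-is.
        # Lines that do not match the pattern pass through unchanged.
        sed -E 's/^([0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}) ([^ ]+) /\1|\2|/' "$in_file"
    } > "$out_file"
}

# Run only when both file names are given, e.g.:
#   bash Script.sh TEST.txt PreparedFile.txt
if [ "$#" -eq 2 ]; then
    prepare_file "$1" "$2"
fi
```

Grouping the echo and the sed call in braces lets a single > redirection write the header and the data, so no separate >> step is needed.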
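As for the explanation you asked for: the pipe symbol | chains commands together, so the output of one command becomes the input of the next. The sed command lets you replace text that matches a given regular expression with a replacement. In the first pipe stage after the cat command I use s/ /|/2, which means: replace the second occurrence of a space on each line with |. The second sed command, s/] /]|/1, then replaces the first occurrence of "] " (a closing bracket followed by a space) with "]|". You can read more about the usage of the sed command in the sed documentation.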
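The effect of the occurrence number can be seen by running one sample line through the two substitutions one at a time. This is only a small demonstration; the variable names line, stage1 and stage2 are chosen for illustration.

```shell
#!/bin/bash
# Demo of the numeric flag in s/regex/replacement/N on one sample line.
line='0411 17:03:14 INF[baag.reporting.db.DataSourceMapFactory] Datasource [qlp_devp] has been added to datasource map'

# Stage 1: only the 2nd space becomes "|"; the 1st space (inside the date) stays.
stage1=$(printf '%s\n' "$line" | sed 's/ /|/2')
echo "$stage1"

# Stage 2: only the 1st "] " becomes "]|"; the later "[qlp_devp] " is untouched.
stage2=$(printf '%s\n' "$stage1" | sed 's/] /]|/1')
echo "$stage2"
```

Because each substitution touches exactly one occurrence per line, spaces and brackets inside the message text are left alone.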