Split ~200 MB log4j log file by day
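I have a log file in the format below, and I want to split it into multiple files by day (i.e. log-2017-10-2, log-2017-10-3, etc.). I've seen people do this with awk, but I'm not sure how to handle stack traces, since the exception line (java.io.IOException here) and the stack frames each start on a new line without a timestamp. Is there a convenient way to do this?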
2017-10-02 04:26:02,534 INFO XXXXXXXXXXXXXXXXX
2017-10-03 04:26:02,543 INFO XXXXXXXXXXXX
2017-10-04 04:26:02,544 INFO XXXXXXXXX
2017-10-04 04:26:02,546 INFO XXXXXXXXXXXXX
2017-10-04 04:26:02,549 INFO XXXXXXXXXXX
2017-10-04 04:53:02,787 WARN class.class.class: [FetcherXXXXXX], Error in fetch XXXXXXXXXXXXXXXXXXXXXX
java.io.IOException: Connection to X was disconnected before the response was read
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
2017-10-05 04:26:02,549 INFO XXXXXXXXXXX
The resulting files would contain:
log-2017-10-2:
2017-10-02 04:26:02,534 INFO XXXXXXXXXXXXXXXXX
log-2017-10-3:
2017-10-03 04:26:02,543 INFO XXXXXXXXXXXX
log-2017-10-4:
2017-10-04 04:26:02,544 INFO XXXXXXXXX
2017-10-04 04:26:02,546 INFO XXXXXXXXXXXXX
2017-10-04 04:26:02,549 INFO XXXXXXXXXXX
2017-10-04 04:53:02,787 WARN class.class.class: [FetcherXXXXXX], Error in fetch XXXXXXXXXXXXXXXXXXXXXX
java.io.IOException: Connection to X was disconnected before the response was read
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
log-2017-10-5:
2017-10-05 04:26:02,549 INFO XXXXXXXXXXX
awk to the rescue!
$ awk --posix 'BEGIN{f="log-header"}
    $1~/^[0-9]{4}-[0-9]{2}-[0-9]{2}$/{f="log-"$1} {print > f}' log
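If there are too many dates (each one corresponds to an open file), you may need to close each file once you are done with it. For a few hundred dates it should work as is.
The BEGIN block sets an initial output file (log-header) in case your log does not start with a line matching the regex.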
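If the log spans many days, closing files as you go could look like the sketch below. It is only a sketch and makes some assumptions: the lines are in chronological order, the directory name split-demo and the file sample.log are made up for the demo, and the sample lines are invented. The regex avoids interval expressions ({4}, {2}), so it should not need --posix. Because it uses >> (append), a day whose file was closed and later reopened would not be truncated, but the output files have to be removed before a rerun.

```shell
#!/bin/sh
# Hypothetical scratch directory and sample file, used only for this demo.
outdir=split-demo
mkdir -p "$outdir"
rm -f "$outdir"/log-*    # start clean, because the awk program appends

cat > "$outdir/sample.log" <<'EOF'
2017-10-02 04:26:02,534 INFO first
2017-10-03 04:26:02,543 WARN second
java.io.IOException: example
at some.Frame
2017-10-04 04:26:02,544 INFO third
EOF

awk -v dir="$outdir" '
BEGIN { f = dir "/log-header" }   # fallback if the log does not start with a date
# A line whose first field looks like a date may start a new day.
$1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ {
    target = dir "/log-" $1
    if (target != f) {            # the day changed
        close(f)                  # release the previous output file
        f = target
    }
}
# Every line, including stack-trace lines, goes to the current file.
{ print >> f }
' "$outdir/sample.log"
```

With this, at most one output file is open at any time, so the number of distinct dates no longer matters.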
awk solution:
awk '/^[0-9]{4}-[0-9]{2}-[0-9]{2} /{
         if (fn && !a[$1]++) close(fn);
         fn="log-"$1
}{ print > fn }' logfile
/^[0-9]{4}-[0-9]{2}-[0-9]{2} /
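- on encountering a line that starts with a date string followed by a space
if (fn && !a[$1]++) close(fn)
- the first time a new date is seen, close the file that was opened for the previous date
fn="log-"$1
- construct the output file name from the date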
Viewing results:
$ head log-*
==> log-2017-10-02 <==
2017-10-02 04:26:02,534 INFO XXXXXXXXXXXXXXXXX
==> log-2017-10-03 <==
2017-10-03 04:26:02,543 INFO XXXXXXXXXXXX
==> log-2017-10-04 <==
2017-10-04 04:26:02,544 INFO XXXXXXXXX
2017-10-04 04:26:02,546 INFO XXXXXXXXXXXXX
2017-10-04 04:26:02,549 INFO XXXXXXXXXXX
2017-10-04 04:53:02,787 WARN class.class.class: [FetcherXXXXXX], Error in fetch XXXXXXXXXXXXXXXXXXXXXX
java.io.IOException: Connection to X was disconnected before the response was read
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
at XXXXXXXXXXXXXXXX
==> log-2017-10-05 <==
2017-10-05 04:26:02,549 INFO XXXXXXXXXXX
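Besides eyeballing the head output, a quick sanity check is to compare the line count of the input with the combined line count of the pieces. The sketch below assumes a made-up directory split-check, a made-up input file input.txt, and invented sample lines. The split itself is a simplified date-prefix split, not a copy of either answer.

```shell
#!/bin/sh
# Hypothetical scratch directory and sample data for the demo.
dir=split-check
mkdir -p "$dir"
rm -f "$dir"/log-*

printf '%s\n' \
  '2017-10-04 04:53:02,787 WARN Error in fetch' \
  'java.io.IOException: example' \
  'at some.Frame' \
  '2017-10-05 04:26:02,549 INFO ok' > "$dir/input.txt"

# Split by leading date; lines without a date stay with the current file.
awk -v dir="$dir" '
/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] / { fn = dir "/log-" $1 }
{ print > fn }
' "$dir/input.txt"

# No line should be lost or duplicated by the split.
original=$(wc -l < "$dir/input.txt")
pieces=$(cat "$dir"/log-* | wc -l)

if [ "$original" -eq "$pieces" ]; then
    echo "line counts match"
else
    echo "line counts differ: $original vs $pieces" >&2
fi
```

On the real ~200 MB file the same two wc -l calls apply; only the file names change.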