Regex to handle all multiline exceptions in Rubular for fluentd
I have designed the following regex, in Rubular format, to match all multiline exception or warning message fields for the fluentd parser:
(SLF4J:\s.*|[a-zA-z_]*\..*\.*\s.*\s.*|Caused\sby:\s|\s+at\s.*|\s+\.\.\. (\d)+ more)
It matches unnecessary fields.
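I want to match every multiline exception or warning, from its first line onward.
In short: a multiline block is read from its first line until the next line that is JSON. JSON always starts with {", so when we see a line that starts with {" we stop reading the multiline block.
Either one regex covering both cases or two separate regexes is fine.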
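The stop rule described above can also be sketched without a regex, which may help to check what a pattern is expected to return. This is only an illustration in Ruby: the method name multiline_blocks and the sample input are hypothetical, and it assumes that every JSON line starts with {" exactly as stated.

```ruby
# Group consecutive non-JSON lines into blocks. A line that starts with {"
# is treated as a separator: it ends the current block and is dropped.
def multiline_blocks(text)
  text.each_line(chomp: true)
      .chunk { |line| line.start_with?('{"') ? :_separator : true }
      .map { |_key, lines| lines.join("\n") }
end

# Hypothetical, shortened input in the same shape as the test string.
sample = "a.b.SomeException: boom\nat a.b.C.run(C.java:1)\n" \
         "{\"log_level\": \"WARN\"}\nSLF4J: some warning\n"

multiline_blocks(sample).each { |block| puts block, "--" }
```

Enumerable#chunk drops items whose key is :_separator, so JSON lines never end up inside a block.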
Demo links
The regex is available at: https://rubular.com/r/O26Wm6mc7z51re
The regex is available at: https://rubular.com/r/v6Q7iwZqmNDAAx
Test string:
java.lang.InterruptedException: Timeout while waiting for epoch from quorum
at org.apache.zookeeper.server.quorum.Leader.getEpochToPropose(Leader.java:1227)
at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:482)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1284)
... 19 more
{"log_timestamp": "2021-02-18T11:33:23.114+0000", "log_level": "WARN", "process_id": "zookeeper#2", "process_name": "zookeeper", "thread_id": 1, "thread_name": "QuorumPeer[myid=2](plain=/0.0.0.0:2181)(secure=disabled)", "action_name": "org.apache.zookeeper.server.quorum.QuorumPeer", "log_message": "PeerState set to LOOKING"}
{"log_timestamp": "2021-02-18T11:33:23.115+0000", "log_level": "WARN", "process_id": "zookeeper#2", "process_name": "zookeeper", "thread_id": 1, "thread_name": "WorkerSender[myid=2]", "action_name": "org.apache.zookeeper.server.quorum.QuorumPeer", "log_message": "Failed to resolve address: zk-2.zk-headless.intam.svc.cluster.local"}
java.net.UnknownHostException: zk-2.zk-headless.intam.svc.cluster.local
at java.net.InetAddress.getAllByName0(InetAddress.java:1281)
at java.net.InetAddress.getAllByName(InetAddress.java:1193)
at java.net.InetAddress.getAllByName(InetAddress.java:1127)
at java.net.InetAddress.getByName(InetAddress.java:1077)
at org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer.recreateSocketAddresses(QuorumPeer.java:194)
at org.apache.zookeeper.server.quorum.QuorumPeer.recreateSocketAddresses(QuorumPeer.java:764)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:699)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:618)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:477)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:456)
at java.lang.Thread.run(Thread.java:748)
{"log_timestamp": "2021-02-18T11:33:23.115+0000", "log_level": "WARN", "process_id": "zookeeper#2", "process_name": "zookeeper", "thread_id": 1, "thread_name": "WorkerSender[myid=2]", "action_name": "org.apache.zookeeper.server.quorum.QuorumPeer", "log_message": "Failed to resolve address: zk-2.zk-headless.sxc.svc.cluster.local"}
Expected match:
For demo 1: https://rubular.com/r/O26Wm6mc7z51re
java.lang.InterruptedException: Timeout while waiting for epoch from quorum
at org.apache.zookeeper.server.quorum.Leader.getEpochToPropose(Leader.java:1227)
at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:482)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1284)
... 19 more
For demo 2: https://rubular.com/r/v6Q7iwZqmNDAAx
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/spark/jars/logback-classic-1.2.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/spark/jars/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type
You could use a single pattern with a capture group and a backreference to get both parts:
^(SLF4J:|java\.lang\.InterruptedException:).*(?:\R(?!\1|{).*)*
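The pattern matches:
^
Start of the string (in Ruby, and so on Rubular, ^ anchors at the start of every line)
(SLF4J:|java\.lang\.InterruptedException:).*
Capture either alternative in group 1, then match the rest of the line
(?:
Non-capturing group
\R(?!\1|{).*
Match a newline and assert that the next line does not start with what was captured in group 1, or with {
)*
Close the group and optionally repeat it to match all following lines
See the regex matches for the first part and the second part.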
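The behaviour described above can be tried locally in Ruby. The following is a minimal sketch, not a fluentd configuration: the helper match_blocks and the constants are hypothetical names, the brace is escaped as \{ for safety, and the sample text is abbreviated from the test string in the question.

```ruby
# Pattern from the answer, with the backreference \1 in the lookahead.
BLOCK = /^(SLF4J:|java\.lang\.InterruptedException:).*(?:\R(?!\1|\{).*)*/

# Return every full match. With a capture group, String#scan would return
# only the group, so the block form collects Regexp.last_match[0] instead.
def match_blocks(text)
  blocks = []
  text.scan(BLOCK) { blocks << Regexp.last_match[0] }
  blocks
end

# Stack trace followed by a JSON line (abbreviated from the question).
TRACE = <<~'LOG'
  java.lang.InterruptedException: Timeout while waiting for epoch from quorum
  at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:482)
  ... 19 more
  {"log_level": "WARN", "log_message": "PeerState set to LOOKING"}
LOG

# Consecutive SLF4J lines (abbreviated from the question).
SLF4J_LINES = <<~'LOG'
  SLF4J: Class path contains multiple SLF4J bindings.
  SLF4J: Actual binding is of type
LOG

match_blocks(TRACE).each { |block| puts block, "--" }
puts match_blocks(SLF4J_LINES).size
```

For the trace, the single match covers the three lines before the JSON line. For the SLF4J sample, every line is a separate match, because the lookahead refuses to continue onto a line that starts with the text captured in group 1.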
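Note that the backslashes have to be doubled in Java:
// Compile with Pattern.MULTILINE so that ^ matches at the start of each line.
String regex = "^(SLF4J:|java\\.lang\\.InterruptedException:).*(?:\\R(?!\\1|\\{).*)*";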
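To avoid running across an SLF4J line or a different type of exception, where an exception is recognised as a dot-separated name at the start of the line: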
^(?:SLF4J:|\w+(?:\.\w+)+).*(?:\R(?!(?:SLF4J:|\w+(?:\.\w+)+)|{).*)*
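As a rough check of the generalized pattern, the sketch below runs it in Ruby over two stack traces separated by JSON lines. The constant names are hypothetical, the brace is escaped as \{, and the sample is abbreviated from the test string in the question.

```ruby
# Generalized pattern from the answer. It contains only non-capturing
# groups, so String#scan returns the full matches directly.
ANY_BLOCK = /^(?:SLF4J:|\w+(?:\.\w+)+).*(?:\R(?!(?:SLF4J:|\w+(?:\.\w+)+)|\{).*)*/

# Two different exceptions, each followed by a JSON line (abbreviated).
SAMPLE = <<~'LOG'
  java.lang.InterruptedException: Timeout while waiting for epoch from quorum
  at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:482)
  ... 19 more
  {"log_level": "WARN", "log_message": "PeerState set to LOOKING"}
  java.net.UnknownHostException: zk-2.zk-headless.intam.svc.cluster.local
  at java.net.InetAddress.getByName(InetAddress.java:1077)
  at java.lang.Thread.run(Thread.java:748)
  {"log_level": "WARN", "log_message": "Failed to resolve address"}
LOG

blocks = SAMPLE.scan(ANY_BLOCK)
blocks.each { |block| puts block, "--" }
```

Each exception comes back as its own block of three lines, and neither JSON line is included. The "at ..." lines continue a block because "at" is followed by a space rather than a dot, so they do not look like a dot-separated name at the start of the line.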