How to ignore broken JSON lines in jq?

When I process a log file with jq, some lines may be broken, so jq throws an error and stops processing.

For example, here is an intact log:

{"level":"debug","time":"2021-09-24T19:42:47.140+0800","message":"sent send binary to ws server1","pid":41491,"cid":"32likw","num":1,"count":5120}
{"level":"debug","time":"2021-09-24T19:42:47.305+0800","message":"sent send binary to ws server2","pid":41491,"cid":"32likw","num":1,"count":5120}
{"level":"debug","time":"2021-09-24T19:42:47.469+0800","message":"sent send binary to ws server3","pid":41491,"cid":"32likw","num":1,"count":5120}
{"level":"debug","time":"2021-09-24T19:42:47.499+0800","message":"sent send binary to ws server4","pid":41491,"cid":"32likw","num":1,"count":5120}
{"level":"debug","time":"2021-09-24T19:42:47.581+0800","message":"sent send binary to ws server5","pid":41491,"cid":"32likw","num":1,"count":5120}

jq handles it fine:

< snippet1.json jq -C -r '.message'
sent send binary to ws server1
sent send binary to ws server2
sent send binary to ws server3
sent send binary to ws server4
sent send binary to ws server5

Here is a broken one (the last part of line 3 is missing):

{"level":"debug","time":"2021-09-24T19:42:47.140+0800","message":"sent send binary to ws server1","pid":41491,"cid":"32likw","num":1,"count":5120}
{"level":"debug","time":"2021-09-24T19:42:47.305+0800","message":"sent send binary to ws server2","pid":41491,"cid":"32likw","num":1,"count":5120}
{"level":"debug","time":"2021-09-24T19:42:47.469+0800","message":"sent send binary to ws server3","pi
{"level":"debug","time":"2021-09-24T19:42:47.499+0800","message":"sent send binary to ws server4","pid":41491,"cid":"32likw","num":1,"count":5120}
{"level":"debug","time":"2021-09-24T19:42:47.581+0800","message":"sent send binary to ws server5","pid":41491,"cid":"32likw","num":1,"count":5120}

jq stops at the broken line:

< snippet2.json jq -C -r '.message'
sent send binary to ws server1
sent send binary to ws server2
parse error: Invalid string: control characters from U+0000 through U+001F must be escaped at line 4, column 2

I would like jq to ignore line 3 and continue, like this:

< snippet2.json jq -C -r '.message'
sent send binary to ws server1
sent send binary to ws server2
sent send binary to ws server4
sent send binary to ws server5

I tried the -R option that has been suggested elsewhere, but it does not help in this case:

< snippet2.json jq -C -R -r '.message'
jq: error (at <stdin>:1): Cannot index string with string "message"
jq: error (at <stdin>:2): Cannot index string with string "message"
jq: error (at <stdin>:3): Cannot index string with string "message"
jq: error (at <stdin>:4): Cannot index string with string "message"
jq: error (at <stdin>:5): Cannot index string with string "message"

Is there a way to ignore, skip, or suppress such errors and still get the rest of the results?

To skip the broken lines, you can use:

jq -Rr 'fromjson? | .message'
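As a minimal sketch of why this works: -R makes jq read each input line as a raw string instead of parsing it, fromjson then parses that string, and the trailing ? suppresses the error for any line that fails to parse. The three-line sample below (with a truncated second line) and the variable names are made up for illustration and are not the log from the question.

```shell
# Hypothetical sample: three log lines, the second one truncated.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
{"message":"first","num":1}
{"message":"second","nu
{"message":"third","num":3}
EOF

# -R: read every line as a raw string (no JSON parsing on input).
# fromjson?: parse the string; lines that fail to parse produce no output.
out=$(jq -Rr 'fromjson? | .message' "$tmp")
printf '%s\n' "$out"

rm -f "$tmp"
```

With this sample, only the messages of the two intact lines are printed, and nothing is written to stderr.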

If you want to do something else with the broken lines, you could start with something like this:

jq -R '. as $line | try fromjson catch $line'
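One possible use of the try/catch form, sketched under the assumption that you want to inspect the broken lines rather than discard them: drop every line that parses and print only the raw text of the lines that do not. The sample data and variable names are hypothetical.

```shell
# Hypothetical sample: three log lines, the second one truncated.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
{"message":"first","num":1}
{"message":"second","nu
{"message":"third","num":3}
EOF

# For a valid line, (fromjson | empty) succeeds and emits nothing.
# For a broken line, fromjson raises an error and catch emits the raw line.
broken=$(jq -Rr '. as $line | try (fromjson | empty) catch $line' "$tmp")
printf '%s\n' "$broken"

rm -f "$tmp"
```

The output could be redirected to a separate file so that the damaged records can be examined or repaired later.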

For other options, see "Is there a way to have jq keep going after it hits an error in the input file? Can jq handle broken JSON?" in the jq FAQ.

Some more explanation of peak's answer. (Credit goes to peak.)

Solution #1:

❯ cat bad.json | jq -r -R 'fromjson? | .message'
sent send binary to ws server1
sent send binary to ws server2
sent send binary to ws server4
sent send binary to ws server5

Solution #2:

❯ cat bad.json | jq -r -R '. as $line | try fromjson catch $line | .message'
sent send binary to ws server1
sent send binary to ws server2
jq: error (at <stdin>:3): Cannot index string with string "message"
sent send binary to ws server4
sent send binary to ws server5

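jq still prints an error for the broken line, because the catch branch passes the raw line through as a string and .message cannot index a string. The error goes to stderr, though, so you can redirect it: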

❯ cat bad.json | jq -r -R '. as $line | try fromjson catch $line | .message' 2>/dev/null
sent send binary to ws server1
sent send binary to ws server2
sent send binary to ws server4
sent send binary to ws server5
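Redirecting stderr also hides any other error jq might report. As an alternative sketch (not part of the original answers), the builtin objects filter can be used to keep only the values that parsed into JSON objects, so the raw string from the catch branch never reaches .message. The sample data and variable names are hypothetical.

```shell
# Hypothetical sample: three log lines, the second one truncated.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
{"message":"first","num":1}
{"message":"second","nu
{"message":"third","num":3}
EOF

# objects is shorthand for select(type == "object"): the raw string emitted
# by catch for the broken line is filtered out before .message is applied.
# 2>&1 merges stderr into the capture to show that no error is printed.
out=$(jq -Rr '. as $line | try fromjson catch $line | objects | .message' "$tmp" 2>&1)
printf '%s\n' "$out"

rm -f "$tmp"
```

Note that this also drops valid lines that are not objects (a bare number or string, for example), which is usually what you want for structured logs.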

It is worth noting that -R and -r can be used together. (Thanks, @peak!)