Bash: Reading Output to a String with Special Characters

I am using TShark to read the TCP streams of a PCAP into a formatted file. My code:

#!/bin/bash
OUT="*/temp/Temp.txt"
NEW="\"REQ:"
i=0
echo "Generating conversations..."
echo ""  > $OUT
while [ "$COUNT" != 1 ]
do
    BLOCK="$(tshark -r */browser.pcap -q -z follow,tcp,ascii,$i)"
    SUB=$(echo "$BLOCK" | sed -n '5p')
    PORT=${SUB##*:}
    BLOCK="${BLOCK//$'\t'/\"RES:}"
    BLOCK=$(echo "$BLOCK" | tail -n +6)
    BLOCK=$(echo "$BLOCK" | head -n -1)
    COUNT=$(echo "$BLOCK" | wc -l)
    BLOCK=$(echo "$BLOCK" | awk '{print $j"\""}')
    j=1
    while [ $j -lt $(($COUNT+2)) ]
    do
        CHECK=$(echo "$BLOCK" | sed $j'q;d')
        PREF=${CHECK:0:5}
        if [ "$PREF" != "\"RES:" ]; then
            CHECK=$NEW$CHECK
            BLOCK=$(echo "$BLOCK" | sed $j's/.*/'$CHECK'/')
        fi
        j=$(($j+1))
    done
    if [ "$COUNT" != 1 ]; then
        echo ""  >> $OUT
        echo "$" >> $OUT
        echo "tag = \"gen."$i"\"" >> $OUT
        echo "port = \""$PORT"\"" >> $OUT
        echo "base = \"TCP\"" >> $OUT
        echo "payloads:" >> $OUT
        echo "$BLOCK" >> $OUT
        echo "Generated conversation "$i
    fi
    i=$(($i+1))
done
echo "Generation complete!"

When I run this, I get the following error every time a conversation is read:

> sed: -e expression #1, char 18: unterminated `s' command

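I think the problem lies in the call to TShark on line 9. Originally I used the "raw" argument, which outputs raw hex data. That worked and produced correct output. However, my task requires ASCII output. Changing "raw" to "ascii" (both are recognized by TShark) causes the error above. I believe this is because the ASCII data read from the packets contains special characters; a short snippet of the data that line 9 produces on the command line is: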

..7.<.......Y.|.$.......2...W...v.'#

My question: are the special characters in the ASCII data I am parsing causing the sed error? If so, how can I make bash ignore them? Thanks!

Edit: Ultimately I am trying to take the output of this TShark command, which looks like this...

===================================================================
Follow: tcp,raw
Filter: tcp.stream eq 4
Node 0: 10.211.55.3:58733
Node 1: 157.127.239.146:80
47455420687474703a2f2f73656d696e617270726f6a656374732e6f72672f6373732e7068703f7374796c6573686565743d393620485454502f312e310d0a486f73743a2073656d696e617270726f6a656374732e6f72670d0a557365722d4167656e743a204d6f7a696c6c612f352e3020285831313b204c696e7578207838365f36343b2072763a33382e3029204765636b6f2f32303130303130312046697265666f782f33382e300d0a4163636570743a20746578742f6373732c2a2f2a3b713d302e310d0a4163636570742d4c616e67756167653a20656e2d55532c656e3b713d302e350d0a4163636570742d456e636f64696e673a20677a69702c206465666c6174650d0a526566657265723a20687474703a2f2f73656d696e617270726f6a656374732e6f72672f632f74736861726b2d666f6c6c6f772d7463702d73747265616d0d0a436f6f6b69653a205f5f6366647569643d646564613432383039663566623634356461663239333963366235336565653764313433373734383236323b206d7962625b6c61737476697369745d3d313433373734383333353b206d7962625b6c6173746163746976655d3d313433373734383333353b207369643d31663739303463373761383761656234363537306131636161316462336161310d0a436f6e6e656374696f6e3a206b6565702d616c6976650d0a0d0a
    485454502f312e3120323030204f4b0d0a446174653a204672692c203234204a756c20323031352031343a33313a303420474d540d0a436f6e74656e742d547970653a20746578742f6373730d0a582d506f77657265642d42793a205048502f352e342e31360d0a5365727665723a20636c6f7564666c6172652d6e67696e780d0a43462d5241593a20323062303533396434326436313365332d4c41580d0a436f6e74656e742d456e636f64696e673a20677a69700d0a436f6e74656e742d4c656e6774683a203134320d0a4167653a20300d0a5669613a20312e31206e657070737730390d0a0d0a1f8b08000000000000036c8cbd0a03211084ebf52916ac13f2db689bcb6b04bd15919caeac060e42de3d981469325f37df305bcf4ee896436b2e067c2af06ebe47e14721837aba0eac8299171683faf88955e05928c8a6733578a82b365e12a1be9c063fefb977ceff27d511a5120d9eeb6a1564273195efe37e37aa970278030000ffff0300cc348afaa1000000
47455420687474703a2f2f7777772e676f6f676c652d616e616c79746963732e636f6d2f616e616c79746963732e6a7320485454502f312e310d0a486f73743a207777772e676f6f676c652d616e616c79746963732e636f6d0d0a557365722d4167656e743a204d6f7a696c6c612f352e3020285831313b204c696e7578207838365f36343b2072763a33382e3029204765636b6f2f32303130303130312046697265666f782f33382e300d0a4163636570743a202a2f2a0d0a4163636570742d4c616e67756167653a20656e2d55532c656e3b713d302e350d0a4163636570742d456e636f64696e673a20677a69702c206465666c6174650d0a526566657265723a20687474703a2f2f73656d696e617270726f6a656374732e6f72672f632f74736861726b2d666f6c6c6f772d7463702d73747265616d0d0a436f6e6e656374696f6e3a206b6565702d616c6976650d0a49662d4d6f6469666965642d53696e63653a205468752c203039204a756c20323031352032333a35303a353620474d540d0a0d0a
    485454502f312e3120333034204e6f74204d6f6469666965640d0a446174653a204672692c203234204a756c20323031352031343a33303a353520474d540d0a457870697265733a204672692c203234204a756c20323031352031353a35313a343120474d540d0a43616368652d436f6e74726f6c3a207075626c69632c206d61782d6167653d373230300d0a566172793a204163636570742d456e636f64696e670d0a436f6e6e656374696f6e3a20636c6f73650d0a5669613a20312e31206e657070737730390d0a0d0a
===================================================================

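...and convert it into a custom format that is read by a program. The output above is the raw hex data format, which works. The custom format for the corresponding conversation looks like this: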

$
tag = "gen.4"
port = "58733"
base = "TCP"
payloads:
"REQ:47455420687474703a2f2f73656d696e617270726f6a656374732e6f72672f6373732e7068703f7374796c6573686565743d393620485454502f312e310d0a486f73743a2073656d696e617270726f6a656374732e6f72670d0a557365722d4167656e743a204d6f7a696c6c612f352e3020285831313b204c696e7578207838365f36343b2072763a33382e3029204765636b6f2f32303130303130312046697265666f782f33382e300d0a4163636570743a20746578742f6373732c2a2f2a3b713d302e310d0a4163636570742d4c616e67756167653a20656e2d55532c656e3b713d302e350d0a4163636570742d456e636f64696e673a20677a69702c206465666c6174650d0a526566657265723a20687474703a2f2f73656d696e617270726f6a656374732e6f72672f632f74736861726b2d666f6c6c6f772d7463702d73747265616d0d0a436f6f6b69653a205f5f6366647569643d646564613432383039663566623634356461663239333963366235336565653764313433373734383236323b206d7962625b6c61737476697369745d3d313433373734383333353b206d7962625b6c6173746163746976655d3d313433373734383333353b207369643d31663739303463373761383761656234363537306131636161316462336161310d0a436f6e6e656374696f6e3a206b6565702d616c6976650d0a0d0a"
"RES:485454502f312e3120323030204f4b0d0a446174653a204672692c203234204a756c20323031352031343a33313a303420474d540d0a436f6e74656e742d547970653a20746578742f6373730d0a582d506f77657265642d42793a205048502f352e342e31360d0a5365727665723a20636c6f7564666c6172652d6e67696e780d0a43462d5241593a20323062303533396434326436313365332d4c41580d0a436f6e74656e742d456e636f64696e673a20677a69700d0a436f6e74656e742d4c656e6774683a203134320d0a4167653a20300d0a5669613a20312e31206e657070737730390d0a0d0a1f8b08000000000000036c8cbd0a03211084ebf52916ac13f2db689bcb6b04bd15919caeac060e42de3d981469325f37df305bcf4ee896436b2e067c2af06ebe47e14721837aba0eac8299171683faf88955e05928c8a6733578a82b365e12a1be9c063fefb977ceff27d511a5120d9eeb6a1564273195efe37e37aa970278030000ffff0300cc348afaa1000000"
"REQ:47455420687474703a2f2f7777772e676f6f676c652d616e616c79746963732e636f6d2f616e616c79746963732e6a7320485454502f312e310d0a486f73743a207777772e676f6f676c652d616e616c79746963732e636f6d0d0a557365722d4167656e743a204d6f7a696c6c612f352e3020285831313b204c696e7578207838365f36343b2072763a33382e3029204765636b6f2f32303130303130312046697265666f782f33382e300d0a4163636570743a202a2f2a0d0a4163636570742d4c616e67756167653a20656e2d55532c656e3b713d302e350d0a4163636570742d456e636f64696e673a20677a69702c206465666c6174650d0a526566657265723a20687474703a2f2f73656d696e617270726f6a656374732e6f72672f632f74736861726b2d666f6c6c6f772d7463702d73747265616d0d0a436f6e6e656374696f6e3a206b6565702d616c6976650d0a49662d4d6f6469666965642d53696e63653a205468752c203039204a756c20323031352032333a35303a353620474d540d0a0d0a"
"RES:485454502f312e3120333034204e6f74204d6f6469666965640d0a446174653a204672692c203234204a756c20323031352031343a33303a353520474d540d0a457870697265733a204672692c203234204a756c20323031352031353a35313a343120474d540d0a43616368652d436f6e74726f6c3a207075626c69632c206d61782d6167653d373230300d0a566172793a204163636570742d456e636f64696e670d0a436f6e6e656374696f6e3a20636c6f73650d0a5669613a20312e31206e657070737730390d0a0d0a"

You can tell bash not to interpret metacharacters by quoting the variable expansion:

sed $j's/.*/'"$CHECK"'/'

In fact, there is no reason to use single quotes above, so you can double-quote the entire command argument:

sed "${j}s/.*/$CHECK/"

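However, neither of the above tells sed to avoid interpreting special characters in the replacement part of the s command, so if $CHECK contains a /, the substitution is terminated prematurely. (An & or a \ in $CHECK would likewise be interpreted by sed rather than inserted literally.)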
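If you want to stay with sed, one option is to escape the replacement text before interpolating it. The following is a minimal sketch, not something from the original script: the helper name escape_sed_replacement is made up for illustration, and it assumes the text is a single line (an embedded newline would need separate handling).

```bash
#!/bin/bash
# Hypothetical helper: backslash-escape the three characters that are special
# in the replacement half of s/.../.../ when "/" is the delimiter:
# the backslash, the slash, and the ampersand.
escape_sed_replacement() {
    # "|" is used as the delimiter here so that "/" can sit in the bracket
    # expression without being mistaken for the end of the pattern.
    printf '%s' "$1" | sed -e 's|[\\/&]|\\&|g'
}

# Demo: replace line 2 of a three-line block with text full of special characters.
block=$(printf '%s\n' one two three)
check='GET /a&b\c'
j=2
escaped=$(escape_sed_replacement "$check")
block=$(printf '%s\n' "$block" | sed "${j}s/.*/${escaped}/")
printf '%s\n' "$block"
```

The escaping step has to run on the replacement only; the pattern side of an s command has a different (and larger) set of special characters.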

So the real question is whether there is a better way to accomplish this:

BLOCK=$(echo "$BLOCK" | sed $j's/.*/'$CHECK'/')

Evidently, the goal is to replace line $j of the value of $BLOCK with the value of $CHECK. One way to do that is with awk:

BLOCK="$(awk -v n="$j" -v repl="$CHECK" 'NR==n{print repl;next}1' <<<"$BLOCK")"
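One caveat with -v is that awk processes backslash escape sequences in the assigned value, so a replacement containing something like \n would be altered. A possible variation, sketched here under the assumption that this matters for your data, passes the text through the environment and reads it from ENVIRON instead. The function name replace_line is invented for this example.

```bash
#!/bin/bash
# Hypothetical helper: replace line $1 of standard input with the literal
# text $2. The text travels through the environment, so awk does not
# interpret backslash sequences in it the way it does for -v assignments.
replace_line() {
    repl="$2" awk -v n="$1" 'NR == n { print ENVIRON["repl"]; next } 1'
}

# Demo: the replacement contains /, & and a literal backslash-n.
block=$(printf '%s\n' one two three)
check='a/b&c\n'
block=$(replace_line 2 "$check" <<<"$block")
printf '%s\n' "$block"
```

Because awk prints the value rather than using it as a substitution, none of /, & or \ needs escaping here.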

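Notes:

  1. Although I didn't fix it in my examples, using all-uppercase names for shell variables is very bad style. By convention, all-uppercase shell variables are reserved for well-known exported variables used by bash or by system utilities (e.g. $PATH, $IFS, $TERM, etc.). Your own variables should be lowercase to avoid collisions.

  2. The complete loop from which that command was excerpted could probably be implemented more efficiently and more cleanly (and more understandably) in awk. Based on the sample output, the following might work: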

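    echo "Generating conversations..."
    i=0
    while
        tshark -r */browser.pcap -q -z follow,tcp,ascii,$i |
        awk -v idx=$i '
          # Line 4 is "Node 0: address:port"; keep the text after the last colon.
          NR==4 { n = split($0, a, /:/); port = a[n]; }
          NR<6  { next; }
          # Closing separator: fail (ending the loop) if no payload was seen.
          /^=========/ { exit port != 0; }
          # First payload line: emit the conversation header once.
          port  { print "$"
                  printf "tag = \"gen.%d\"\n", idx
                  printf "port = \"%s\"\n", port
                  print "base = \"TCP\""
                  print "payloads:"
                  port = 0
                }
          # Tab-indented lines are responses; everything else is a request.
          /^\t/ { printf "\"RES:%s\"\n", substr($0, 2); next; }
                { printf "\"REQ:%s\"\n", $0; }
        ' >> $OUT
    do
        echo "Generated conversation $i"
        i=$((i+1))
    done
    echo "Generation complete!"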
    

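    I haven't tested it. It is quite possibly buggy. I didn't understand the termination condition, so I just guessed. I'm not sure whether you really want to extract the port number from line 5 (as in your code) or from line 4 (as in your example).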
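Since the loop is untested, one way to check the awk part without a capture file is to feed it a canned sample shaped like the output in the question. This is only a sketch: the function name format_stream and the shortened payloads are made up, and the empty-stream case assumes TShark still prints node lines with a non-zero port, which I have not verified.

```bash
#!/bin/bash
# Hypothetical wrapper around the awk program so it can be fed canned input.
# $1 is the stream index; the "follow" output is read from standard input.
format_stream() {
    awk -v idx="$1" '
      NR==4 { n = split($0, a, /:/); port = a[n] }
      NR<6  { next }
      /^=========/ { exit port != 0 }
      port  { print "$"
              printf "tag = \"gen.%d\"\n", idx
              printf "port = \"%s\"\n", port
              print "base = \"TCP\""
              print "payloads:"
              port = 0
            }
      /^\t/ { printf "\"RES:%s\"\n", substr($0, 2); next }
            { printf "\"REQ:%s\"\n", $0 }
    '
}

# Canned sample: header lines copied from the question, payloads shortened.
sep='==================================================================='
sample=$(printf '%s\n' "$sep" 'Follow: tcp,raw' 'Filter: tcp.stream eq 4' \
    'Node 0: 10.211.55.3:58733' 'Node 1: 157.127.239.146:80' \
    '474554' $'\t485454' "$sep")

format_stream 4 <<<"$sample"
```

With this logic the function returns 0 when at least one payload line was formatted and 1 when the closing separator arrives before any payload, which is what the while loop uses to decide when to stop.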