Find CC attack IPs from log files using a shell script

I have a historical web log file like this:

157.15.14.19 - -  06 Sep 2016 09:13:10 +0300  "GET /index.php?id=1 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.15.14.19 - -  06 Sep 2016 09:13:11 +0300  "GET /index.php?id=2 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.15.14.19 - -  06 Sep 2016 09:13:12 +0300  "GET /index.php?id=3 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.15.14.19 - -  06 Sep 2016 09:14:13 +0300  "GET /index.php?id=4 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.15.14.19 - -  06 Sep 2016 09:14:14 +0300  "GET /index.php?id=5 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.15.14.19 - -  06 Sep 2016 09:15:15 +0300  "GET /index.php?id=6 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.15.14.19 - -  06 Sep 2016 09:15:16 +0300  "GET /index.php?id=7 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.15.14.19 - -  06 Sep 2016 09:15:17 +0300  "GET /index.php?id=8 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.15.14.19 - -  06 Sep 2016 09:16:10 +0300  "GET /index.php?id=9 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.15.14.19 - -  06 Sep 2016 09:16:10 +0300  "GET /index.php?id=10 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
8.8.8.8 - -  06 Sep 2016 09:17:10 +0300  "GET /index.php?id=11 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
9.9.9.9 - -  06 Sep 2016 09:17:10 +0300  "GET /index.php?id=12 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.15.14.19 - -  06 Sep 2016 09:18:10 +0300  "GET /index.php?id=13 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.15.14.19 - -  06 Sep 2016 09:19:10 +0300  "GET /index.php?id=14 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.15.14.19 - -  06 Sep 2016 09:19:10 +0300  "GET /index.php?id=15 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.15.14.19 - -  06 Sep 2016 09:20:10 +0300  "GET /index.php?id=15 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
123.123.123.123 - -  06 Sep 2016 09:21:10 +0300  "GET /index.php?id=15 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
157.15.14.19 - -  06 Sep 2016 09:22:10 +0300  "GET /index.php?id=15 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"

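I want to find the IPs behind a CC attack, and the only data I have is yesterday's web log file.

For this example, I define a CC attack as follows:

If the same remote IP makes more than 5 requests within a 5-minute interval, that IP is treated as a CC attacker and should be printed.

The log file covers a whole day. I want to use only bash scripting, with tools such as awk, cat, gawk, sed and so on.

Any advice is welcome, thanks a lot.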


Update:

I tried writing a test script (flagging an IP with more than 5 requests in the same 2-minute bucket):

yy@yy:/tmp/tb$ cat 5.txt | awk '{print $7,$1}' | awk -F: '{print $1*60+int($2/2),$0}' | sort | uniq -c -f2 | awk '{if($1>5){print $0}}'
     10 546 09:13:10 157.15.14.19

However, this code is poor and needs to be optimized.

awk -v Interval=5 -v Trig=5 -F '[[:blank:]]+|:' '
        {
        # using format log
        #  157.15.14.19 - -  06 Sep 2016 09:13:10 +0300  "GET /index.php?id=1 HTTP/1.1" 200 16977 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
        # $1            2 3  4  5   6    7  8  9  10      11 ...

        # minute of the day for this line ($7 = hour, $8 = minute)
        ThisTime = $7 * 60 + $8
        # if new cycle (so this line is not in the current cycle)
        if ( NR == 1 || ThisTime >= ( LastTic + Interval ) ) {
          # check and print last cycle hit
          for( IP in IPCounts) if ( IPCounts[ IP] > Trig) print LastTime " " IP " : " IPCounts[ IP]

          # reset reference: empty the counters and align the cycle start on a multiple of Interval
          split( "", IPCounts)
          LastTime = $4 " " $5 " " $6 " " $7 ":" sprintf( "%02d", ( $8 - ( $8 % Interval) )) ":00"
          LastTic = $7 * 60 + ( $8 - ( $8 % Interval) )
          }
        # add this line to new cycle
        IPCounts[ $1]++
        }

        END {
          # print last cycle
          for( IP in IPCounts) if ( IPCounts[ IP] > Trig) print LastTime " " IP " : " IPCounts[ IP]
          }
      ' YourFile


# for format of log
#  op.g.cc 124.145.36.121 - - [21/Nov/2016:03:38:02 +0800] ==> 172.11.0.238:80 "POST ...
# $1       2              3 4 5            6  7  8  9      10   11 ...  

# change:
#  $7 by $6, $8 by $7
#  LastTime = $5 ":" $6 ":" sprintf( "%02d", ( $7 - ( $7 % Interval) )) ":00 +0800]"
#  IPCounts[ $2]++
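
For the second log format, the substitutions listed above could be applied roughly as follows. This is only a sketch: the function name find_cc_vhost and its optional arguments are made up for illustration, and it assumes the client IP is always the second field and that the log is in chronological order.

```shell
#!/bin/sh
# find_cc_vhost LOGFILE [INTERVAL_MINUTES] [TRIGGER]
# Hypothetical helper: same fixed-cycle counting as the script above, with
# field numbers shifted for lines such as
#   op.g.cc 124.145.36.121 - - [21/Nov/2016:03:38:02 +0800] ==> ...
find_cc_vhost() {
    awk -v Interval="${2:-5}" -v Trig="${3:-5}" -F '[[:blank:]]+|:' '
        {
            # $2 = client IP, $5 = "[dd/Mon/yyyy", $6 = hour, $7 = minute, $9 = zone
            ThisTime = $6 * 60 + $7
            if (NR == 1 || ThisTime >= LastTic + Interval) {
                # report the cycle that just ended, then start a new one
                for (IP in IPCounts)
                    if (IPCounts[IP] > Trig) print LastTime " " IP " : " IPCounts[IP]
                split("", IPCounts)
                Start = $7 - ($7 % Interval)
                LastTime = $5 ":" $6 ":" sprintf("%02d", Start) ":00 " $9
                LastTic = $6 * 60 + Start
            }
            IPCounts[$2]++
        }
        END {
            # report the last cycle
            for (IP in IPCounts)
                if (IPCounts[IP] > Trig) print LastTime " " IP " : " IPCounts[IP]
        }
    ' "$1"
}
```

Calling find_cc_vhost access.log 5 5 (the file name is a placeholder) prints one line per cycle and IP above the trigger. The zone is taken from the log line ($9) instead of being hard-coded.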

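Notes:

  • This is quick and dirty on the time handling (you mentioned one log per day). If you need more precision, use gawk's mktime to work from a real epoch time reference.
  • Trig is the count trigger level (5 hits) and Interval is the cycle length (5 minutes).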
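
The epoch-time idea from the first note might look like the sketch below. It makes several assumptions: awk provides mktime and strftime (they are gawk extensions, so plain POSIX awk is not enough), the log uses the first format shown in the question, and find_cc_epoch is a hypothetical name. Unlike the script above, it keeps one counter per (bucket, IP) pair, so it does not rely on the log being sorted and keeps working across midnight.

```shell
#!/bin/sh
# find_cc_epoch LOGFILE [INTERVAL_MINUTES] [TRIGGER]
# Hypothetical helper: buckets requests by real epoch time.
find_cc_epoch() {
    # TZ=UTC makes mktime treat the log's wall-clock time as a zone without
    # daylight-saving jumps; the +0300 offset in the log is ignored.
    TZ=UTC awk -v Interval="${2:-5}" -v Trig="${3:-5}" -F '[[:blank:]]+|:' '
        {
            # $4=day $5=month name $6=year $7=hour $8=minute $9=second
            Month = (index("JanFebMarAprMayJunJulAugSepOctNovDec", $5) + 2) / 3
            Epoch = mktime($6 " " Month " " $4 " " $7 " " $8 " " $9)
            # start (in epoch seconds) of the bucket this request falls into
            Bucket = Epoch - (Epoch % (Interval * 60))
            Count[Bucket " " $1]++
        }
        END {
            for (Key in Count) {
                if (Count[Key] > Trig) {
                    split(Key, Part, " ")
                    print strftime("%Y-%m-%d %H:%M:%S", Part[1]), Part[2], ":", Count[Key]
                }
            }
        }
    ' "$1" | sort
}
```

On the sample log from the question, find_cc_epoch with the default 5-minute interval and trigger of 5 should report 157.15.14.19 once, for the bucket starting at 09:15:00 with 8 requests; the 09:10 bucket holds exactly 5 requests and so stays below the "more than 5" rule.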