Python get data from Elasticsearch

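I have ModSecurity nginx logs in JSON format, and I have already shipped them to Elasticsearch. Now I want to write a Python script that fetches the data from Elasticsearch and uses it to trigger a Zabbix monitor.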

But I'm stuck on how to do this. Here is what my data looks like once it is in Elasticsearch:

curl -X GET "localhost:9200/modsecurity_*/_search?size=1&pretty"

Response:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6850,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
  {
    "_index" : "modsecurity_20200316",
    "_type" : "modsecurity",
    "_id" : "A-1n4nABQLLqq2S26hS0",
    "_score" : 1.0,
    "_source" : {
      "client_ip" : "127.0.0.1",
      "producer" : {
        "connector" : "ModSecurity-nginx v1.0.1",
        "components" : [
          "OWASP_CRS/3.0.2\""
        ],
        "modsecurity" : "ModSecurity v3.0.4 (Linux)",
        "secrules_engine" : "Enabled"
      },
      "host_port" : 80,
      "request" : {
        "body" : "<!--#include virtual=\"/index.jsp\"-->",
        "http_version" : 1.1,
        "headers" : {
          "content-length" : "36",
          "host" : "localhost",
          "user-agent" : "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36",
          "connection" : "Keep-Alive",
          "content-type" : "application/x-www-form-urlencoded"
        },
        "method" : "GET",
        "uri" : "/Excel/"
      },
      "server_id" : "c46580787c35fc368143d376c8f037e2e63514e4",
      "host_ip" : "127.0.0.1",
      "client_port" : 48100,
      "unixts" : 1584346386925,
      "msg" : {
        "severity" : [
          "2",
          "2",
          "2"
        ],
        "tags" : [
          "application-multi",
          "language-multi",
          "platform-multi",
          "attack-protocol",
          "OWASP_CRS/PROTOCOL_VIOLATION/INVALID_HREQ",
          "CAPEC-272",
          "attack-xss",
          "OWASP_CRS/WEB_ATTACK/XSS",
          "WASCTC/WASC-8",
          "WASCTC/WASC-22",
          "OWASP_TOP_10/A3",
          "OWASP_AppSensor/IE1",
          "CAPEC-242",
          "attack-generic"
        ],
        "ruleid" : [
          "920170",
          "941180",
          "949110"
        ],
        "file" : [
          "/usr/local/owasp-modsecurity-crs-3.0.2/rules/REQUEST-920-PROTOCOL-ENFORCEMENT.conf",
          "/usr/local/owasp-modsecurity-crs-3.0.2/rules/REQUEST-941-APPLICATION-ATTACK-XSS.conf",
          "/usr/local/owasp-modsecurity-crs-3.0.2/rules/REQUEST-949-BLOCKING-EVALUATION.conf"
        ],
        "linenumber" : [
          "242",
          "276",
          "44"
        ],
        "message" : [
          "GET or HEAD Request with Body Content.",
          "Node-Validator Blacklist Keywords",
          "Inbound Anomaly Score Exceeded (Total Score: 15)"
        ],
        "data" : [
          "GET",
          "Matched Data: --> found within ARGS:<!--#include virtual: \"/index.jsp\"-->",
          ""
        ],
        "match" : [
          "Matched \"Operator `Rx' with parameter `^0?$' against variable `REQUEST_HEADERS:Content-Length' (Value: `36' )",
          "Matched \"Operator `Pm' with parameter `document.cookie document.write .parentnode .innerhtml window.location -moz-binding <!-- --> <![cdata[' against variable `ARGS:<!--#include virtual' (Value: `\"/index.jsp\"-->' )",
          "Matched \"Operator `Ge' with parameter `5' against variable `TX:ANOMALY_SCORE' (Value: `15' )"
        ]
      },
      "time_stamp" : "Mon Mar 16 15:13:06 2020",
      "response" : {
        "body" : "<html>\r\n<head><title>403 Forbidden</title></head>\r\n<body>\r\n<center><h1>403 Forbidden</h1></center>\r\n<hr><center>nginx/1.17.8</center>\r\n</body>\r\n</html>\r\n<!-- a padding to disable MSIE and Chrome friendly error page -->\r\n<!-- a padding to disable MSIE and Chrome friendly error page -->\r\n<!-- a padding to disable MSIE and Chrome friendly error page -->\r\n<!-- a padding to disable MSIE and Chrome friendly error page -->\r\n<!-- a padding to disable MSIE and Chrome friendly error page -->\r\n<!-- a padding to disable MSIE and Chrome friendly error page -->\r\n",
        "headers" : {
          "date" : "Mon, 16 Mar 2020 08:13:06 GMT",
          "content-length" : "555",
          "content-type" : "text/html",
          "connection" : "keep-alive",
          "server" : "nginx/1.17.8"
        },
        "http_code" : 403
      },
      "unique_id" : "158434638692.527282"
       }
     }
   ]
 }
}

I want to get the contents of the tags field:

"tags" : [
      "application-multi",
      "language-multi",
      "platform-multi",
      "attack-protocol",
      "OWASP_CRS/PROTOCOL_VIOLATION/INVALID_HREQ",
      "CAPEC-272",
      "attack-xss",
      "OWASP_CRS/WEB_ATTACK/XSS",
      "WASCTC/WASC-8",
      "WASCTC/WASC-22",
      "OWASP_TOP_10/A3",
      "OWASP_AppSensor/IE1",
      "CAPEC-242",
      "attack-generic"
    ],

A simple way to do it in Python with the json module:

import json

If your result is returned as a dict, index into it directly (no json round trip is needed):

print(YOUR_RESULT["hits"]["hits"][0]["_source"]["msg"]["tags"])

If your result is returned as a string, parse it first:

print(json.loads(YOUR_RESULT)["hits"]["hits"][0]["_source"]["msg"]["tags"])

This gets the tags from the first hit.
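
If you need the tags from every hit rather than only the first one, something along these lines may work. This is only a sketch: `extract_tags` is a hypothetical helper name, and `sample` is a trimmed-down stand-in shaped like the response in the question, not real output. The per-tag count at the end is one possible numeric value to hand to a monitor such as Zabbix.

```python
import json
from collections import Counter


def extract_tags(response):
    """Return one list of tags per hit; accepts a dict or a JSON string."""
    if isinstance(response, (str, bytes)):
        response = json.loads(response)
    tags_per_hit = []
    for hit in response.get("hits", {}).get("hits", []):
        # Use .get() with defaults so a hit without msg/tags yields an empty list
        msg = hit.get("_source", {}).get("msg", {})
        tags_per_hit.append(msg.get("tags", []))
    return tags_per_hit


# Hypothetical sample with the same nesting as the response in the question
sample = {
    "hits": {
        "hits": [
            {"_source": {"msg": {"tags": ["attack-xss", "attack-generic"]}}},
            {"_source": {"msg": {"tags": ["attack-xss"]}}},
            {"_source": {}},  # a hit that carries no tags at all
        ]
    }
}

tags_per_hit = extract_tags(sample)
# Count how often each tag occurs across all hits
tag_counts = Counter(tag for tags in tags_per_hit for tag in tags)
print(tags_per_hit)
print(tag_counts["attack-xss"])
```

The same function accepts the raw JSON text as well, so `extract_tags(json.dumps(sample))` returns the same lists.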

A simple way to have Elasticsearch return only the tags from the search query:

curl -XGET "http://localhost:9200/modsecurity_*/_search" -H 'Content-Type: application/json' -d'{  "size": 1,   "_source": ["msg.tags"],  "query": {"match_all": {}}}'
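
The same filtered query can be sent from Python. Below is a rough sketch using only the standard library; `build_tags_request` and `fetch_tags` are hypothetical helper names, and the host, index pattern, and timeout are assumptions taken from the curl command above. It uses POST instead of GET with a body, which the `_search` endpoint also accepts.

```python
import json
import urllib.request


def build_tags_request(base_url="http://localhost:9200",
                       index="modsecurity_*", size=1):
    """Build a _search request that asks only for the msg.tags field."""
    body = {"size": size, "_source": ["msg.tags"], "query": {"match_all": {}}}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_tags(request, timeout=10):
    """Send the request and return one list of tags per hit."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        result = json.load(resp)
    return [
        # Hits without msg.tags come back with an empty _source
        hit.get("_source", {}).get("msg", {}).get("tags", [])
        for hit in result["hits"]["hits"]
    ]


request = build_tags_request(size=1)
print(request.full_url)
```

Calling `fetch_tags(request)` performs the actual HTTP call, so it needs a reachable Elasticsearch node; building the request does not.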

Hope this helps (^^)