Add geoIP data to old data from Elasticsearch index

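I recently added a GeoIP processor to my ingest pipeline in Elasticsearch. This works well and adds the new fields to newly ingested documents. I would like to add the GeoIP fields to older data by running an _update_by_query on the index; however, it does not seem to accept "processors" as a parameter.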

What I want to do is something like this:

POST my_index*/_update_by_query
{
 "refresh": true,
 "processors": [
   {
     "geoip" : {
        "field": "doc['client_ip']",
        "target_field" : "geo",
        "database_file" : "GeoLite2-City.mmdb",
        "properties":["continent_name", "country_iso_code", "country_name", "city_name", "timezone", "location"]
    }
   }
 ],
 "script": {
  "day_of_week": {
    "type": "long",
    "script": "emit(doc['@timestamp'].value.withZoneSameInstant(ZoneId.of(doc['geo.timezone'])).getDayOfWeek().getValue())"
  },
  "hour_of_day": {
    "type": "long",
    "script": "emit(doc['@timestamp'].value.withZoneSameInstant(ZoneId.of(doc['geo.timezone'])).getHour())"
  },
  "office_hours": {
    "script": "if (doc['day_of_week'].value< 6 && doc['day_of_week'].value > 0) {if (doc['hour_of_day'].value> 7 && doc['hour_of_day'].value<19) {return 1;} else {return -1;} } else {return -1;}"
  }
 }
}

I get the following error:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "parse_exception",
        "reason" : "Expected one of [source] or [id] fields, but found none"
      }
    ],
    "type" : "parse_exception",
    "reason" : "Expected one of [source] or [id] fields, but found none"
  },
  "status" : 400
}

Since you already have the ingest pipeline set up, you can simply reference it when calling the _update_by_query endpoint, like this:

POST my_index*/_update_by_query?pipeline=my-pipeline
                                    ^
                                    |
                                 add this
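
If you would rather trigger this from a script than from the Kibana console, the same call can be sketched in Python with only the standard library. This is a minimal sketch, not a definitive implementation: the host `http://localhost:9200`, the helper name `build_update_by_query_request`, and the optional `exists` filter on the `geo` field (to skip documents that were already enriched) are assumptions for illustration and are not part of the original question. Note that `refresh` is passed in the URL here rather than in the request body.

```python
import json
import urllib.parse
import urllib.request


def build_update_by_query_request(base_url, index_pattern, pipeline):
    """Build (but do not send) a POST to _update_by_query that routes
    every matched document through an existing ingest pipeline."""
    # pipeline and refresh are URL parameters of _update_by_query.
    params = urllib.parse.urlencode({"pipeline": pipeline, "refresh": "true"})
    url = f"{base_url.rstrip('/')}/{index_pattern}/_update_by_query?{params}"

    # Assumption: only reprocess documents that do not have a "geo" field yet.
    body = {"query": {"bool": {"must_not": {"exists": {"field": "geo"}}}}}

    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host; adjust to your cluster (and add authentication if needed).
request = build_update_by_query_request(
    "http://localhost:9200", "my_index*", "my-pipeline"
)

# Uncomment to actually send the request to a running cluster:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The request is only built, not sent, so you can inspect `request.full_url` and `request.data` before running it against real indices.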