How to get the local day of week from a timestamp in Elasticsearch

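I am using a script processor in an ingest pipeline to extract the day of the week from the local time of each document.

I use client_ip to look up the time zone, combine it with the timestamp to get the local time, and then extract the day of the week (and other features) from that local time.

This is my ingest pipeline: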

{
    "processors" : [
      {
        "set" : {
          "field" : "@timestamp",
          "override" : false,
          "value" : "{{_ingest.timestamp}}"
        }
      },
      {
        "date" : {
          "field" : "@timestamp",
          "formats" : [
            "EEE MMM dd HH:mm:ss 'UTC' yyyy"
          ],
          "ignore_failure" : true,
          "target_field" : "@timestamp"
        }
      },
      {
        "convert" : {
          "field" : "client_ip",
          "type" : "ip",
          "ignore_failure" : true,
          "ignore_missing" : true
        }
      },
      {
        "geoip" : {
          "field" : "client_ip",
          "target_field" : "client_geo",
          "properties" : [
            "continent_name",
            "country_name",
            "country_iso_code",
            "region_iso_code",
            "region_name",
            "city_name",
            "location",
            "timezone"
          ],
          "ignore_failure" : true,
          "ignore_missing" : true
        }
      },
      {
        "script" : {
          "description" : "Extract details of Dates",
          "lang" : "painless",
          "ignore_failure" : true,
          "source" : """
            LocalDateTime local_time = LocalDateTime.ofInstant(Instant.ofEpochMilli(ctx['@timestamp']), ZoneId.of(ctx['client_geo.timezone']));
            int day_of_week = local_time.getDayOfWeek().getValue();
            int hour_of_day = local_time.getHour();
            int office_hours = 0;
            if (day_of_week<6 && day_of_week>0) { if (hour_of_day >= 7 && hour_of_day <= 19 ) {office_hours =1;}  else {office_hours = -1;}} else {office_hours = -1;}
            ctx['day_of_week'] = day_of_week;
            ctx['hour_of_day'] = hour_of_day;
            ctx['office_hours'] = office_hours;
          """
        }
      }
    ]
}

The first two processors were added earlier for other purposes. I added the last three.

A sample document looks like this:

  "docs": [
    {
      "_source": {
        "@timestamp": 43109942361111,
        "client_ip": "89.160.20.128"
      }
    }
  ]

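I now get the GeoIP fields in my data, but none of the fields that the script processor should create. What am I doing wrong?

EDIT: Some notes about the index affected by these changes: Dynamic mapping is turned off. I manually added the client_geo.timezone field to the index mapping as a keyword.

When I run the following script search on the index: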
GET index_name/_search
{
 "script_fields": {
  "day_of_week": {
    "script": "doc['@timestamp'].value.withZoneSameInstant(ZoneId.of(doc['client_geo']['timezone'])).getDayOfWeek().getValue()"
  }
 }
}

I get the following runtime error when the script executes:

          "caused_by" : {
            "type" : "illegal_argument_exception",
            "reason" : "No field found for [client_geo] in mapping"
          }

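Thank you for the well-formatted question and example.

I was able to reproduce your problem and solve it.

ctx is "the document source as-is". Ingest therefore does not automatically dig into dot-separated field names.

Your client data is added like this: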

"client_geo" : {
   "continent_name" : "Europe"
   //<snip>..</snip>
}

So you have to access it directly as a nested hash map.

That means ctx['client_geo.timezone'] should actually be ctx['client_geo']['timezone'].
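
To illustrate the difference, here is a minimal Java sketch that models ctx as a plain nested map. It is not Elasticsearch code: the class and method names are hypothetical, and the timezone value "Europe/Stockholm" is an assumed example. In the original pipeline, the dotted lookup most likely returned null, ZoneId.of(null) then threw an exception, and "ignore_failure": true hid the error, which would explain why no script fields appeared.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the ingest document; not part of any Elasticsearch API.
class CtxLookupSketch {

    // Builds a map shaped like the document source after the geoip processor has run.
    static Map<String, Object> sampleCtx() {
        Map<String, Object> geo = new HashMap<>();
        geo.put("continent_name", "Europe");
        geo.put("timezone", "Europe/Stockholm"); // assumed value, for illustration only

        Map<String, Object> ctx = new HashMap<>();
        ctx.put("client_ip", "89.160.20.128");
        ctx.put("client_geo", geo);
        return ctx;
    }

    // Treats "client_geo.timezone" as one literal top-level key, which does not exist.
    static Object dottedLookup(Map<String, Object> ctx) {
        return ctx.get("client_geo.timezone");
    }

    // Walks the nested map one level at a time, like ctx['client_geo']['timezone'].
    @SuppressWarnings("unchecked")
    static Object nestedLookup(Map<String, Object> ctx) {
        Map<String, Object> geo = (Map<String, Object>) ctx.get("client_geo");
        return geo == null ? null : geo.get("timezone");
    }

    public static void main(String[] args) {
        Map<String, Object> ctx = sampleCtx();
        System.out.println("dotted: " + dottedLookup(ctx));
        System.out.println("nested: " + nestedLookup(ctx));
    }
}
```

While debugging, it can help to remove "ignore_failure": true from the script processor temporarily so that the underlying exception becomes visible.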

Here is the full pipeline that worked for me:

"processors": [
      {
        "set": {
          "field": "@timestamp",
          "override": false,
          "value": "{{_ingest.timestamp}}"
        }
      },
      {
        "date": {
          "field": "@timestamp",
          "formats": [
            "EEE MMM dd HH:mm:ss 'UTC' yyyy"
          ],
          "ignore_failure": true,
          "target_field": "@timestamp"
        }
      },
      {
        "convert": {
          "field": "client_ip",
          "type": "ip",
          "ignore_failure": true,
          "ignore_missing": true
        }
      },
      {
        "geoip": {
          "field": "client_ip",
          "target_field": "client_geo",
          "properties": [
            "continent_name",
            "country_name",
            "country_iso_code",
            "region_iso_code",
            "region_name",
            "city_name",
            "location",
            "timezone"
          ],
          "ignore_failure": true,
          "ignore_missing": true
        }
      },
      {
        "script": {
          "description": "Extract details of Dates",
          "lang": "painless",
          "ignore_failure": true,
          "source": """
            LocalDateTime local_time = LocalDateTime.ofInstant(Instant.ofEpochMilli(ctx['@timestamp']), ZoneId.of(ctx['client_geo']['timezone']));
            int day_of_week = local_time.getDayOfWeek().getValue();
            int hour_of_day = local_time.getHour();
            int office_hours = 0;
            if (day_of_week<6 && day_of_week>0) { if (hour_of_day >= 7 && hour_of_day <= 19 ) {office_hours =1;}  else {office_hours = -1;}} else {office_hours = -1;}
            ctx['day_of_week'] = day_of_week;
            ctx['hour_of_day'] = hour_of_day;
            ctx['office_hours'] = office_hours;
          """
        }
      }
    ]
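
The date logic in the script can also be checked outside Elasticsearch, since Painless uses the same java.time classes. The following is a rough Java sketch of that logic, not the pipeline itself: the class name, method name, and sample timestamps are assumptions chosen for illustration. It returns day_of_week (1 = Monday through 7 = Sunday), hour_of_day, and office_hours in that order.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.util.Arrays;

// Hypothetical helper that mirrors the Painless script; names are illustrative.
class LocalTimeFeatures {

    // Returns {day_of_week, hour_of_day, office_hours} for the given instant and zone.
    static int[] features(long epochMillis, String zoneId) {
        LocalDateTime localTime =
                LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), ZoneId.of(zoneId));

        int dayOfWeek = localTime.getDayOfWeek().getValue(); // ISO: 1 = Monday, 7 = Sunday
        int hourOfDay = localTime.getHour();

        // Same rule as the script: Monday to Friday, hours 7 through 19 inclusive.
        boolean weekday = dayOfWeek >= 1 && dayOfWeek <= 5;
        boolean workingHour = hourOfDay >= 7 && hourOfDay <= 19;
        int officeHours = (weekday && workingHour) ? 1 : -1;

        return new int[] {dayOfWeek, hourOfDay, officeHours};
    }

    public static void main(String[] args) {
        long sundayLateUtc = 1609716600000L; // 2021-01-03T23:30:00Z
        System.out.println(Arrays.toString(features(sundayLateUtc, "UTC")));
        System.out.println(Arrays.toString(features(sundayLateUtc, "Europe/Stockholm")));
    }
}
```

The sample instant is Sunday 23:30 in UTC but already Monday 00:30 in Europe/Stockholm, which is why the day of the week has to be computed from the local time rather than from the raw timestamp.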