How to query this data with Elasticsearch

I want to find the oldest male in each family. Every person in the result must be at least 18 years old.

The data is as follows. In tabular form:

id     FamilyId     LastName         FirstName         Age     Gender
1      1            Smith            John              20      M
2      1            Smith            Joan              20      F
3      1            Smith            Harry             1       M
4      2            Ross             Pie               33      F
5      2            Ross             Norman            30      M
6      2            Ross             Devan             13      M
7      2            Ross             Debra             9       F
8      2            Ross             Terry             9       F
9      3            Johnson          Mary              25      F
10     4            King             Bob               5       M

The same data as JSON:

[
  {
    "id":1,
    "FamilyId":1,
    "LastName":"Smith",
    "FirstName":"John",
    "Age":20,
    "Gender":"M"
  },
  {
    "id":2,
    "FamilyId":1,
    "LastName":"Smith",
    "FirstName":"Joan",
    "Age":20,
    "Gender":"F"
  },
  {
    "id":3,
    "FamilyId":1,
    "LastName":"Smith",
    "FirstName":"Harry",
    "Age":1,
    "Gender":"M"
  },
  {
    "id":4,
    "FamilyId":2,
    "LastName":"Ross",
    "FirstName":"Pie",
    "Age":33,
    "Gender":"F"
  },
  {
    "id":5,
    "FamilyId":2,
    "LastName":"Ross",
    "FirstName":"Norman",
    "Age":30,
    "Gender":"M"
  },
  {
    "id":6,
    "FamilyId":2,
    "LastName":"Ross",
    "FirstName":"Devan",
    "Age":13,
    "Gender":"M"
  },
  {
    "id":7,
    "FamilyId":2,
    "LastName":"Ross",
    "FirstName":"Debra",
    "Age":9,
    "Gender":"F"
  },
  {
    "id":8,
    "FamilyId":2,
    "LastName":"Ross",
    "FirstName":"Terry",
    "Age":9,
    "Gender":"F"
  },
  {
    "id":9,
    "FamilyId":3,
    "LastName":"Johnson",
    "FirstName":"Mary",
    "Age":25,
    "Gender":"F"
  },
  {
    "id":10,
    "FamilyId":4,
    "LastName":"King",
    "FirstName":"Bob",
    "Age":5,
    "Gender":"M"
  }
]

This is the result I expect:

id     FamilyId     LastName         FirstName         Age     Gender
1      1            Smith            John              20      M
5      2            Ross             Norman            30      M

As JSON:

[
  {
    "id":1,
    "FamilyId":1,
    "LastName":"Smith",
    "FirstName":"John",
    "Age":20,
    "Gender":"M"
  },
  {
    "id":5,
    "FamilyId":2,
    "LastName":"Ross",
    "FirstName":"Norman",
    "Age":30,
    "Gender":"M"
  }
]

If it is too hard to get, I don't need the id field in the result. Is a query like this possible with Elasticsearch?

This seems to work: a combination of a filter aggregation and a top hits aggregation (gotta love this new stuff, right?):

POST /test_index/_search?search_type=count
{
   "aggs": {
      "males_18_and_over": {
         "filter": {
            "and": [
               { "term": { "Gender": "M" } },
               { "range": { "Age": { "gte": 18 } } } 
            ]
         },
         "aggs": {
            "last_names": {
               "terms": {
                  "field": "LastName"
               },
               "aggs": {
                  "max_age": {
                     "top_hits": {
                        "sort": [
                           {
                              "Age": {
                                 "order": "desc"
                              }
                           }
                        ],
                        "size": 1
                     }
                  }
               }
            }
         }
      }
   }
}

returns:

{
   "took": 4,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 10,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "males_18_and_over": {
         "doc_count": 2,
         "last_names": {
            "buckets": [
               {
                  "key": "Ross",
                  "doc_count": 1,
                  "max_age": {
                     "hits": {
                        "total": 1,
                        "max_score": null,
                        "hits": [
                           {
                              "_index": "test_index",
                              "_type": "doc",
                              "_id": "5",
                              "_score": null,
                              "_source": {
                                 "id": 5,
                                 "FamilyId": 2,
                                 "LastName": "Ross",
                                 "FirstName": "Norman",
                                 "Age": 30,
                                 "Gender": "M"
                              },
                              "sort": [
                                 30
                              ]
                           }
                        ]
                     }
                  }
               },
               {
                  "key": "Smith",
                  "doc_count": 1,
                  "max_age": {
                     "hits": {
                        "total": 1,
                        "max_score": null,
                        "hits": [
                           {
                              "_index": "test_index",
                              "_type": "doc",
                              "_id": "1",
                              "_score": null,
                              "_source": {
                                 "id": 1,
                                 "FamilyId": 1,
                                 "LastName": "Smith",
                                 "FirstName": "John",
                                 "Age": 20,
                                 "Gender": "M"
                              },
                              "sort": [
                                 20
                              ]
                           }
                        ]
                     }
                  }
               }
            ]
         }
      }
   }
}
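
The rows asked for in the question sit a few levels down in that response. Here is a minimal Python sketch for flattening them, assuming the response has already been parsed into a dict (for example with `json.loads`). The helper name `extract_oldest_males` is made up for illustration; the aggregation names are the ones used in the query above.

```python
def extract_oldest_males(response):
    """Return the _source of the top hit from each last-name bucket."""
    buckets = response["aggregations"]["males_18_and_over"]["last_names"]["buckets"]
    rows = []
    for bucket in buckets:
        hits = bucket["max_age"]["hits"]["hits"]
        if hits:  # defensive check; skip a bucket that carries no hit
            rows.append(hits[0]["_source"])
    # Sort by id so the order matches the expected result in the question.
    return sorted(rows, key=lambda row: row["id"])


def _bucket(key, source):
    """Build a trimmed bucket holding only the fields the helper reads."""
    return {"key": key, "max_age": {"hits": {"hits": [{"_source": source}]}}}


# Trimmed copy of the response above.
sample_response = {
    "aggregations": {
        "males_18_and_over": {
            "last_names": {
                "buckets": [
                    _bucket("Ross", {"id": 5, "FamilyId": 2, "LastName": "Ross",
                                     "FirstName": "Norman", "Age": 30, "Gender": "M"}),
                    _bucket("Smith", {"id": 1, "FamilyId": 1, "LastName": "Smith",
                                      "FirstName": "John", "Age": 20, "Gender": "M"}),
                ]
            }
        }
    }
}

for row in extract_oldest_males(sample_response):
    print(row)
```

This gives one dict per family, with the same fields as the expected JSON in the question, including id.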

Here is the code I used to set it up:

http://sense.qbox.io/gist/04742b9a9ce5b2b25a3829f0ffc719992ef20ad3
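
A note for newer Elasticsearch versions: as far as I know, the `and` filter and `search_type=count` were both removed in 5.0, so the request above is likely to be rejected there. A rough equivalent uses a `bool` filter and `"size": 0` instead. The sketch below builds that body as a Python dict, to be serialized and sent to `POST /test_index/_search`. It assumes `Gender` is mapped as a `keyword` field and `Age` and `FamilyId` as numeric fields; the function name `build_query` and the bucket name `families` are hypothetical. It also buckets on `FamilyId` rather than `LastName`, so two families that share a surname are kept apart.

```python
import json


def build_query(min_age=18):
    """Build a search body: the oldest male aged min_age or more, per family."""
    return {
        "size": 0,  # return aggregations only, no top-level hits
        "aggs": {
            "males_18_and_over": {
                "filter": {
                    "bool": {
                        "filter": [
                            {"term": {"Gender": "M"}},
                            {"range": {"Age": {"gte": min_age}}},
                        ]
                    }
                },
                "aggs": {
                    "families": {
                        # one bucket per family id
                        "terms": {"field": "FamilyId"},
                        "aggs": {
                            "max_age": {
                                # keep only the oldest person in each bucket
                                "top_hits": {
                                    "sort": [{"Age": {"order": "desc"}}],
                                    "size": 1,
                                }
                            }
                        },
                    }
                },
            }
        },
    }


print(json.dumps(build_query(), indent=2))
```

A terms aggregation returns only the top 10 buckets by default, so with more families than that you would need to raise `size` on the `families` aggregation.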