Group by multiple fields using NEST

Given the following data:

| Date | Value | Count |
| 2021-01-01 | X | 1 |
| 2021-01-01 | X | 2 |
| 2021-01-01 | Y | 1 |
| 2021-02-02 | X | 1 |
| 2021-02-02 | X | 2 |
| 2021-02-02 | Y | 5 |

I want to group this data by multiple fields (Date and Value).

   Example: Data.GroupBy(x => new { x.Date, x.Value });

Expected result:

| Date | Value | Count |
| 2021-01-01 | X | 3 |
| 2021-01-01 | Y | 1 |
| 2021-02-02 | X | 3 |
| 2021-02-02 | Y | 5 |
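
For reference, the table above is what an in-memory LINQ grouping would produce. The following is only a minimal sketch of that behavior: the `Row` class, `GroupingDemo`, and its method names are hypothetical and exist purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory row type mirroring the table columns.
public class Row
{
    public Row(string date, string value, int count)
    {
        Date = date;
        Value = value;
        Count = count;
    }

    public string Date { get; }
    public string Value { get; }
    public int Count { get; }
}

public static class GroupingDemo
{
    // The sample data from the question.
    public static List<Row> Sample() => new List<Row>
    {
        new Row("2021-01-01", "X", 1),
        new Row("2021-01-01", "X", 2),
        new Row("2021-01-01", "Y", 1),
        new Row("2021-02-02", "X", 1),
        new Row("2021-02-02", "X", 2),
        new Row("2021-02-02", "Y", 5),
    };

    // Group by the (Date, Value) pair and sum Count within each group.
    public static List<Row> Group(IEnumerable<Row> rows) =>
        rows.GroupBy(r => new { r.Date, r.Value })
            .Select(g => new Row(g.Key.Date, g.Key.Value, g.Sum(r => r.Count)))
            .OrderBy(r => r.Date)
            .ThenBy(r => r.Value)
            .ToList();

    public static void Main()
    {
        Console.WriteLine("| Date | Value | Count |");
        foreach (var row in Group(Sample()))
            Console.WriteLine($"| {row.Date} | {row.Value} | {row.Count} |");
    }
}
```

The anonymous type used as the grouping key compares by value, which is why rows with the same Date and Value land in the same group.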

How can I perform this query using NEST?

Update:

Index mapping:

{
  "samples" : {
    "mappings" : {
      "properties" : {
        "count" : {
          "type" : "long"
        },
        "date" : {
          "type" : "date"
        },
        "value" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}

I ran into the same problem a month ago. Assume your class looks like this:

public class Sample
{
    public string Date { get; set; }

    public string Value { get; set; }

    public int Count { get; set; }
}

and the final result looks like this:

public class Report
{
    public string Date { get; set; }

    public string Value { get; set; }

    public double? SumCount { get; set; }
}

Then run the search against Elasticsearch:

public async Task<List<Report>> GetReportAsync(CancellationToken token)
{
    var result = await _elasticClient.SearchAsync<Sample>(search => search
        .Aggregations(agg => agg
            .Terms("result", t => t
                // Build one combined key per document: "<date>#<value>".
                // With the mapping above, "value" is a text field, so the script
                // reads its keyword sub-field instead.
                .Script(sc => sc
                    .Source("doc['date'].value + '#' + doc['value.keyword'].value")
                    .Lang(ScriptLang.Painless))
                .Aggregations(a => a
                    .Sum("SumCount", s => s.Field(f => f.Count))
                )))
        .Source(false)
        .Size(0), token); // only the aggregation buckets are needed, not the hits

    return result.Aggregations.Terms("result").Buckets.Select(x =>
    {
        // Split the combined key back into its two parts at the first '#'.
        var parts = x.Key.Split(new[] { '#' }, 2);
        return new Report
        {
            Date = parts[0],
            Value = parts[1],
            SumCount = x.Sum("SumCount")?.Value
        };
    }).ToList();
}

In Elasticsearch 6.1 and later, the composite aggregation is the right choice for this:

private static void Main()
{
    var default_index = "tmp";
    var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
    var settings = new ConnectionSettings(pool)
        .DefaultIndex(default_index);
        
    var client = new ElasticClient(settings);

    if (client.Indices.Exists(default_index).Exists)
        client.Indices.Delete(default_index);
    
    client.Indices.Create(default_index, c => c
        .Map<Tmp>(m => m
            .AutoMap()
        )
    );

    client.IndexMany(new [] 
    {
        new Tmp("2021-01-01", "X", 1),
        new Tmp("2021-01-01", "X", 2),
        new Tmp("2021-01-01", "Y", 1),
        new Tmp("2021-02-02", "X", 1),
        new Tmp("2021-02-02", "X", 2),
        new Tmp("2021-02-02", "Y", 5)
    });

    client.Indices.Refresh(default_index);

    var searchResponse = client.Search<Tmp>(s => s
        .Size(0) // return only aggregation buckets, no hits
        .Aggregations(a => a
            .Composite("composite", c => c
                // Each source contributes one part of the bucket key: (date, value).
                .Sources(so => so
                    .DateHistogram("date", d => d.Field(f => f.Date).FixedInterval("1d").Format("yyyy-MM-dd"))
                    .Terms("value", t => t.Field(f => f.Value))
                )
                // Sum the count field within each (date, value) bucket.
                .Aggregations(aa => aa
                    .Sum("composite_count", su => su
                        .Field(f => f.Count)
                    )
                )
            )
        )
    );
    
    Console.WriteLine("| Date | Value | Count |");
    foreach (var bucket in searchResponse.Aggregations.Composite("composite").Buckets)
    {
        bucket.Key.TryGetValue("date", out string date);
        bucket.Key.TryGetValue("value", out string value);
        var sum = bucket.Sum("composite_count").Value;
        Console.WriteLine($"| {date} | {value} | {sum} |"); 
    }
}

public class Tmp 
{
    public Tmp(string date, string value, int count)
    {
        Date = DateTime.Parse(date);
        Value = value;
        Count = count;
    }
    
    public Tmp()
    {
    }
    
    public DateTime Date {get;set;}
    
    [Keyword]
    public string Value {get;set;}
    public int Count {get;set;}
}

This prints:

| Date | Value | Count |
| 2021-01-01 | X | 3 |
| 2021-01-01 | Y | 1 |
| 2021-02-02 | X | 3 |
| 2021-02-02 | Y | 5 |
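
Note that a composite aggregation returns only the first 10 buckets by default, so larger result sets have to be paged through with the after key. The following is a rough sketch of that loop, not part of the answer above. It reuses the `Tmp` class and aggregation names from the example; `CompositePaging`, `AllBuckets`, and `pageSize` are hypothetical names, and it assumes NEST 7.x, where the descriptor exposes `Size` and `After` and the response aggregate exposes `AfterKey`.

```csharp
using System;
using System.Collections.Generic;
using Nest;

public static class CompositePaging
{
    // Hypothetical helper: walks every (date, value) bucket, one page at a time.
    // Relies on the Tmp class defined in the example above.
    public static IEnumerable<(string Date, string Value, double? Sum)> AllBuckets(
        IElasticClient client, int pageSize = 100)
    {
        CompositeKey after = null; // null on the first request: start at the beginning

        while (true)
        {
            var key = after; // local copy captured by the lambda below
            var response = client.Search<Tmp>(s => s
                .Size(0)
                .Aggregations(a => a
                    .Composite("composite", c => c
                        .Size(pageSize)
                        .After(key) // assumption: a null key is left out of the request
                        .Sources(so => so
                            .DateHistogram("date", d => d.Field(f => f.Date).FixedInterval("1d").Format("yyyy-MM-dd"))
                            .Terms("value", te => te.Field(f => f.Value)))
                        .Aggregations(aa => aa
                            .Sum("composite_count", su => su.Field(f => f.Count))))));

            if (!response.IsValid)
                throw new InvalidOperationException(response.DebugInformation);

            var composite = response.Aggregations.Composite("composite");
            if (composite == null || composite.Buckets.Count == 0)
                yield break; // no more pages

            foreach (var bucket in composite.Buckets)
            {
                bucket.Key.TryGetValue("date", out string date);
                bucket.Key.TryGetValue("value", out string value);
                yield return (date, value, bucket.Sum("composite_count").Value);
            }

            // The after key of this page is the starting point of the next one.
            after = composite.AfterKey;
            if (after == null)
                yield break;
        }
    }
}
```

Calling `CompositePaging.AllBuckets(client)` in place of the single `Search` call would then enumerate all buckets rather than only the first page.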