Parse ElasticSearch output with Jackson

I want to parse the _source field of an ElasticSearch response. Here is an example of what I have (it only contains a list of values):

"_source":
{    
   "key1": "value1",    
   "key2": "value2"
},
{    
   "key1": "value1",    
   "key2": "value2"
},
etc.

I know how to reach _source, but I don't know how to parse it. Is it a single node?

Edit:

I tried to reach the _source field, but it doesn't seem to work:

final ArrayNode _source = (ArrayNode) jsonNode.path(ES_HITS).path(ES_HITS).path(ES_SOURCE);
for (JsonNode value : _source)
{
    try
    {
        lov.add(mapper.treeToValue(value, Lov.class));
    }
    catch (JsonProcessingException e)
    {
        logger.error("GetLibelles : add : error : JsonProcessingException", e);
    }
}

The Lov class:

@JsonIgnoreProperties(ignoreUnknown = true)
public class Lov extends ParentModel implements Serializable
{   
    private String key1;
    private String key2;
    private String key3;
    private String key4;

    // getters and setters
}

The error I get:

com.fasterxml.jackson.databind.node.MissingNode incompatible with com.fasterxml.jackson.databind.node.ArrayNode

ElasticSearch output:

{
 "took":0,
 "timed_out":false,
 "_shards":
 {
    "total":1,
    "successful":1,
    "failed":0
 },
"hits":
{ 
   "total":1,
   "max_score":1.0,
   "hits":
    [
       {
          "_index":"bla",
          "_type":"lov",
          "_id":"PWA8bmEBRDuys8JUCwg10w",
          "_score":1.0,
          "_source":
          {    
              "key1": "value1",    
              "key2": "value2"
          },
          {    
              "key1": "value1",    
              "key2": "value2"
          }
       } 
    ]
}}

If your query returns multiple hits, the "_source" property appears once in each returned hit. (See the documentation.)

To parse the JSON with Jackson, just create a POJO that matches the JSON schema. In your case that would be a class (Result.java) with the properties key1 and key2. Then use the Jackson ObjectMapper:

import com.fasterxml.jackson.databind.ObjectMapper;

// Map the JSON string to your POJO class
ObjectMapper mapper = new ObjectMapper();
Result result = mapper.readValue("{\"key1\":\"value1\",..}", Result.class);
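Because "_source" sits inside each element of the "hits.hits" array, one way to collect it is to iterate that array and read "_source" from every hit. The following is only a minimal sketch: the class name SourceExtractor, the method extractSources and the sample response in main are made up for illustration, and it assumes a response shaped like a regular search result with one "_source" object per hit.

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Jackson or ElasticSearch.
public class SourceExtractor {

    /** Collects the "_source" object of every hit in a search response. */
    public static List<JsonNode> extractSources(String responseJson) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        JsonNode root = mapper.readTree(responseJson);

        // "hits.hits" is the array; "_source" lives inside each of its elements.
        JsonNode hits = root.path("hits").path("hits");

        List<JsonNode> sources = new ArrayList<>();
        // path() returns a MissingNode instead of null, and iterating a
        // MissingNode simply yields no elements, so no cast is needed here.
        for (JsonNode hit : hits) {
            JsonNode source = hit.path("_source");
            if (source.isObject()) {
                sources.add(source);
            }
        }
        return sources;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample response with two hits.
        String json = "{\"hits\":{\"total\":2,\"hits\":["
                + "{\"_id\":\"1\",\"_source\":{\"key1\":\"a\",\"key2\":\"b\"}},"
                + "{\"_id\":\"2\",\"_source\":{\"key1\":\"c\",\"key2\":\"d\"}}]}}";
        for (JsonNode source : extractSources(json)) {
            System.out.println(source.path("key1").asText() + " " + source.path("key2").asText());
        }
    }
}
```

Each collected node could then be handed to mapper.treeToValue(source, Lov.class), as in the question's loop. Calling path("_source") directly on the "hits.hits" array returns a MissingNode, since an array has no named fields, which would explain the cast error in the question.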

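As far as I know, the "_source" property should normally be followed by a single object. Is the output you posted from a real use case, or is it just an example?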

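I found the solution. The mapping was fine, but the insertion was not. To insert multiple documents correctly, I had to use the Bulk API.

Once the mapping was in place, I had to insert the data with the following command: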

curl -s -XPOST 'serverAddress/_bulk' --data-binary @data.json; echo

data.json

{ "index" : { "_index" : "yourIndex", "_type" : "lov"}}
{ "key1": "value1", "key2": "value2"}
{ "index" : { "_index" : "yourIndex", "_type" : "lov"}}
{ "key1": "value1", "key2": "value2"}

In the same way that mget allows us to retrieve multiple documents at once, the bulk API allows us to make multiple create, index, update, or delete requests in a single step.

I needed to insert data, so I chose the index action. Every request needs its own action line.

Don't forget:

  1. Every line must end with a newline character (\n), including the last line. These are used as markers to allow for efficient line separation.
  2. The lines cannot contain unescaped newline characters, as they would interfere with parsing. This means that the JSON must not be pretty-printed.
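The two rules above can be enforced when the bulk file is generated rather than written by hand. Here is a rough sketch in plain Java; the class BulkPayload and its build method are hypothetical names, and it assumes each document is already serialized as single-line JSON and that the index and type names need no JSON escaping.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that assembles the body of a _bulk request.
public class BulkPayload {

    /**
     * Emits one "index" action line followed by one document line per document.
     * Every line, including the last one, is terminated with \n.
     */
    public static String build(String index, String type, List<String> documents) {
        StringBuilder body = new StringBuilder();
        for (String doc : documents) {
            // Pretty-printed JSON would break the line-based format.
            if (doc.contains("\n") || doc.contains("\r")) {
                throw new IllegalArgumentException("Document must be single-line JSON: " + doc);
            }
            body.append("{ \"index\" : { \"_index\" : \"").append(index)
                .append("\", \"_type\" : \"").append(type).append("\" } }\n");
            body.append(doc).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        String payload = build("yourIndex", "lov", Arrays.asList(
                "{ \"key1\": \"value1\", \"key2\": \"value2\" }",
                "{ \"key1\": \"value1\", \"key2\": \"value2\" }"));
        // Write this to data.json and post it with --data-binary as shown above.
        System.out.print(payload);
    }
}
```

Posting the file with --data-binary matters here because, unlike -d, it leaves the newlines in the file untouched.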