Parse ElasticSearch output with Jackson
I want to parse the _source field of an ElasticSearch response. Here is an example of mine (it only contains a list of values):
"_source":
{
"key1": "value1",
"key2": "value2"
},
{
"key1": "value1",
"key2": "value2"
},
etc.
I know how to reach _source, but I don't know how to parse it. It seems to be a single node?
Edit:
I tried to reach the _source field, but it doesn't seem to work:
final ArrayNode _source = (ArrayNode) jsonNode.path(ES_HITS).path(ES_HITS).path(ES_SOURCE);
for (JsonNode value : _source)
{
try
{
lov.add(mapper.treeToValue(value, Lov.class));
} catch (JsonProcessingException e) { logger.error("GetLibelles : add : error : JsonProcessingException", e); }
}
The Lov class:
@JsonIgnoreProperties(ignoreUnknown = true)
public class Lov extends ParentModel implements Serializable
{
private String key1;
private String key2;
private String key3;
private String key4;
// getters and setters
}
The error I get:
com.fasterxml.jackson.databind.node.MissingNode incompatible with com.fasterxml.jackson.databind.node.ArrayNode
ElasticSearch output:
{
"took":0,
"timed_out":false,
"_shards":
{
"total":1,
"successful":1,
"failed":0
},
"hits":
{
"total":1,
"max_score":1.0,
"hits":
[
{
"_index":"bla",
"_type":"lov",
"_id":"PWA8bmEBRDuys8JUCwg10w",
"_score":1.0,
"_source":
{
"key1": "value1",
"key2": "value2"
},
{
"key1": "value1",
"key2": "value2"
}
}
]
}}
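If your query returns multiple hits, the "_source" property will be present in each returned hit (see the documentation).
To parse JSON with Jackson, just create a POJO that matches the JSON schema. In your case, that would be a class (Result.java) with the properties key1 and key2. Then use the Jackson ObjectMapper to map the JSON string to your POJO class: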
ObjectMapper mapper = new ObjectMapper();
Result result = mapper.readValue("{\"key1\":\"value1\",..}", Result.class);
I think there should normally be only one object after the "_source" property. Is the code you provided from a real use case, or is it just an example?
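One likely cause of the MissingNode error in the question: hits.hits is an array, and calling path("_source") on an array node returns a MissingNode, so the cast to ArrayNode fails. A minimal sketch of reading _source from each hit instead is shown here. The class name SourceParser, the method parseSources, and the trimmed-down Lov class with public fields are assumptions made for illustration, not the asker's actual code.

```java
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class SourceParser {

    // Hypothetical stand-in for the Lov class from the question.
    @JsonIgnoreProperties(ignoreUnknown = true)
    public static class Lov {
        public String key1;
        public String key2;
    }

    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Collects the _source object of every hit in a search response.
    public static List<Lov> parseSources(String json) throws IOException {
        JsonNode root = MAPPER.readTree(json);
        // hits.hits is an array; path() yields a MissingNode if it is absent,
        // and iterating a MissingNode simply produces no elements.
        JsonNode hits = root.path("hits").path("hits");
        List<Lov> result = new ArrayList<>();
        for (JsonNode hit : hits) {
            // _source lives inside each hit, not on the array itself.
            JsonNode source = hit.path("_source");
            if (source.isObject()) {
                result.add(MAPPER.treeToValue(source, Lov.class));
            }
        }
        return result;
    }
}
```

Because path() never returns null, no cast is needed, and a response without hits simply gives an empty list.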
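I found the solution. The mapping was fine, but the insertion was not. To insert multiple documents correctly, I had to use the Bulk API.
Once the mapping was done, I had to insert the data with the following command: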
curl -s -XPOST 'serverAddress/_bulk' --data-binary @data.json; echo
data.json
{ "index" : { "_index" : "yourIndex", "_type" : "lov"}}
{ "key1": "value1", "key2": "value2"}
{ "index" : { "_index" : "yourIndex", "_type" : "lov"}}
{ "key1": "value1", "key2": "value2"}
In the same way that mget allows us to retrieve multiple documents at once, the bulk API allows us to make multiple create, index, update, or delete requests in a single step.
I needed to insert data, so I chose the index action. Each request needs an action.
Don't forget:
- Every line must end with a newline character (\n), including the last line. These are used as markers to allow for efficient line separation.
- The lines cannot contain unescaped newline characters, as they would interfere with parsing. This means that the JSON must not be pretty-printed.
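The two rules above can be sketched as a small helper that builds the body of a bulk request. This is only an illustration: the class name BulkBody and the method build are hypothetical, the documents are assumed to be already serialized as single-line JSON, and the index and type names are assumed to need no JSON escaping.

```java
import java.util.List;

public class BulkBody {

    // Builds a newline-delimited bulk body with one "index" action line per document.
    public static String build(String index, String type, List<String> docs) {
        // Assumption: index and type contain no characters that need escaping.
        String action = "{ \"index\" : { \"_index\" : \"" + index
                + "\", \"_type\" : \"" + type + "\"}}";
        StringBuilder body = new StringBuilder();
        for (String doc : docs) {
            // Pretty-printed JSON would break the line-based format.
            if (doc.indexOf('\n') >= 0 || doc.indexOf('\r') >= 0) {
                throw new IllegalArgumentException("document must be single-line JSON");
            }
            body.append(action).append('\n');
            // Every line ends with \n, including the last one.
            body.append(doc).append('\n');
        }
        return body.toString();
    }
}
```

The resulting string has the same shape as data.json above and could be sent to the _bulk endpoint as the request body.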