Fetch limited data from Elasticsearch indices using a match_all query in Python
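I am writing a Python program to fetch data from an Elasticsearch index. I want to limit a match_all query to at most 25 results, that is, just the first 25 documents. My index holds 10842 documents, but the program retrieves all of them. I checked the solution here, but it did not help me. Please help me solve this.
The code is as follows: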
from elasticsearch import Elasticsearch
import elasticsearch.helpers

count = 0
host = 'localhost'
ind = 'apps'
doc_typ = "change_apps"
limit_count = 25

def elasticsearch_import(host, ind, doc_typ, count, limit_count, port=9200, query={}, single_line=False, single_line_label="message"):
    data_count = count + limit_count
    print("Data to be get from Elastic Search: ", data_count)
    es = Elasticsearch()
    results = elasticsearch.helpers.scan(
        es,
        index=ind,
        doc_type=doc_typ,
        preserve_order=True,
        query={"from": count, "size": data_count, "query": {"bool": {"must": [{"match_all": {}}], "must_not": [], "should": []}}},
    )
    res = []
    for i in results:
        res.append(i)
    #print("res",res)
    print("Data got from Elastic Search", len(res))

elasticsearch_import(host, ind, doc_typ, count, limit_count)
Output I get:
Data to be get from Elastic Search: 25
Data got from Elastic Search 10842
Desired output:
Data to be get from Elastic Search: 25
Data got from Elastic Search 25
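That is what the scan method does: it uses the scroll method under the hood, so it keeps fetching batches until every matching document has been returned. If you look into the API documentation, size actually means batch size.
size – size (per shard) of the batch send at each iteration.
If you just want a result set of a given size, search is enough. In that case size is the result size, and the default value is 10.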
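As a rough sketch of the search approach, assuming an older elasticsearch-py client that still accepts a `body=` argument (the `doc_type` in the question suggests such a version; newer clients prefer passing the query and size as keyword arguments). The helper names `fetch_first_n` and `cap_scan` are made up for illustration, and `doc_type` is left out for brevity:

```python
from itertools import islice


def fetch_first_n(es, index, n, query=None):
    """Return at most n hits with one search request (no scrolling).

    `es` is expected to be an Elasticsearch client; here `size` really is
    the number of results returned, unlike in helpers.scan.
    """
    body = {
        "from": 0,
        "size": n,
        "query": query if query is not None else {"match_all": {}},
    }
    response = es.search(index=index, body=body)
    return response["hits"]["hits"]


def cap_scan(hits_iter, n):
    """Alternative: keep helpers.scan but stop consuming after n hits."""
    return islice(hits_iter, n)


# Hypothetical usage against a local node:
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch()
#   res = fetch_first_n(es, "apps", 25)
#   print("Data got from Elastic Search", len(res))
```

The second helper is only worth using if you need scan for some other reason; for a plain "first 25" request a single search call is simpler.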