Elasticsearch - Distinct Values, Not Counts
I am trying to do something similar to this SQL query:
SELECT * FROM table WHERE fileContent LIKE '%keyword%' AND company_id = '1' GROUP BY email
After reading a related post, I came up with this:
{
  "query": {
    "bool": {
      "must": [{
        "match": {
          "fileContent": {
            "query": "keyword"
          }
        }
      }],
      "filter": [{
        "terms": {
          "company_id": [1]
        }
      }]
    }
  },
  "aggs": {
    "group_by_email": {
      "terms": {
        "field": "email",
        "size": 1000
      }
    }
  },
  "size": 0
}
The field mappings are:
{
  "cvs" : {
    "mappings" : {
      "application" : {
        "_meta" : {
          "model" : "Acme\AppBundle\Entity\Application"
        },
        "dynamic_date_formats" : [ ],
        "properties" : {
          "email" : {
            "type" : "keyword"
          },
          "fileContent" : {
            "type" : "text"
          },
          "company_id" : {
            "type" : "text"
          }
        }
      }
    }
  }
}
... generated from the Symfony config.yml:
fos_elastica:
    clients:
        default:
            host: "%elastica.host%"
            port: "%elastica.port%"
    indexes:
        cvs:
            client: default
            types:
                application:
                    properties:
                        fileContent: ~
                        email:
                            index: not_analyzed
                        company_id: ~
                    persistence:
                        driver: orm
                        model: Acme\AppBundle\Entity\Application
                        provider: ~
                        finder: ~
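The filter works fine, but I find that hits:hits returns no items (or, if I remove size:0, every result matching the search), and aggregations:group_by_email:buckets contains a count for each group but not the records themselves. The grouped records are not returned, and those are what I need.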
I have also tried the query builder with FOSElasticaBundle, if that is the style you prefer (this works, but without grouping/aggregation):
$boolQuery = new \Elastica\Query\BoolQuery();
$filterKeywords = new \Elastica\Query\Match();
$filterKeywords->setFieldQuery('fileContent', 'keyword');
$boolQuery->addMust($filterKeywords);
$filterUser = new \Elastica\Query\Terms();
$filterUser->setTerms('company_id', array('1'));
$boolQuery->addFilter($filterUser);
$finder = $this->get('fos_elastica.finder.cvs.application');
Thanks.
For that, you need a top_hits aggregation nested inside the terms aggregation you are already using:
"aggs": {
  "group_by_email": {
    "terms": {
      "field": "email",
      "size": 1000
    },
    "aggs": {
      "sample_docs": {
        "top_hits": {
          "size": 100
        }
      }
    }
  }
}
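To see how this fragment fits into the full request from the question, here is a minimal Python sketch that assembles the body as a plain dict. The function name build_request and its parameters are hypothetical, chosen only for illustration; the field names and sizes are taken from the question and the fragment above. It builds the JSON only and does not talk to a cluster.

```python
import json


def build_request(keyword, company_id, docs_per_email=100):
    """Combine the question's bool query with a terms + top_hits aggregation.

    Hypothetical helper: it only builds the request body as a dict.
    """
    return {
        # Top-level hits are not needed; documents come back inside each bucket.
        "size": 0,
        "query": {
            "bool": {
                "must": [{"match": {"fileContent": {"query": keyword}}}],
                "filter": [{"terms": {"company_id": [company_id]}}],
            }
        },
        "aggs": {
            "group_by_email": {
                "terms": {"field": "email", "size": 1000},
                # Sub-aggregation: the matching documents for each email bucket.
                "aggs": {"sample_docs": {"top_hits": {"size": docs_per_email}}},
            }
        },
    }


body = build_request("keyword", 1)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the search body in place of the original request; the query part is unchanged, so the same documents are matched and only the aggregation output gains the per-bucket hits.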
top_hits:{size:1} seems to be what I needed, after playing with Andrei's answer. This returns one record for each bucket in the aggregation:
"aggs": {
  "group_by_email": {
    "terms": {
      "field": "email",
      "size": 1000
    },
    "aggs": {
      "sample_docs": {
        "top_hits": {
          "size": 1
        }
      }
    }
  }
}
Reference: top_hits
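With size 1, the record for each group sits at aggregations.group_by_email.buckets[].sample_docs.hits.hits[0] in the response. The following Python sketch shows one way to flatten that into a dict keyed by email. The function name and the sample response (emails, ids, field values) are invented for illustration and trimmed to the relevant keys.

```python
def one_doc_per_email(response):
    """Map each bucket key (an email) to the _source of its single top_hits document."""
    result = {}
    for bucket in response["aggregations"]["group_by_email"]["buckets"]:
        hits = bucket["sample_docs"]["hits"]["hits"]
        if hits:  # a bucket always has doc_count >= 1, but guard anyway
            result[bucket["key"]] = hits[0]["_source"]
    return result


# Hypothetical, trimmed response; all values are made up.
sample_response = {
    "aggregations": {
        "group_by_email": {
            "buckets": [
                {
                    "key": "a@example.com",
                    "doc_count": 2,
                    "sample_docs": {"hits": {"hits": [
                        {"_id": "10", "_source": {"email": "a@example.com", "company_id": "1"}}
                    ]}},
                },
                {
                    "key": "b@example.com",
                    "doc_count": 1,
                    "sample_docs": {"hits": {"hits": [
                        {"_id": "11", "_source": {"email": "b@example.com", "company_id": "1"}}
                    ]}},
                },
            ]
        }
    }
}

docs = one_doc_per_email(sample_response)
```

Here doc_count still reports how many documents fell into each bucket, while docs holds one representative record per email, which is the closest match to the GROUP BY email in the SQL query.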
top_hits helped me too. I ran into some trouble as well, but eventually figured it out. Here is my solution:
{
  "query": {
    "nested": {
      "path": "placedOrders",
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "placedOrders.ownerId": "0a9fdef0-4508-4f9c-aa8c-b3984e39ad1e"
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "custom_name1": {
      "nested": {
        "path": "placedOrders"
      },
      "aggs": {
        "custom_name2": {
          "terms": {
            "field": "placedOrders.propertyId"
          },
          "aggs": {
            "custom_name3": {
              "top_hits": {
                "size": 1,
                "sort": [
                  {
                    "placedOrders.propertyId": {
                      "order": "desc"
                    }
                  }
                ]
              }
            }
          }
        }
      }
    }
  }
}
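Because the terms aggregation is wrapped in a nested aggregation, the buckets sit one level deeper in the response, under custom_name1.custom_name2. A rough Python sketch of reading them follows; the function name and the sample response values are hypothetical, and the response is trimmed to the keys that matter.

```python
def first_order_per_property(response):
    """Map each propertyId bucket to the _source of its top nested placedOrders hit."""
    nested = response["aggregations"]["custom_name1"]  # nested aggregation wrapper
    result = {}
    for bucket in nested["custom_name2"]["buckets"]:   # one bucket per propertyId
        hits = bucket["custom_name3"]["hits"]["hits"]
        if hits:
            result[bucket["key"]] = hits[0]["_source"]
    return result


# Hypothetical, trimmed response; ids and values are made up.
nested_response = {
    "aggregations": {
        "custom_name1": {
            "doc_count": 3,
            "custom_name2": {
                "buckets": [
                    {
                        "key": "prop-1",
                        "doc_count": 2,
                        "custom_name3": {"hits": {"hits": [
                            {"_source": {"propertyId": "prop-1", "ownerId": "owner-x"}}
                        ]}},
                    },
                    {
                        "key": "prop-2",
                        "doc_count": 1,
                        "custom_name3": {"hits": {"hits": [
                            {"_source": {"propertyId": "prop-2", "ownerId": "owner-x"}}
                        ]}},
                    },
                ]
            },
        }
    }
}

orders = first_order_per_property(nested_response)
```

Inside a nested aggregation, each top_hits entry is the nested placedOrders object rather than the whole parent document, so the _source read here contains only that order's fields.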