Spring Data - MongoDB - Text search and summing the score in the aggregation pipeline with GroupOperation

Question

I have the following document:

    @Document
    public class Comment {

        @TextIndexed(weight = 1)
        private String text;     // the actual comment itself

        @TextIndexed(weight = 5)
        private String topic;    // the topic where this comment belongs

        ....
     }

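First, the use case, which is quite simple: there are many topics, and a topic can have many comments. (Yes, I store the topic redundantly in each comment document, for several other reasons.)

What I want to achieve is a topic search bar in my UI that shows relevant topics in a suggestion list. The key part of that sentence is "relevant topics": the search should not be a naive search within the topic alone, but should also take the comments (the text property) into account.

For example, I have these topics and comments: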

Topic: Donald Trump

Comment1: the guy...
Comment2: the president of...
Comment3: here another comment

"Donald" occurs only once, in the topic itself (so the total weight is 5).

Topic: Most powerful people

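Comment1: the first comment
Comment2: the president Donald...
Comment3: Donald Trump
Comment4: why Donald .....
Comment56: Donald ..

"Donald" does not occur in this topic itself, but many of its comments mention Donald (giving a total weight of, say, 45). Our search must find this topic too and include it in the suggestion list; it should even be shown before the first topic, because its score is higher.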
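The ranking described above can be sketched in plain Java, without MongoDB, to make the intended result concrete. This is only an illustration: the class `TopicRankingSketch`, the type `ScoredComment`, and all score values are invented for the example, and the numbers are not what MongoDB's text scoring would actually compute.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TopicRankingSketch {

    /** One matched comment with a hypothetical per-comment text score. */
    static final class ScoredComment {
        final String topic;
        final double score;

        ScoredComment(String topic, double score) {
            this.topic = topic;
            this.score = score;
        }
    }

    /** Sums the scores per topic and returns the topics ordered by descending total. */
    static List<Map.Entry<String, Double>> rank(List<ScoredComment> matches, int limit) {
        Map<String, Double> totals = new LinkedHashMap<>();
        for (ScoredComment c : matches) {
            totals.merge(c.topic, c.score, Double::sum); // group by topic, sum the score
        }
        List<Map.Entry<String, Double>> ranked = new ArrayList<>(totals.entrySet());
        ranked.sort((a, b) -> Double.compare(b.getValue(), a.getValue())); // highest total first
        return ranked.subList(0, Math.min(limit, ranked.size()));
    }

    public static void main(String[] args) {
        // Invented scores: one strong topic match versus several weaker comment matches.
        List<ScoredComment> matches = Arrays.asList(
                new ScoredComment("Donald Trump", 5.0),
                new ScoredComment("Most powerful people", 1.5),
                new ScoredComment("Most powerful people", 2.0),
                new ScoredComment("Most powerful people", 1.5),
                new ScoredComment("Most powerful people", 1.0));

        for (Map.Entry<String, Double> e : rank(matches, 10)) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

With these made-up values, "Most powerful people" totals 6.0 and is listed before "Donald Trump" with 5.0, which is the ordering the suggestion list should produce.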

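So I annotated my document fields with @TextIndexed and assigned weights in a way that made sense to me. I know I have to run a text search, and I am fairly sure I need a GroupOperation somewhere to get topics as the result. But I do not know how to achieve all of this with a simple aggregation.

Any help is appreciated.

EDIT: I now have something like this, but it does not quite work.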

    @TextScore
    private Float score;  // a new field in Comment Document to store the score


    @Service
    public class CommentService {

        ...

        public Slice<TopicSuggestion> searchTopic(final String searchString) {

            TextCriteria criteria = TextCriteria.forDefaultLanguage().matchingAny(searchString);
            MatchOperation match = match(criteria);
            GroupOperation groupByTopicAndSumScore = group("topic").sum("score").as("score");
            SortOperation sortByScore = sort(Sort.Direction.DESC, "score");
            LimitOperation limit = limit(10);
            ProjectionOperation project = project()
                    .andExpression("_id").as("topic")
                    .andExpression("score").as("score");

            Aggregation aggregation = newAggregation(match, groupByTopicAndSumScore, sortByScore, limit, project);
            List<TopicSuggestion> result = mongoTemplate.aggregate(aggregation, Comment.class, TopicSuggestion.class).getMappedResults();

            return new SliceImpl<TopicSuggestion>(result);
        }
    }

The output type TopicSuggestion has only two fields, topic and score.
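For reference, a result type with those two fields might look like the sketch below. The field names `topic` and `score` come from the question; the `double` type for the score, the constructors, and the accessors are assumptions made for illustration, since the original class is not shown.

```java
public class TopicSuggestion {

    private String topic;  // the group key (_id) of the aggregation, renamed to "topic"
    private double score;  // the summed score of all matching comments in that topic

    /** No-argument constructor, kept so a mapping layer can instantiate the class. */
    public TopicSuggestion() {
    }

    public TopicSuggestion(String topic, double score) {
        this.topic = topic;
        this.score = score;
    }

    public String getTopic() {
        return topic;
    }

    public void setTopic(String topic) {
        this.topic = topic;
    }

    public double getScore() {
        return score;
    }

    public void setScore(double score) {
        this.score = score;
    }

    @Override
    public String toString() {
        return "TopicSuggestion{topic='" + topic + "', score=" + score + "}";
    }
}
```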

But this is my current output (the score is 0.0 and the order is wrong):

    "content": [
        {
            "topic": "Donald Trump",
            "score": 0.0
        },
        {
            "topic": "Most powerful people",
            "score": 0.0
        }
    ]

Answer 1

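You can use the following aggregation.

You need to project the text score first and then group. The projection class used in aggregation queries has no helper method for adding the text score; that is supported only for regular find queries.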

    // 1. Match comments via the text index.
    TextCriteria criteria = TextCriteria.forDefaultLanguage().matchingAny(searchString);
    MatchOperation match = match(criteria);
    // 2. Expose the text score as a regular "score" field using { $meta: "textScore" }.
    ProjectionOperation project1 = project("topic").and(aggregationOperationContext -> new Document("$meta", "textScore")).as("score");
    // 3. Group by topic and sum the projected score.
    GroupOperation groupByTopicAndSumScore = group("topic").sum("score").as("score");
    // 4. Highest total first, top 10 only.
    SortOperation sortByScore = sort(Sort.Direction.DESC, "score");
    LimitOperation limit = limit(10);
    // 5. Rename the group key (_id) back to "topic".
    ProjectionOperation project2 = project("score").and("_id").as("topic");

    Aggregation aggregation = newAggregation(match, project1, groupByTopicAndSumScore, sortByScore, limit, project2);
    List<TopicSuggestion> result = mongoTemplate.aggregate(aggregation, Comment.class, TopicSuggestion.class).getMappedResults();
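A likely explanation for the 0.0 scores in the question is that the matched documents carry no stored `score` field when they reach the group stage, and MongoDB's `$sum` returns 0 for a field that does not exist. The plain-Java sketch below imitates that behaviour and the effect of the added projection; it does not use MongoDB or Spring, and `MissingScoreSketch`, `groupAndSum`, `projectScore`, and the score values are all hypothetical names and numbers chosen for the illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MissingScoreSketch {

    /** Imitates grouping by "topic" and summing `field`; missing or non-numeric values add nothing. */
    static Map<String, Double> groupAndSum(List<Map<String, Object>> docs, String field) {
        Map<String, Double> totals = new LinkedHashMap<>();
        for (Map<String, Object> doc : docs) {
            Object value = doc.get(field);
            double increment = (value instanceof Number) ? ((Number) value).doubleValue() : 0.0;
            totals.merge((String) doc.get("topic"), increment, Double::sum);
        }
        return totals;
    }

    /** Imitates the extra projection stage: copies each document's text score into a real "score" field. */
    static List<Map<String, Object>> projectScore(List<Map<String, Object>> docs, double[] textScores) {
        List<Map<String, Object>> projected = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            Map<String, Object> out = new HashMap<>();
            out.put("topic", docs.get(i).get("topic"));
            out.put("score", textScores[i]);
            projected.add(out);
        }
        return projected;
    }

    public static void main(String[] args) {
        // Matched comments as they are stored: a topic, but no "score" field.
        List<Map<String, Object>> matched = new ArrayList<>();
        for (String topic : new String[] {"Donald Trump", "Most powerful people", "Most powerful people"}) {
            Map<String, Object> doc = new HashMap<>();
            doc.put("topic", topic);
            matched.add(doc);
        }
        // Invented text scores, kept outside the documents like query metadata.
        double[] textScores = {5.0, 1.5, 2.0};

        System.out.println(groupAndSum(matched, "score"));                           // no projection
        System.out.println(groupAndSum(projectScore(matched, textScores), "score")); // with projection
    }
}
```

In this sketch the first call yields 0.0 for every topic, while the second yields 5.0 and 3.5, which mirrors why the answer inserts the projection before the group stage.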

mongodb

spring-data

aggregation-framework

spring-data-mongodb