Elasticsearch processor for shingles similar to split?

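Is there a processor that can produce shingles, or can I customize one somehow?

In the ingest pipeline below I split the title on space characters, but I would also like to combine the words the way the shingle analyzer does: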

PUT _ingest/pipeline/split
{
  "processors": [
    {
      "split": {
        "field": "title",
        "target_field": "title_suggest.input",
        "separator": "\\s+"
      }
    }
  ]
}

Example:

"Senior Business Developer" would need a suggest field containing these terms:

  1. Senior Business Developer
  2. Business Developer
  3. Developer
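
To make the desired output concrete, here is a minimal Python sketch of it, run outside Elasticsearch and purely for illustration. The function name `suffix_shingles` is hypothetical and not part of any Elasticsearch API; it assumes that the wanted terms are exactly the word-level suffixes of the title.

```python
def suffix_shingles(title: str) -> list[str]:
    """Return every word-level suffix of the title (hypothetical helper)."""
    # str.split() with no argument splits on runs of whitespace,
    # which is comparable to the "\\s+" separator in the pipeline.
    words = title.split()
    # The suffix starting at word i is words[i:] joined back with single spaces.
    return [" ".join(words[i:]) for i in range(len(words))]


print(suffix_shingles("Senior Business Developer"))
```

For the example title this yields the three terms in the list above, in the same order.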

Here are links to the article and answer that prompted this question:

  1. https://blog.mimacom.com/autocomplete-elasticsearch-part3/

Here is a solution I came up with using a custom script:

PUT _ingest/pipeline/shingle
{
  "description" : "Create basic shingles from title field and input in another field title_suggest",
  "processors" : [
    {
      "script": {
        "lang": "painless",
        "source": """
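              // Split s on every occurrence of the single character d.
              String[] split(String s, char d) {
                int count = 0;

                // First pass: count the delimiters to size the result array.
                for (char c : s.toCharArray()) {
                    if (c == d) {
                        ++count;
                    }
                }

                // No delimiter found: the whole string is the only token.
                if (count == 0) {
                    return new String[] {s};
                }

                String[] r = new String[count + 1];
                int i0 = 0, i1 = 0;
                count = 0;

                // Second pass: cut a token at each delimiter position.
                for (char c : s.toCharArray()) {
                    if (c == d) {
                        r[count++] = s.substring(i0, i1);
                        i0 = i1 + 1;
                    }

                    ++i1;
                }

                // Remaining text after the last delimiter.
                r[count] = s.substring(i0, i1);

                return r;
              }

              // Skip documents without a title.
              if (!ctx.containsKey('title')) { return; }
              def title_words = split(ctx['title'], (char)' ');
              def title_suggest = [];
              // For each start word i, emit the shingle that begins at i
              // and grows by one word at a time up to the end of the title.
              for (def i = 0; i < title_words.length; i++) {
                def shingle = title_words[i];
                title_suggest.add(shingle);
                for (def j = i + 1; j < title_words.length; j++) {
                  shingle = shingle + ' ' + title_words[j];
                  title_suggest.add(shingle);
                }
              }
              ctx['title_suggest'] = title_suggest;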
              
            """
      }
    }
  ]
}
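
To see what the nested loops in this script emit, here is a plain-Python mirror of the same logic. It is only a sketch for reasoning about the output, not something Elasticsearch runs; the function name `all_shingles` is hypothetical, and it assumes the Painless `split` helper behaves like splitting on a single space character.

```python
def all_shingles(title: str) -> list[str]:
    """Mirror of the script's loops: every contiguous run of words (hypothetical helper)."""
    # Split on a single space, like the Painless helper
    # (consecutive spaces therefore produce empty tokens).
    words = title.split(" ")
    shingles = []
    for i in range(len(words)):
        # Start a shingle at word i ...
        shingle = words[i]
        shingles.append(shingle)
        # ... then extend it one word at a time to the end of the title.
        for j in range(i + 1, len(words)):
            shingle = shingle + " " + words[j]
            shingles.append(shingle)
    return shingles


print(all_shingles("Senior Business Developer"))
```

Under this reading, a three-word title produces six entries: the three suffixes from the example in the question plus "Senior", "Senior Business", and "Business".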