Elasticsearch processor for shingles similar to split?
Is there a processor that can produce shingles, or can I customize one somehow?
In the pipeline processor below I split on whitespace, but I would also like to combine the words the way the shingle analyzer does:
PUT _ingest/pipeline/split
{
  "processors": [
    {
      "split": {
        "field": "title",
        "target_field": "title_suggest.input",
        "separator": "\\s+"
      }
    }
  ]
}
Example:
"Senior Business Developer" needs a suggest field containing these terms:
- Senior Business Developer
- Business Developer
- Developer
Here are the links to the article and the answer that prompted this question:
Here is a solution I came up with using a custom script:
PUT _ingest/pipeline/shingle
{
  "description": "Create basic shingles from title field and input in another field title_suggest",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": """
          // Split s on the single delimiter character d, without using a regex.
          String[] split(String s, char d) {
            // First pass: count the delimiters to size the result array.
            int count = 0;
            for (char c : s.toCharArray()) {
              if (c == d) {
                ++count;
              }
            }
            if (count == 0) {
              return new String[] {s};
            }
            String[] r = new String[count + 1];
            // i0 marks the start of the current word, i1 the current position.
            int i0 = 0, i1 = 0;
            count = 0;
            for (char c : s.toCharArray()) {
              if (c == d) {
                r[count++] = s.substring(i0, i1);
                i0 = i1 + 1;
              }
              ++i1;
            }
            // The last word runs to the end of the string.
            r[count] = s.substring(i0, i1);
            return r;
          }

          // Nothing to do for documents without a title.
          if (!ctx.containsKey('title')) { return; }
          def title_words = split(ctx['title'], (char)' ');
          def title_suggest = [];
          // For each start word i, emit the word itself and every longer
          // run of consecutive words that begins at i.
          for (def i = 0; i < title_words.length; i++) {
            def shingle = title_words[i];
            title_suggest.add(shingle);
            for (def j = i + 1; j < title_words.length; j++) {
              shingle = shingle + ' ' + title_words[j];
              title_suggest.add(shingle);
            }
          }
          ctx['title_suggest'] = title_suggest;
        """
      }
    }
  ]
}
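To check what the script produces before indexing anything, the pipeline can be run against a sample document with the ingest simulate API. This is only a sketch: it assumes the `shingle` pipeline above has already been stored, and the sample document is made up for illustration.

```
POST _ingest/pipeline/shingle/_simulate
{
  "docs": [
    { "_source": { "title": "Senior Business Developer" } }
  ]
}
```

Following the loops in the script, `title_suggest` for this title should hold six entries: "Senior", "Senior Business", "Senior Business Developer", "Business", "Business Developer" and "Developer". That is a superset of the three terms listed in the example, so if only the shingles that run to the end of the title are wanted, the inner results would need to be filtered.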