AbstractElasticsearchRepository : failed to load elasticsearch nodes : MapperParsingException: analyzer [autocomplete_index] not found for field

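Use case: I want to build a microservice with Spring Boot and Elasticsearch that supports search-as-you-type. In other words, if I type "d", I want to get back Demetrio, Denis, and Daniel. Typing a second letter, "e", should return Demetrio and Denis, and a third letter should narrow the result down to the exact name. Typing letters from the middle of a name should match as well: "en" should return Denis and Daniel. Search-as-you-type is a very common feature.

I am working from the suggestions found at: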

edgengram

search-as-you-type field type

search-analyzer

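Current issue: when I start the application, which is meant to create and configure the Elasticsearch index, I get the exception in this question's title. The index is created successfully and my initial data is loaded, but the analyzer seems to be ignored completely.

Full log from Spring Boot startup: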

2020-04-10 14:27:40.281  INFO 16556 --- [           main] com.poc.search.SearchApplication         : Starting SearchApplication on SPANOT164 with PID 16556 (C:\WSs\elasticsearch\search\target\classes started by Cast in C:\WSs\elasticsearch\search)
2020-04-10 14:27:40.286  INFO 16556 --- [           main] com.poc.search.SearchApplication         : No active profile set, falling back to default profiles: default
2020-04-10 14:27:40.863  INFO 16556 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data Elasticsearch repositories in DEFAULT mode.
2020-04-10 14:27:40.931  INFO 16556 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 62ms. Found 1 Elasticsearch repository interfaces.
2020-04-10 14:27:41.101  INFO 16556 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data Reactive Elasticsearch repositories in DEFAULT mode.
2020-04-10 14:27:41.120  INFO 16556 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 13ms. Found 0 Reactive Elasticsearch repository interfaces.
2020-04-10 14:27:42.343  INFO 16556 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat initialized with port(s): 8080 (http)
2020-04-10 14:27:42.360  INFO 16556 --- [           main] o.apache.catalina.core.StandardService   : Starting service [Tomcat]
2020-04-10 14:27:42.360  INFO 16556 --- [           main] org.apache.catalina.core.StandardEngine  : Starting Servlet engine: [Apache Tomcat/9.0.33]
2020-04-10 14:27:42.496  INFO 16556 --- [           main] o.a.c.c.C.[Tomcat].[localhost].[/]       : Initializing Spring embedded WebApplicationContext
2020-04-10 14:27:42.496  INFO 16556 --- [           main] o.s.web.context.ContextLoader            : Root WebApplicationContext: initialization completed in 2122 ms
2020-04-10 14:27:43.221  INFO 16556 --- [           main] o.elasticsearch.plugins.PluginsService   : no modules loaded
2020-04-10 14:27:43.222  INFO 16556 --- [           main] o.elasticsearch.plugins.PluginsService   : loaded plugin [org.elasticsearch.index.reindex.ReindexPlugin]
2020-04-10 14:27:43.222  INFO 16556 --- [           main] o.elasticsearch.plugins.PluginsService   : loaded plugin [org.elasticsearch.join.ParentJoinPlugin]
2020-04-10 14:27:43.222  INFO 16556 --- [           main] o.elasticsearch.plugins.PluginsService   : loaded plugin [org.elasticsearch.percolator.PercolatorPlugin]
2020-04-10 14:27:43.222  INFO 16556 --- [           main] o.elasticsearch.plugins.PluginsService   : loaded plugin [org.elasticsearch.script.mustache.MustachePlugin]
2020-04-10 14:27:43.222  INFO 16556 --- [           main] o.elasticsearch.plugins.PluginsService   : loaded plugin [org.elasticsearch.transport.Netty4Plugin]
2020-04-10 14:27:45.480  INFO 16556 --- [           main] o.s.d.e.c.TransportClientFactoryBean     : Adding transport node : 127.0.0.1:9300
2020-04-10 14:27:47.539 ERROR 16556 --- [           main] .d.e.r.s.AbstractElasticsearchRepository : failed to load elasticsearch nodes : org.elasticsearch.index.mapper.MapperParsingException: analyzer [autocomplete_index] not found for field [palavra]
2020-04-10 14:27:47.775  INFO 16556 --- [           main] o.s.s.concurrent.ThreadPoolTaskExecutor  : Initializing ExecutorService 'applicationTaskExecutor'
2020-04-10 14:27:48.333  INFO 16556 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port(s): 8080 (http) with context path ''
2020-04-10 14:27:48.334  INFO 16556 --- [           main] com.poc.search.SearchApplication         : Started SearchApplication in 8.714 seconds (JVM running for 9.159)

elastic-analyzer.json from resources/data/es-config

{
  "analysis": {
    "filter": {
      "autocomplete_filter": {
        "type": "edge_ngram",
        "min_gram": 1,
        "max_gram": 20
      }
    },
    "analyzer": {
      "autocomplete_search": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": [
          "lowercase"
        ]
      },
      "autocomplete_index": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": [
          "lowercase",
          "autocomplete_filter"
        ]
      }
    }
  }
}
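
To make the effect of autocomplete_filter concrete, here is a minimal plain-Java sketch of the prefixes an edge n-gram filter with min_gram 1 and max_gram 20 would produce for a single lowercased token. It only imitates the idea; the class and method names (EdgeNGramDemo, edgeNGrams) are made up for illustration and are not part of Elasticsearch.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: imitates the output of an edge_ngram token filter
// for one token that has already been lowercased. Not Elasticsearch code.
final class EdgeNGramDemo {

    // Returns every prefix of the token whose length is between minGram
    // and maxGram (capped at the token length).
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, token.length());
        for (int len = minGram; len <= longest; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        // prints [d, de, dem, deme, demet, demetr, demetri, demetrio]
        System.out.println(edgeNGrams("demetrio", 1, 20));
    }
}
```

Every gram starts at the first character, so a query such as "en" would not be expected to match "denis" with this filter alone; matching in the middle of a word would likely need an ngram-style filter instead.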

ElasticSearchLoader

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.type.CollectionType;
import com.fasterxml.jackson.databind.type.TypeFactory;
import com.poc.search.model.Correntista;
import com.poc.search.service.CorrentistaService;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.CommandLineRunner;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Component;

import java.io.IOException;
import java.util.List;
import java.util.UUID;
import java.util.stream.Collectors;

@Component
public class ElasticSearchDataLoader implements CommandLineRunner {

    @Value("classpath:data/correntistas.json")
    private Resource usersJsonFile;

    @Autowired
    private CorrentistaService correntistaService;

    @Override
    public void run(String... args) throws Exception {
        if (this.isInitialized()) {
            return;
        }

        List<Correntista> users = this.loadUsersFromFile();
        users.forEach(correntistaService::save);
    }

    private List<Correntista> loadUsersFromFile() throws IOException {
        ObjectMapper objectMapper = new ObjectMapper();
        CollectionType collectionType = TypeFactory.defaultInstance().constructCollectionType(List.class, CorrentistaInitData.class);
        List<CorrentistaInitData> allFakeUsers = objectMapper.readValue(this.usersJsonFile.getFile(), collectionType);
        return allFakeUsers.stream().map(this::from).map(this::generateId).collect(Collectors.toList());
    }

    private Correntista generateId(Correntista correntista) {
        correntista.setId(UUID.randomUUID().toString());
        return correntista;
    }

    private Correntista from(CorrentistaInitData correntistaJson) {
        Correntista correntista = new Correntista();
        correntista.setConta(correntistaJson.getConta());
        correntista.setSobrenome(correntistaJson.getSobrenome());
        correntista.setPalavra(correntistaJson.getNome());
        return correntista;
    }

    private boolean isInitialized() {
        return this.correntistaService.count() > 0;
    }
}

Correntista model

@Document(indexName = "correntistas")
@Setting(settingPath = "es-config/elastic-analyzer.json")
@Getter
@Setter
public class Correntista {
    @Id
    private String id;
    private String conta;
    private String sobrenome;

    @Field(type = FieldType.Text, analyzer = "autocomplete_index", searchAnalyzer = "autocomplete_search")
    private String palavra;
}

application.yml

spring:
  data:
    elasticsearch:
      cluster-name: docker-cluster
      cluster-nodes: localhost:9300

Application startup:

@EnableElasticsearchRepositories
@SpringBootApplication
public class SearchApplication {

    public static void main(String[] args) {
        SpringApplication.run(SearchApplication.class, args);
    }

}

Elasticsearch index settings

{
    "correntistas": {
        "settings": {
            "index": {
                "refresh_interval": "1s",
                "number_of_shards": "5",
                "provided_name": "correntistas",
                "creation_date": "1586539666845",
                "store": {
                    "type": "fs"
                },
                "number_of_replicas": "1",
                "uuid": "2eEha4aMQm2bdut4pd0aAg",
                "version": {
                    "created": "6080499"
                }
            }
        }
    }
}

All data is initially loaded as expected

{
  "took": 66,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "correntistas",
        "_type": "correntista",
        "_id": "7353cd8c-791d-47f5-90b6-a1b5bcf83853",
        "_score": 1.0,
        "_source": {
          "id": "7353cd8c-791d-47f5-90b6-a1b5bcf83853",
          "conta": "1234",
          "sobrenome": "Carvalho",
          "palavra": "Demetrio"
        }
      },
      {
        "_index": "correntistas",
        "_type": "correntista",
        "_id": "122db1bc-584d-4bef-b5ea-3d9e0d42448e",
        "_score": 1.0,
        "_source": {
          "id": "122db1bc-584d-4bef-b5ea-3d9e0d42448e",
          "conta": "5678",
          "sobrenome": "Carv",
          "palavra": "Deme"
        }
      }
    ]
  }
}

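So my main question is: why is the analyzer not created when the index is created successfully? A related question: why does it report "failed to load elasticsearch nodes" when the data was loaded correctly?

In your description of the files you wrote: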

elastic-analyzer.json from resources/data/es-config

But in your @Setting annotation, the data part of that path is missing. You should change it to:

@Setting(settingPath = "data/es-config/elastic-analyzer.json")

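or move the JSON file up one directory.

Because of this wrong path, the settings are not written to the index when it is created, so the analyzer is not available, which leads to the error message you see.

Another thing: when loading the data, do not call save for each entity object. Collect them in a list and use saveAll for a bulk insert, which performs better.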
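
One quick way to confirm a path problem like this is to ask the class loader whether the path resolves at all. The sketch below assumes that settingPath is looked up relative to the classpath root; the helper class SettingPathCheck is hypothetical and exists only for this check.

```java
import java.net.URL;

// Hypothetical helper: reports whether a classpath-relative path resolves.
final class SettingPathCheck {

    // Returns true when the class loader can find a resource at the given path.
    static boolean existsOnClasspath(String path) {
        URL url = SettingPathCheck.class.getClassLoader().getResource(path);
        return url != null;
    }

    public static void main(String[] args) {
        String[] candidates = {
            "es-config/elastic-analyzer.json",      // path used in @Setting
            "data/es-config/elastic-analyzer.json"  // path where the file lives
        };
        for (String path : candidates) {
            System.out.println(path + " -> " + existsOnClasspath(path));
        }
    }
}
```

Run from inside the application's classpath, this should report false for the first path and true for the second if the file really sits under resources/data/es-config.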
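
The difference between the two loading styles can be sketched without Spring. The Store interface and CountingStore class below are hypothetical stand-ins for the repository behind CorrentistaService, and the sketch assumes each save or saveAll call costs one request, which is a simplification rather than a statement about what Spring Data Elasticsearch does internally.

```java
import java.util.ArrayList;
import java.util.List;

final class BulkSaveSketch {

    // Hypothetical stand-in for the repository behind CorrentistaService.
    interface Store {
        void save(String entity);
        void saveAll(List<String> entities);
    }

    // In-memory store that counts calls; assumes one request per call.
    static final class CountingStore implements Store {
        int requests = 0;
        final List<String> stored = new ArrayList<>();

        @Override
        public void save(String entity) {
            requests++;
            stored.add(entity);
        }

        @Override
        public void saveAll(List<String> entities) {
            requests++;
            stored.addAll(entities);
        }
    }

    // Style used in the question: one call per entity.
    static void loadOneByOne(Store store, List<String> users) {
        users.forEach(store::save);
    }

    // Style suggested in the answer: a single call with the whole list.
    static void loadInBulk(Store store, List<String> users) {
        store.saveAll(users);
    }

    public static void main(String[] args) {
        List<String> users = List.of("Demetrio", "Denis", "Daniel");
        CountingStore perEntity = new CountingStore();
        loadOneByOne(perEntity, users);
        CountingStore bulk = new CountingStore();
        loadInBulk(bulk, users);
        System.out.println("save per entity: " + perEntity.requests + " requests");
        System.out.println("saveAll: " + bulk.requests + " request");
    }
}
```

In the loader from the question, the equivalent change would be to replace users.forEach(correntistaService::save) with a single saveAll call, provided CorrentistaService exposes such a method.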