Trying to set the max_gram and min_gram in Elasticsearch
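I am trying to deploy a Ruby on Rails application on an Ubuntu 16.04 EC2 server, but the deployment fails with an error about the difference between max_gram and min_gram in Elasticsearch. I have no experience with Elasticsearch, so I am completely lost here. I need some guidance on how to fix this and how to configure it so the problem does not come back.
The first time I deployed, I got a connection refused error for localhost:9200, so I had to check whether the service was running and even check the firewall. In the end I did a fresh install and configured everything in elasticsearch.yml. Elasticsearch is now running and working, but when I try to deploy again I get the error below. I have searched the internet a lot and there is plenty of documentation, but I still don't know where to set these values.
This is the error I get in the log: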
-----> Migrating database...
rake aborted!
StandardError: An error has occurred, all later migrations canceled:
[400] {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The difference between max_gram and min_gram in NGram Tokenizer must be less than or equal to: [1] but was [49]. This limit can be set by changing the [index.max_ngram_diff] index level setting."}],"type":"illegal_argument_exception","reason":"The difference between max_gram and min_gram in NGram Tokenizer must be less than or equal to: [1] but was [49]. This limit can be set by changing the [index.max_ngram_diff] index level setting."},"status":400}
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/elasticsearch-transport-6.0.2/lib/elasticsearch/transport/transport/base.rb:205:in `__raise_transport_error'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/elasticsearch-transport-6.0.2/lib/elasticsearch/transport/transport/base.rb:323:in `perform_request'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/elasticsearch-transport-6.0.2/lib/elasticsearch/transport/transport/http/faraday.rb:20:in `perform_request'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/elasticsearch-transport-6.0.2/lib/elasticsearch/transport/client.rb:131:in `perform_request'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/elasticsearch-api-6.0.2/lib/elasticsearch/api/namespace/common.rb:21:in `perform_request'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/elasticsearch-api-6.0.2/lib/elasticsearch/api/actions/indices/create.rb:86:in `create'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/searchkick-3.0.2/lib/searchkick/index.rb:16:in `create'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/searchkick-3.0.2/lib/searchkick/index.rb:203:in `create_index'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/searchkick-3.0.2/lib/searchkick/index.rb:270:in `reindex_scope'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/searchkick-3.0.2/lib/searchkick/index.rb:196:in `reindex'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/searchkick-3.0.2/lib/searchkick/model.rb:59:in `searchkick_reindex'
/home/deploy/catalogindustry/releases/20190807135404/db/migrate/20180405153226_validated_true.rb:4:in `change'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:789:in `exec_migration'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:773:in `block (2 levels) in migrate'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:772:in `block in migrate'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/connection_adapters/abstract/connection_pool.rb:398:in `with_connection'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:771:in `migrate'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:951:in `migrate'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1232:in `block in execute_migration_in_transaction'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1302:in `ddl_transaction'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1231:in `execute_migration_in_transaction'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1203:in `block in migrate_without_lock'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1202:in `each'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1202:in `migrate_without_lock'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1150:in `block in migrate'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1319:in `with_advisory_lock'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1150:in `migrate'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:1006:in `up'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/migration.rb:984:in `migrate'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/tasks/database_tasks.rb:163:in `migrate'
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/activerecord-5.0.7/lib/active_record/railties/databases.rake:58:in `block (2 levels) in '
/home/deploy/catalogindustry/shared/bundle/ruby/2.3.0/gems/rake-12.3.1/exe/rake:27:in `'
/home/deploy/.rbenv/versions/2.3.1/bin/bundle:23:in `load'
/home/deploy/.rbenv/versions/2.3.1/bin/bundle:23:in `
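There are no indices on Elasticsearch yet, and the default template does not contain this setting either.
I ran into a similar problem, and the error message below explains it clearly.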
[400] {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"The difference between max_gram and min_gram in NGram Tokenizer must be less than or equal to: [1] but was [49]. This limit can be set by changing the [index.max_ngram_diff] index level setting."}],"type":"illegal_argument_exception","reason":"The difference between max_gram and min_gram in NGram Tokenizer must be less than or equal to: [1] but was [49]. This limit can be set by changing the [index.max_ngram_diff] index level setting."},"status":400}
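Basically, by default the difference between max_gram and min_gram in the NGram tokenizer cannot exceed 1. If you need a larger difference, raise the limit by adding the following to your index settings:
"max_ngram_diff" : "50" --> choose this number according to your requirements.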
Below are my index settings. The nGram tokenizer loginNGram has min_gram 3 and max_gram 50, a difference of 47, so max_ngram_diff is set to 50.
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "prefix": {
            "type": "custom",
            "filter": [
              "lowercaseFilter"
            ],
            "tokenizer": "edgeNGramTokenizer"
          }
        },
        "tokenizer": {
          "edgeNGramTokenizer": {
            "token_chars": [
              "letter",
              "digit"
            ],
            "min_gram": "1",
            "type": "edgeNGram",
            "max_gram": "40"
          },
          "loginNGram": {
            "type": "nGram",
            "min_gram": "3",
            "max_gram": "50"
          }
        }
      },
      "number_of_shards": "1",
      "number_of_replicas": "0",
      "max_ngram_diff": "50"
    }
  }
}
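The arithmetic behind the chosen value can be checked mechanically. What follows is a minimal Python sketch, not part of Elasticsearch or Searchkick: the helper name required_max_ngram_diff is hypothetical, and it assumes that index.max_ngram_diff is enforced for ngram tokenizers and token filters but not for edge n-gram ones. It walks the analysis section of a settings body like the one above and reports the largest max_gram - min_gram gap.

```python
import json

# Type names assumed to be subject to index.max_ngram_diff.
# Assumption: edge n-gram tokenizers/filters are not checked against it.
NGRAM_TYPES = {"ngram", "nGram"}


def required_max_ngram_diff(settings: dict) -> int:
    """Return the largest max_gram - min_gram gap among ngram definitions."""
    analysis = settings.get("settings", {}).get("index", {}).get("analysis", {})
    largest = 0
    for section in ("tokenizer", "filter"):
        for definition in analysis.get(section, {}).values():
            if definition.get("type") not in NGRAM_TYPES:
                continue
            # Documented tokenizer defaults: min_gram = 1, max_gram = 2.
            min_gram = int(definition.get("min_gram", 1))
            max_gram = int(definition.get("max_gram", 2))
            largest = max(largest, max_gram - min_gram)
    return largest


# Trimmed copy of the index settings shown above.
index_body = json.loads("""
{
  "settings": {
    "index": {
      "analysis": {
        "tokenizer": {
          "edgeNGramTokenizer": {"type": "edgeNGram", "min_gram": "1", "max_gram": "40"},
          "loginNGram": {"type": "nGram", "min_gram": "3", "max_gram": "50"}
        }
      },
      "max_ngram_diff": "50"
    }
  }
}
""")

needed = required_max_ngram_diff(index_body)
configured = int(index_body["settings"]["index"]["max_ngram_diff"])
print(needed, configured, needed <= configured)
```

For these settings the gap is 47, so any max_ngram_diff of 47 or more would be accepted; 50 simply leaves some headroom. By the same arithmetic, the [49] in the error message corresponds to a tokenizer whose max_gram is 49 larger than its min_gram.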
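Edit: Adding the official Elastic documentation, which explains that min_gram defaults to 1 and max_gram defaults to 2, so by default the difference between them cannot exceed 1, hence the exception. A snippet from the same documentation:
The index level setting index.max_ngram_diff controls the maximum allowed difference between max_gram and min_gram.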
You can also use an index template to apply the setting automatically to all new indices:
curl -X PUT "localhost:9200/_index_template/template_1?pretty" -H 'Content-Type: application/json' -d'
{
  "index_patterns": [
    "*"
  ],
  "template": {
    "settings": {
      "index": {
        "max_ngram_diff": 50
      }
    }
  }
}
'
Deleting all indices does not delete the template; it has to be deleted manually:
curl -X DELETE "localhost:9200/_index_template/template_1"
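If the template should be managed from a script rather than by hand, the same PUT request can be assembled with Python's standard library. This is only a sketch under assumptions: the function name build_template_request and the narrower pattern catalog_* are hypothetical, and the request is built but not sent, so it can be inspected before touching a running node.

```python
import json
import urllib.request


def build_template_request(name: str, patterns: list, max_ngram_diff: int,
                           host: str = "http://localhost:9200") -> urllib.request.Request:
    """Build a PUT request for the _index_template endpoint without sending it."""
    body = {
        "index_patterns": patterns,
        "template": {"settings": {"index": {"max_ngram_diff": max_ngram_diff}}},
    }
    return urllib.request.Request(
        url=f"{host}/_index_template/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# "catalog_*" is a hypothetical pattern; "*" would match every new index.
request = build_template_request("template_1", ["catalog_*"], 50)
print(request.get_method(), request.full_url)

# To apply it against a running node, send the request:
# urllib.request.urlopen(request)
```

Restricting index_patterns to the application's own index names keeps the raised limit from silently applying to unrelated indices created later.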