Elasticsearch Filebeat ignores custom index template and overwrites output index's mapping with default filebeat index template
What are you trying to do?
Ingest input data from JSON files in ndjson format with Filebeat's filestream input and insert them into my_index in Elasticsearch without any extra keys.
Show me your configs.
elasticsearch.yml
# ---------------------------------- Cluster -----------------------------------
#
cluster.name: masterCluster
#
# ------------------------------------ Node ------------------------------------
#
node.name: masterNode
#
#----------------------- BEGIN SECURITY AUTO CONFIGURATION -----------------------
# Security features
xpack.security.enabled: false
xpack.security.enrollment.enabled: false
xpack.security.http.ssl.enabled: false
xpack.security.transport.ssl.enabled: false
#----------------------- END SECURITY AUTO CONFIGURATION -------------------------
filebeat.yml
# ============================== Filebeat inputs ===============================
filebeat.inputs:
- type: filestream
  enabled: true
  paths:
    - /home/asura/EBK/data/*.json
  parsers:
    - ndjson:
        keys_under_root: true
        add_error_key: true
# ======================= Elasticsearch template setting =======================
setup.ilm.enabled: false
setup.template:
  name: "my_index_template"
  pattern: "my_index*"
# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  hosts: ["localhost:9200"]
  index: "my_index"
What do my_index and my_index_template look like?
Mapping of my_index in Kibana:
{
"mappings": {}
}
Preview of my_index_template in Kibana:
{
"template": {
"settings": {
"index": {
"routing": {
"allocation": {
"include": {
"_tier_preference": "data_content"
}
}
}
}
},
"aliases": {},
"mappings": {}
}
}
What does your input file look like?
input.json
{"filename" :"16.avi", "frame": 131, "Class":"person", "confidence":32, "Date & Time" :"Thu Oct 3 14:02:41 2019", "Others" :"Blue"}
{"filename" :"16.avi", "frame": 131, "Class":"person", "confidence":36, "Date & Time" :"Thu Oct 3 14:02:41 2019", "Others" :"Grey,Blue"}
I drag and drop the file above into the watched folder, and the insert goes through fine.
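For context, ndjson ("newline-delimited JSON") is simply one JSON document per line. Here is a minimal Python sketch (not Filebeat code) of what the ndjson parser conceptually does with each line; the empty third hit with "Error decoding JSON: EOF" in the search response below is consistent with a trailing blank line, which this sketch simply skips:

```python
import json

# Each line of an ndjson file is an independent JSON document.
# The trailing "\n" after the last record is the kind of thing that can
# produce an "Error decoding JSON: EOF" event if it is not skipped.
ndjson_text = (
    '{"filename": "16.avi", "frame": 131, "Class": "person", "confidence": 32}\n'
    '{"filename": "16.avi", "frame": 131, "Class": "person", "confidence": 36}\n'
)

events = []
for line in ndjson_text.splitlines():
    line = line.strip()
    if not line:  # blank line: nothing to decode
        continue
    events.append(json.loads(line))

print(len(events), events[0]["confidence"])  # -> 2 32
```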
What does the data look like after inserting into Elasticsearch?
GET request: http://<host>:<my_port>/my_index/_search?filter_path=hits.hits._source
Response:
{
"hits": {
"hits": [
{
"_source": {
"@timestamp": "2022-04-21T21:49:04.084Z",
"log": {
"offset": 0,
"file": {
"path": "/home/asura/EBK/data/input.json"
}
},
"frame": 131,
"Class": "person",
"input": {
"type": "filestream"
},
"ecs": {
"version": "8.0.0"
},
"host": {
"name": "pisacha"
},
"agent": {
"ephemeral_id": "d389a35d-40f7-4680-a485-8e6939d011ab",
"id": "c6cb1ce5-ff92-499d-9e3c-e79478795fca",
"name": "pisacha",
"type": "filebeat",
"version": "8.1.3"
},
"Date & Time": "Thu Oct 3 14:02:41 2019",
"Others": "Blue",
"filename": "16.avi",
"confidence": 32
}
},
{
"_source": {
"@timestamp": "2022-04-21T21:49:04.084Z",
"agent": {
"type": "filebeat",
"version": "8.1.3",
"ephemeral_id": "d389a35d-40f7-4680-a485-8e6939d011ab",
"id": "c6cb1ce5-ff92-499d-9e3c-e79478795fca",
"name": "pisacha"
},
"Others": "Grey,Blue",
"filename": "16.avi",
"input": {
"type": "filestream"
},
"frame": 131,
"Class": "person",
"ecs": {
"version": "8.0.0"
},
"host": {
"name": "pisacha"
},
"confidence": 36,
"log": {
"offset": 133,
"file": {
"path": "/home/asura/EBK/data/input.json"
}
},
"Date & Time": "Thu Oct 3 14:02:41 2019"
}
},
{
"_source": {
"@timestamp": "2022-04-21T21:49:04.084Z",
"input": {
"type": "filestream"
},
"agent": {
"id": "c6cb1ce5-ff92-499d-9e3c-e79478795fca",
"name": "pisacha",
"type": "filebeat",
"version": "8.1.3",
"ephemeral_id": "d389a35d-40f7-4680-a485-8e6939d011ab"
},
"ecs": {
"version": "8.0.0"
},
"host": {
"name": "pisacha"
},
"message": "",
"error": {
"type": "json",
"message": "Error decoding JSON: EOF"
}
}
}
]
}
}
It is not using the template I specified. And, surprisingly, here is the preview of my_index in Kibana after Filebeat inserted the data:
{
"mappings": {
"properties": {
"@timestamp": {
"type": "date"
},
"Class": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Date & Time": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Others": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"agent": {
"properties": {
"ephemeral_id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"type": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"version": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"confidence": {
"type": "long"
},
"ecs": {
"properties": {
"version": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"error": {
"properties": {
"message": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"type": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"filename": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"frame": {
"type": "long"
},
"host": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"input": {
"properties": {
"type": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"log": {
"properties": {
"file": {
"properties": {
"path": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"offset": {
"type": "long"
}
}
},
"message": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
The mapping in my_index_template is enormous, tens of thousands of lines long, almost as if it contains every field defined in fields.yml. By default it also created a data_stream named my_index. Even after setting setup.ilm.enabled: false, the data is still inserted with all the fields from the default filebeat index template. I have searched and tried every approach I could find; I need guidance from someone who isn't shooting in the dark.
Versions used for Elasticsearch, Kibana, and Filebeat: 8.1.3
Please comment if you need more information :)
References:
- Parsing ndjson: https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-filestream.html#_parsers
- Using a custom index: https://www.elastic.co/guide/en/beats/filebeat/current/elasticsearch-output.html#index-option-es
- Using a custom template: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-template.html
- Filtered responses: https://www.elastic.co/guide/en/elasticsearch/reference/current/common-options.html#common-options-response-filtering
TLDR;
I am not sure there is an option to stop Filebeat from adding these fields, but you can add a drop_fields processor to the configuration to remove them.
# ============================== Filebeat inputs ===============================
filebeat.inputs:
- type: filestream
  enabled: true
  paths:
    - /home/asura/EBK/data/*.json
  parsers:
    - ndjson:
        keys_under_root: true
        add_error_key: true
# ======================= Elasticsearch template setting =======================
setup.ilm.enabled: false
setup.template:
  name: "my_index_template"
  pattern: "my_index*"
# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  hosts: ["localhost:9200"]
  index: "my_index"
processors:
  - drop_fields:
      fields: ["agent", "ecs", "host", ...]
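To make concrete what drop_fields does to an event, here is a minimal Python sketch (not Filebeat's implementation) of dropping top-level keys from a document shaped like the ones in the search response above:

```python
def drop_fields(event: dict, fields: list) -> dict:
    """Return a copy of the event without the listed top-level fields."""
    return {k: v for k, v in event.items() if k not in fields}

# A trimmed-down event, shaped like the _source documents above.
event = {
    "@timestamp": "2022-04-21T21:49:04.084Z",
    "agent": {"type": "filebeat"},
    "ecs": {"version": "8.0.0"},
    "host": {"name": "pisacha"},
    "filename": "16.avi",
    "confidence": 32,
}

clean = drop_fields(event, ["agent", "ecs", "host"])
print(sorted(clean))  # -> ['@timestamp', 'confidence', 'filename']
```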
An option that stops Beats from adding these fields in the first place would be the better choice; I just don't know of one.
Edit:
The complete working solution involves globally declared processors.
filebeat.inputs:
- type: filestream
  # Input processors act during the input stage of the processing pipeline
  processors:
    - drop_fields:
        fields: ["key1", "key2"]
# ---------------------------- Global Processors ------------------
# Global processors handle fields that are added later by Filebeat
processors:
  - drop_fields:
      fields: ["agent", "ecs", "input", "log", "host"]
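The key point is ordering: input-level processors run per input, before Filebeat enriches the event with its own metadata, while global processors run later and can therefore see (and drop) the beat-added fields. A rough Python sketch of that two-stage pipeline, under the simplifying assumption that enrichment happens between the two stages:

```python
def drop_fields(event: dict, fields: list) -> dict:
    return {k: v for k, v in event.items() if k not in fields}

def pipeline(raw_event: dict) -> dict:
    # Stage 1: input processors see only the parsed document fields.
    event = drop_fields(raw_event, ["key1", "key2"])
    # Filebeat then enriches the event with its own metadata.
    event.update({
        "agent": {"type": "filebeat"},
        "ecs": {"version": "8.0.0"},
        "input": {"type": "filestream"},
        "log": {"offset": 0},
        "host": {"name": "pisacha"},
    })
    # Stage 2: global processors run after enrichment, so they can drop it.
    return drop_fields(event, ["agent", "ecs", "input", "log", "host"])

doc = {"filename": "16.avi", "confidence": 32, "key1": "unwanted"}
print(sorted(pipeline(doc)))  # -> ['confidence', 'filename']
```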
Reference:
https://discuss.elastic.co/t/filebeat-didnt-drop-some-of-the-fields-like-agent-ecs-etc/243911/2