ELK data ingestion: Elasticsearch treats text as boolean even though the mapping says the type is 'text'
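I'm using ELK 7.4.1 with the Docker setup from here, and I need to ingest data from a MySQL database. One of the tables has a 'status' field defined as varchar(128). I use the Logstash jdbc plugin for this, but when I start the Docker image I see many warning messages saying:
org.elasticsearch.index.mapper.MapperParsingException: failed to parse field [status] of type [boolean] in document with id '34ZXb24BsfR1FhttyYWt'. Preview of field's value: 'Success'
What confuses me, however, is that the mapping appears to be correct: "status": { "type": "text", ... }
and the data seems to have been ingested successfully.
I even tried creating the index manually and putting the mapping in place before ingesting the data, but that didn't help either.
Any idea why?
Adding more information:
Table definition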
CREATE TABLE records (
  id int(11) NOT NULL AUTO_INCREMENT,
  ...
  status varchar(128) NOT NULL DEFAULT '',
  ...
)
Elasticsearch mapping
{
  "properties": {
    ...
    "status": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    ...
  }
}
Sample data
+-----------+---------+
| id        | status  |
+-----------+---------+
| 452172830 | success |
| 452172835 | other   |
| 452172840 | success |
...
More information:
Elasticsearch mapping template
PUT /_template/records_template
{
  "index_patterns": ["records"],
  "mappings": {
    "_source": {
      "enabled": false
    },
    "properties": {
      "status": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
Logstash configuration file
input {
  jdbc {
    tags => "records"
    jdbc_connection_string => "jdbc:mysql://10.0.2.15:3306/esd"
    jdbc_user => "dbuser"
    jdbc_password => "dbpass"
    schedule => "* * * * *"
    jdbc_validate_connection => true
    jdbc_paging_enabled => true
    jdbc_page_size => 100000
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    statement => "select * from records order by id asc"
  }
  ...
}
output {
  if "records" in [tags] {
    elasticsearch {
      hosts => "elasticsearch:9200"
      user => "elastic"
      password => "changeme"
      index => "records"
      template_name => "records_template"
      document_id => "%{id}"
    }
  }
  ...
}
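It looks like the field name matters somehow. If I change the select clause to something like select status as rd_status, ... then all the errors go away. I'm not sure whether I'm missing something in the Elasticsearch mapping, or whether Logstash internally tries to guess the data type from the name status (I'd be surprised if it were the latter).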
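For reference, the aliasing workaround mentioned above might look like the following in the jdbc input. This is only a sketch: the column list is an assumption (just id and status are shown, while the real query selects further columns that are elided in this post), and the paging options are left out for brevity.

```
input {
  jdbc {
    tags => "records"
    jdbc_connection_string => "jdbc:mysql://10.0.2.15:3306/esd"
    jdbc_user => "dbuser"
    jdbc_password => "dbpass"
    schedule => "* * * * *"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    # Assumption: only id and status are listed for illustration;
    # the actual query selects more columns.
    # The alias makes the event carry rd_status instead of status.
    statement => "select id, status as rd_status from records order by id asc"
  }
}
```

With the alias in place, documents no longer contain a field named status, so the status entry in records_template does not apply to rd_status. The warnings stop, but this does not explain where the boolean type for status comes from.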