Cannot parse CSV file with Logstash
I'm having trouble using Logstash to import a CSV file into Elasticsearch for further processing in Kibana.
Here is my Logstash config file:
input {
  file {
    path => ["/absolute_path_to_file/export.csv"]
    start_position => beginning
    ignore_older => 0
    sincedb_path => "/dev/null"
  }
}
#filter {
#  csv {
#    columns => [
#      "id",
#      "cislo_smlouvy",
#      "zdroj",
#      "produkt",
#      "sjednani",
#      "datum_odeslani",
#      "cas_odeslani",
#      "pojistovna",
#      "tarif",
#      "pojistnik",
#      "telefon",
#      "predmet_pojisteni",
#      "rz",
#      "pocatek_pojisteni",
#      "rocni_pojistne",
#      "urgence",
#      "stav"
#    ]
#    separator => ";"
#    remove_field => ["message"]
#  }
#}
output {
#  elasticsearch {
#    hosts => "localhost:9200"
#    index => "smlouvy"
#  }
  stdout {
    codec => rubydebug
  }
}
And an excerpt from my CSV file:
"id";"číslo smlouvy";"zdroj";"produkt";"sjednání";"datum odeslaní";"čas odeslání";"pojišťovna";"tarif";"pojistník";"pojistnik telefon";"předmět pojištění";"rz";"počátek";"roční pojistné";"urgence";"stav"
"114951";"6132681255";"SRO";"POV";;"1.6.2016";"12:28";"csob";"csob-2";"BB TEST";"721666333";"Škoda Favorit";"NENÍ";"2.6.2016 00:00";"4657,00";;"TEST"
"114950";;"POV";"POV";"VO Bukvicova";"1.6.2016";"12:16";"csob";"csob-2";"BB BB";"721000111";"BMW X3";"NENÍ";"3.6.2016 00:00";"5550,00";;"TEST"
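As a sanity check that does not involve Logstash, the first data row above can be parsed with Python's standard `csv` module, using the same `;` separator and the 17 column names from the commented-out filter. This is only a rough sketch: `COLUMNS`, `SAMPLE` and `parse_rows` are illustrative names, and it says nothing about how the Logstash csv filter is implemented internally.

```python
import csv
import io

# Column names copied from the commented-out csv filter in the config above.
COLUMNS = [
    "id", "cislo_smlouvy", "zdroj", "produkt", "sjednani",
    "datum_odeslani", "cas_odeslani", "pojistovna", "tarif", "pojistnik",
    "telefon", "predmet_pojisteni", "rz", "pocatek_pojisteni",
    "rocni_pojistne", "urgence", "stav",
]

# First data row of the excerpt, terminated with "\n" for this check.
SAMPLE = (
    '"114951";"6132681255";"SRO";"POV";;"1.6.2016";"12:28";"csob";"csob-2";'
    '"BB TEST";"721666333";"Škoda Favorit";"NENÍ";"2.6.2016 00:00";'
    '"4657,00";;"TEST"\n'
)


def parse_rows(text):
    """Parse semicolon-separated, double-quoted text into a list of dicts."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=";", quotechar='"')
    return [dict(zip(COLUMNS, row)) for row in reader]


rows = parse_rows(SAMPLE)
print(len(rows[0]), rows[0]["id"], rows[0]["stav"])
```

If the row splits into 17 fields here, the separator and quoting of the file are probably fine, which points away from the filter settings as the cause.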
I'm running this command:
sudo logstash -f /absolute_path_to_file/logstash.conf --verbose
with the following output:
starting agent {:level=>:info}
starting pipeline {:id=>"main", :level=>:info}
Settings: Default pipeline workers: 2
Registering file input {:path=>["/absolute_path_to_file/export.csv"], :level=>:info}
Starting pipeline {:id=>"main", :pipeline_workers=>2, :batch_size=>125, :batch_delay=>5, :max_inflight=>250, :level=>:info}
Pipeline main started
After it sat there doing nothing for a while, I shut it down:
^CSIGINT received. Shutting down the agent. {:level=>:warn}
stopping pipeline {:id=>"main"}
Closing inputs {:level=>:info}
Closed inputs {:level=>:info}
Input plugins stopped! Will shutdown filter/output workers. {:level=>:info}
Pipeline main has been shutdown
Version information that may be relevant:
logstash 2.3.2
logstash-input-file (2.2.5)
logstash-filter-csv (2.1.3)
logstash-output-elasticsearch (2.6.2)
logstash-output-stdout (2.0.6)
logstash-codec-rubydebug (2.0.7)
I have read all the documentation I could find and tried copying lots of logstash.conf examples from GitHub, without success. Any help with what I'm missing?
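So I finally found the problem: it was the input CSV file itself. The file used \r (carriage return) line endings, whereas Logstash expects \n by default.
By the way, you can't set \r as the delimiter in the Logstash file input configuration, so I had to convert the CSV file to \n line endings.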
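One possible way to check which line endings a file uses and to convert them to \n is sketched below in Python. This is a minimal sketch under stated assumptions, not the exact procedure used here: the function names (`count_line_endings`, `to_lf`, `convert_file`) are made up for illustration, and it assumes an ASCII-compatible encoding such as UTF-8, where the bytes 0x0D and 0x0A only ever mean CR and LF.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, lone CRs and lone LFs in raw file content."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        "cr": data.count(b"\r") - crlf,  # CRs that are not part of a CRLF
        "lf": data.count(b"\n") - crlf,  # LFs that are not part of a CRLF
    }


def to_lf(data: bytes) -> bytes:
    """Normalize every line ending to LF; CRLF goes first to avoid doubling."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(src: str, dst: str) -> None:
    """Read src as bytes and write an LF-terminated copy to dst."""
    with open(src, "rb") as f:
        data = f.read()
    with open(dst, "wb") as f:
        f.write(to_lf(data))


# Tiny CR-only sample standing in for the real export (assumption).
sample = b'"id";"stav"\r"114951";"TEST"\r'
print(count_line_endings(sample))
print(to_lf(sample).split(b"\n"))
```

A file for which `count_line_endings` reports only `cr` contains no \n at all, so a reader that splits on \n never sees a complete line. After something like `convert_file("export.csv", "export_lf.csv")` (hypothetical file names), the `path` setting can point at the converted copy.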