Incomprehensible Out of Memory Error with Logstash
I have a Logstash 7.6.2 Docker container that stops running because of what looks like a memory leak: Logstash does not seem to release memory after each pipeline execution.
How should I identify the source of the problem?
How can I fix it?
Any help is welcome ^^.
Here is the error I see in the logs. These are only the first 5 lines of the traceback; I uploaded the rest to a file on GitHub.
logstash | [2020-04-08T18:15:42,960][INFO ][logstash.outputs.file ][rawweb] Closing file /output/web_data.json
logstash | [2020-04-08T18:15:43,353][ERROR][org.logstash.Logstash ] java.lang.OutOfMemoryError: Java heap space
logstash | [2020-04-08T18:15:43,367][ERROR][org.logstash.execution.WorkerLoop][rawclient] Exception in pipelineworker, the pipeline stopped processing new events, please check your filter configuration and restart Logstash.
logstash | org.jruby.exceptions.NoMethodError: (NoMethodError) undefined method `pop' for nil:NilClass
logstash | at usr.share.logstash.vendor.bundle.jruby._dot_5_dot_0.gems.awesome_print_minus_1_dot_7_dot_0.lib.awesome_print.inspector.awesome(/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/inspector.rb:117) ~[?:?]
Here is the docker-compose.yml I use to configure my Logstash Docker container:
version: '2.4'
services:
  logstash:
    image: docker.elastic.co/logstash/logstash:7.6.2
    container_name: logstash
    environment:
      LS_JAVA_OPTS: "-Xmx7g -Xms4g"
      REQUEST_FREQUENCY: 600 #seconds
    volumes:
      - ./logstash/config/logstash.yml:/usr/share/logstash/config/logstash.yml:ro
      - ./logstash/pipelines.yml:/usr/share/logstash/config/pipelines.yml
      - ./logstash/pipeline:/usr/share/logstash/pipeline:ro
      - ./logstash/tests:/testscripts:ro
      - /root/logstash_output/:/output/
    ports:
      - "9600:9600"
    mem_limit: 7000M
    mem_reservation: 100M
My pipelines.yml file:
- pipeline.id: rawclient
  path.config: "/usr/share/logstash/pipeline/logclient.conf"
  pipeline.batch.size: 10000000
- pipeline.id: rawweb
  path.config: "/usr/share/logstash/pipeline/logweb.conf"
  pipeline.batch.size: 10000000
One of my .conf files. Basically, it runs a .sh script that makes a curl request; the result of that request is the input of the pipeline, which processes it and then stores the result in a file. Both pipelines do the same thing; the only difference is the curl request they make.
input {
  exec {
    command => "bash /testscripts/logclient_1.sh"
    codec => "json"
    interval => "600"
  }
}

filter {
  mutate {
    rename => ["connection/start_time", "start_time"]
    rename => ["connection/end_time", "end_time"]
    rename => ["connection/duration", "duration"]
    rename => ["connection/destination_ip_address", "destination_ip_address"]
    rename => ["connection/status", "status"]
    rename => ["device/last_ip_address", "last_ip_address"]
    rename => ["user/sid", "sid"]
    # rename => ["binary/application_category", "application_category"]
    rename => ["binary/application_name", "application_name"]
    rename => ["binary/executable_name", "executable_name"]
    remove_field => ["@timestamp"]
    remove_field => ["@version"]
    add_field => { "connection_type" => "client" }
  }
}

output {
  file {
    path => "/output/client_data.json"
    codec => "json"
  }
  stdout {
    codec => rubydebug
  }
}
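For illustration, here is roughly what that filter does to a single event (a hypothetical sample: the field names come from the mutate block above, but the values are invented). An event coming out of the exec input such as

{ "connection/start_time": "2020-04-08T18:05:00", "connection/duration": 12, "device/last_ip_address": "192.0.2.10", "user/sid": "S-1-5-21-1111" }

would end up in /output/client_data.json as

{ "start_time": "2020-04-08T18:05:00", "duration": 12, "last_ip_address": "192.0.2.10", "sid": "S-1-5-21-1111", "connection_type": "client" }

with @timestamp and @version stripped and the connection_type field added.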
My logstash.yml file:
http.host: "0.0.0.0"
xpack.monitoring.enabled: false
Thanks for all the help 🙂
Your pipeline batch size is huge. Here is what the documentation (https://www.elastic.co/guide/en/logstash/current/logstash-settings-file.html) says about this setting:
The maximum number of events an individual worker thread will collect from inputs before attempting to execute its filters and outputs. Larger batch sizes are generally more efficient, but come at the cost of increased memory overhead. You may need to increase JVM heap space in the jvm.options config file.
This means that a single worker thread will collect 10 million events before it even starts processing them, and those 10 million events obviously have to be held in memory. On top of that, you have a second pipeline with the same batch size, another 10 million events. Given that you have only 7 GB of RAM allocated to Logstash, that is already enormous. Note also that the default value is 125 events.
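To put a rough number on that (the per-event size here is an assumption for illustration; real events can be far larger): even at 100 bytes per event,

10,000,000 events × 100 bytes ≈ 1 GB

held in memory per worker thread, per pipeline. Logstash also starts one worker per CPU core by default, so the in-flight batches alone can easily blow past a 7 GB heap.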
I suggest reducing the batch size of your pipelines to fix the OutOfMemoryError.
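As a minimal sketch (assuming the default of 125 events is an acceptable starting point; the right value is something you would tune against your own data, not a definitive recommendation), your pipelines.yml could become:

- pipeline.id: rawclient
  path.config: "/usr/share/logstash/pipeline/logclient.conf"
  pipeline.batch.size: 125
- pipeline.id: rawweb
  path.config: "/usr/share/logstash/pipeline/logweb.conf"
  pipeline.batch.size: 125

You could also simply remove the pipeline.batch.size lines and let the default apply.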