Telegraf `docker_log` does not send all messages

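I want to use the Telegraf `inputs.docker_log` plugin to collect logs from several Go services into InfluxDB.

I noticed that InfluxDB does not contain all the records I see in `docker service logs`.

I wrote a simple Go program and ran several experiments with different delays.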

package main

import (
    "time"

    "github.com/labstack/gommon/log"
)

func main() {

    log.SetLevel(log.DEBUG)
    for i := 0; i < 5; i++ {
        writeSomething()
        <-time.After(10 * time.Second)
    }
}

func writeSomething() {
    delay := 50 * time.Millisecond
    log.Debug("A")
    <-time.After(delay)

    log.Debug("B")
    <-time.After(delay)

    log.Debug("C")
    <-time.After(delay)

}

The problem appears when the delay is less than 1 second.

For example, with delay = 50 ms:

`docker service logs` shows all 3 messages (A, B, C) 5 times:

liquineq_sender.1.0jvam7vg63cm@Ivan-Lenovo-ideapad    | {"time":"2019-09-26T13:47:17.115060311Z","level":"DEBUG","prefix":"-","file":"main.go","line":"20","message":"A"}
liquineq_sender.1.0jvam7vg63cm@Ivan-Lenovo-ideapad    | {"time":"2019-09-26T13:47:17.165383407Z","level":"DEBUG","prefix":"-","file":"main.go","line":"23","message":"B"}
liquineq_sender.1.0jvam7vg63cm@Ivan-Lenovo-ideapad    | {"time":"2019-09-26T13:47:17.21562549Z","level":"DEBUG","prefix":"-","file":"main.go","line":"26","message":"C"}

But some of them are missing from InfluxDB:

> select message from docker_log;
name: docker_log
time                message
----                -------
1569505640000000000 2019-09-26T13:47:18Z I! Starting Telegraf 1.12.2
1569505640000000000 t=2019-09-26T13:47:12+0000 lvl=info msg="Starting Grafana" logger=server version=6.3.0-pre commit=unknown-dev branch=master compiled=2019-06-21T08:57:10+0000
1569505640000000000 {"time":"2019-09-26T13:47:17.115060311Z","level":"DEBUG","prefix":"-","file":"main.go","line":"20","message":"A"}
1569505647000000000 {"time":"2019-09-26T13:47:27.265995126Z","level":"DEBUG","prefix":"-","file":"main.go","line":"20","message":"A"}
1569505657000000000 {"time":"2019-09-26T13:47:37.417263994Z","level":"DEBUG","prefix":"-","file":"main.go","line":"20","message":"A"}
1569505658000000000 {"time":"2019-09-26T13:47:37.517861048Z","level":"DEBUG","prefix":"-","file":"main.go","line":"26","message":"C"}
1569505668000000000 {"time":"2019-09-26T13:47:47.568291482Z","level":"DEBUG","prefix":"-","file":"main.go","line":"20","message":"A"}
1569505678000000000 {"time":"2019-09-26T13:47:57.71926933Z","level":"DEBUG","prefix":"-","file":"main.go","line":"20","message":"A"}

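If I increase the delay to 1 second, InfluxDB contains all the records.

Is there a way to configure Telegraf / InfluxDB to handle milliseconds as well?

I solved the problem by adding the following setting to the Telegraf config: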

[agent]
  precision = "100ns"

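Explanation:
Telegraf's accumulator has a `getTime` method, which rounds each log entry's timestamp to the configured precision. With the default precision, entries logged within the same second end up with identical timestamps. InfluxDB treats points that share a measurement, tag set, and timestamp as the same point, so only one of them is kept.

From AgentConfig: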

// By default or when set to "0s", precision will be set to the same
// timestamp order as the collection interval, with the maximum being 1s.
//   ie, when interval = "10s", precision will be "1s"
//       when interval = "250ms", precision will be "1ms"
// Precision will NOT be used for service inputs. It is up to each individual
// service input to set the timestamp at the appropriate precision.