How to monitor the throughput of a Heron cluster

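For some reason, I need to get the throughput of a Heron cluster, but there is no metrics UI in Heron. Do you have any ideas on how to monitor the throughput of a Heron cluster? Thanks.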

Running heron-explorer gives the following result:

yitian@heron01:~$ heron-explorer metrics aurora/yitian/devel SentenceWordCountTopology
[2018-08-03 21:02:09 +0000] [INFO]: Using tracker URL: http://127.0.0.1:8888
'spout' metrics:
container id           jvm-uptime-secs    jvm-process-cpu-load    jvm-memory-used-mb    emit-count    ack-count    fail-count
-------------------  -----------------  ----------------------  --------------------  ------------  -----------  ------------
container_3_spout_6               2053                0.253257                 146     1.13288e+07  1.13278e+07             0
container_4_spout_7               2091                0.150625                 137.5   1.1624e+07   1.16228e+07           231

'count' metrics:
container id            jvm-uptime-secs    jvm-process-cpu-load    jvm-memory-used-mb    emit-count    execute-count    ack-count    fail-count
--------------------  -----------------  ----------------------  --------------------  ------------  ---------------  -----------  ------------
container_6_count_12               2092                0.184742               155.167             0      4.6026e+07   4.6026e+07              0
container_5_count_9                2091                0.387867               146                 0      4.60069e+07  4.60069e+07             0
container_6_count_11               2092                0.184488               157.833             0      4.58158e+07  4.58158e+07             0
container_4_count_8                2091                0.443688               129.833             0      4.58722e+07  4.58722e+07             0
container_5_count_10               2091                0.382577               118.5               0      4.60091e+07  4.60091e+07             0

'split' metrics:
container id           jvm-uptime-secs    jvm-process-cpu-load    jvm-memory-used-mb    emit-count    execute-count    ack-count    fail-count
-------------------  -----------------  ----------------------  --------------------  ------------  ---------------  -----------  ------------
container_1_split_2               2091                0.143034               75.3333   4.59453e+07      4.59453e+06  4.59453e+06             0
container_3_split_5               2042                1.12248                79.1667   4.64862e+07      4.64862e+06  4.64862e+06             0
container_2_split_3               2150                0.139837               83.6667   4.59443e+07      4.59443e+06  4.59443e+06             0
container_1_split_1               2091                0.145702              104.167    4.59454e+07      4.59454e+06  4.59454e+06             0
container_2_split_4               2150                0.138453              106.333    4.59443e+07      4.59443e+06  4.59443e+06             0
[2018-08-03 21:02:09 +0000] [INFO]: Elapsed time: 0.031s.

You can use the execute-count of your sink component to measure the output of your topology. If every component has a 1:1 input:output ratio, then this is your throughput.

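However, if you are windowing tuples into batches or splitting tuples (for example, separating sentences into individual words), things get a little more complicated. You can get the input into your topology by looking at the emit-count of your spout components. You can then compare this with the execute-counts of your bolts to create your own throughput metric.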
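As a rough illustration of that comparison, here is a minimal sketch that uses the counts from the heron-explorer output in the question. It assumes that all the counts cover the same time window, and the helper name `ratio` is made up for this example.

```python
# Counts copied from the heron-explorer output in the question.
spout_emit = [1.13288e+07, 1.1624e+07]
split_execute = [4.59453e+06, 4.64862e+06, 4.59443e+06, 4.59454e+06, 4.59443e+06]
split_emit = [4.59453e+07, 4.64862e+07, 4.59443e+07, 4.59454e+07, 4.59443e+07]
count_execute = [4.6026e+07, 4.60069e+07, 4.58158e+07, 4.58722e+07, 4.60091e+07]


def ratio(numerator_counts, denominator_counts):
    """Sum each list of per-instance counts and return numerator / denominator."""
    return sum(numerator_counts) / sum(denominator_counts)


# Fan-out of the split bolt: tuples emitted per tuple executed.
split_fan_out = ratio(split_emit, split_execute)

# End-to-end ratio: tuples executed by the sink (count) per tuple emitted by the spouts.
sink_per_spout_tuple = ratio(count_execute, spout_emit)

print(f"split fan-out: {split_fan_out:.2f}")
print(f"sink executes per spout emit: {sink_per_spout_tuple:.2f}")
```

With a ratio like this you can decide whether to report throughput in spout tuples (sentences) or in sink tuples (words), and convert between the two.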

An easy way to get programmatic access to these metrics is via the Heron Tracker REST API. You can use your chosen language's HTTP library (like Requests for Python) to query the last 3 hours of data for a running topology. If you require more than 3 hours of data (the maximum stored by the topology's TMaster), you will need to use one of the other metrics sinks to send the metrics to an external database. Heron currently provides sinks for saving to a local file, Graphite, or Prometheus. InfluxDB support is in the works.
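A minimal sketch of such a query is below, using only the Python standard library (Requests would work just as well). The tracker URL and the cluster, role, environment and topology names are taken from the heron-explorer call in the question. The `/topologies/metrics` path, its query parameter names, the `__emit-count/default` metric name and the response layout are my assumptions about the Tracker API, so check them against the Tracker documentation for your Heron version. The helper names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

TRACKER_URL = "http://127.0.0.1:8888"  # from the heron-explorer log line


def metrics_url(component, metric_names, interval_secs,
                cluster="aurora", role="yitian", environ="devel",
                topology="SentenceWordCountTopology"):
    """Build a metrics query URL (endpoint and parameter names are assumed)."""
    params = [
        ("cluster", cluster),
        ("role", role),
        ("environ", environ),
        ("topology", topology),
        ("component", component),
        ("interval", interval_secs),
    ]
    # One metricname parameter per requested metric.
    params += [("metricname", name) for name in metric_names]
    return TRACKER_URL + "/topologies/metrics?" + urlencode(params)


def fetch(url):
    """GET the URL and decode the JSON body."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def total_per_second(payload, metric_name, interval_secs):
    """Sum a metric over all instances and divide by the interval length.

    Assumes a payload shaped like
    {"result": {"metrics": {metric_name: {instance_id: value, ...}}}}.
    """
    per_instance = payload["result"]["metrics"].get(metric_name, {})
    return sum(float(value) for value in per_instance.values()) / interval_secs


# Example use against a running tracker (not executed here):
#   url = metrics_url("spout", ["__emit-count/default"], 60)
#   print(total_per_second(fetch(url), "__emit-count/default", 60))
```

Running the same query for the sink component's execute-count gives you the output side, so you can poll both on a schedule and log the results yourself.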