How to send Spark metrics to Graphite on a Standalone cluster?

I am trying to send Spark metrics to Graphite using the following configuration:

*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=85.10.206.170
*.sink.graphite.port=2003
*.sink.graphite.period=1
*.sink.graphite.unit=minutes

# Enable JVM source for the master, worker, driver and executor instances
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource
application.source.jvm.class=org.apache.spark.metrics.source.JvmSource

This file is saved at /data/configurations/metrics.properties.
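For context, the GraphiteSink configured above ships each datapoint over Graphite's plaintext protocol: one `path value timestamp` line per metric, sent over TCP to the configured port (2003 here). A minimal Python sketch of that wire format (the `send_graphite` helper is illustrative, not part of Spark):

```python
import socket
import time

def send_graphite(host, port, path, value, timestamp=None):
    """Send one datapoint using Graphite's plaintext protocol:
    a single '<path> <value> <timestamp>\n' line over TCP."""
    ts = int(timestamp if timestamp is not None else time.time())
    line = f"{path} {value} {ts}\n"
    with socket.create_connection((host, port)) as sock:
        sock.sendall(line.encode("utf-8"))
```

Knowing the wire format makes it easy to verify connectivity by hand (e.g. sending a test metric to port 2003) before debugging the Spark side.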

I submit my application with these properties:

--files=/data/configuration/metrics.properties --conf spark.metrics.conf=metrics.properties

I get the following error:

com.test.MyApp: metrics.properties (No such file or directory)
 java.io.FileNotFoundException: metrics.properties (No such file or directory)
    at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_45]
    at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_45]
    at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[?:1.8.0_45]
    at java.io.FileInputStream.<init>(FileInputStream.java:93) ~[?:1.8.0_45]
    at org.apache.spark.metrics.MetricsConfig$$anonfun.apply(MetricsConfig.scala:50) ~[spark-assembly-1.4.1-hadoop2.4.0.jar:1.4.1]
    at org.apache.spark.metrics.MetricsConfig$$anonfun.apply(MetricsConfig.scala:50) ~[spark-assembly-1.4.1-hadoop2.4.0.jar:1.4.1]
    at scala.Option.map(Option.scala:145) ~[spark-assembly-1.4.1-hadoop2.4.0.jar:1.4.1]
    at org.apache.spark.metrics.MetricsConfig.initialize(MetricsConfig.scala:50) ~[spark-assembly-1.4.1-hadoop2.4.0.jar:1.4.1]
    at org.apache.spark.metrics.MetricsSystem.<init>(MetricsSystem.scala:93) ~[spark-assembly-1.4.1-hadoop2.4.0.jar:1.4.1]
    at org.apache.spark.metrics.MetricsSystem$.createMetricsSystem(MetricsSystem.scala:222) ~[spark-assembly-1.4.1-hadoop2.4.0.jar:1.4.1]
    at org.apache.spark.SparkEnv$.create(SparkEnv.scala:361) ~[spark-assembly-1.4.1-hadoop2.4.0.jar:1.4.1]
    at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:188) ~[spark-assembly-1.4.1-hadoop2.4.0.jar:1.4.1]
    at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:267) ~[spark-assembly-1.4.1-hadoop2.4.0.jar:1.4.1]
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:424) ~[spark-assembly-1.4.1-hadoop2.4.0.jar:1.4.1]
    at org.apache.spark.streaming.StreamingContext$.createNewSparkContext(StreamingContext.scala:842) ~[spark-assembly-1.4.1-hadoop2.4.0.jar:1.4.1]
    at org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:80) ~[spark-assembly-1.4.1-hadoop2.4.0.jar:1.4.1]
    at org.apache.spark.streaming.api.java.JavaStreamingContext.<init>(JavaStreamingContext.scala:133) ~[spark-assembly-1.4.1-hadoop2.4.0.jar:1.4.1]

What am I doing wrong?

tl;dr: spark.metrics.conf should be an absolute path.
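In other words, both --files and spark.metrics.conf should point at the full path, not a bare file name. A sketch of assembling the corrected submit command (the path is the one from the question; building the argument list in Python is purely illustrative):

```python
import os

# Absolute path to the metrics config on the submitting machine.
# (Assumption: the file exists at the same path on every node.)
metrics_conf = "/data/configurations/metrics.properties"

# spark.metrics.conf is read directly as a file path by the driver
# and executors, so it must be absolute rather than "metrics.properties".
submit_args = [
    "spark-submit",
    "--files", metrics_conf,
    "--conf", f"spark.metrics.conf={metrics_conf}",
]

print(" ".join(submit_args))
```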

NOTE: The asterisk (*) refers to any metrics instance available in Spark, be it driver, executor, external shuffleService, master, applications, worker, or mesos_cluster.

Tip: You can access a service's metrics through its web UI port, e.g. 4040 for the driver and 8080 for Spark Standalone's master and applications, using the http://localhost:[port]/metrics/json/ URL.
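As a sketch, that JSON endpoint can be polled with a few lines of Python (`fetch_metrics` is a hypothetical helper; port 4040 assumes a running driver UI on localhost):

```python
import json
from urllib.request import urlopen

def fetch_metrics(port, host="localhost"):
    """Fetch the metrics snapshot exposed at /metrics/json/
    (e.g. port 4040 for the driver, 8080 for the standalone master)."""
    with urlopen(f"http://{host}:{port}/metrics/json/") as resp:
        return json.load(resp)
```

Once the application is up, `fetch_metrics(4040)` should return the driver's metrics registry as a dict, which is a quick way to confirm the sources are registered before checking Graphite.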