Output Spark application name in driver log
I need to output the Spark application name (spark.app.name) in every line of the driver log, along with the other attributes such as the message and the date.
So far I have not been able to find the right log4j configuration or any other hint.
How can this be done? Any help would be greatly appreciated.
I am using Spark standalone mode.
One approach that seems to work involves the following two steps:
Create your custom log4j.properties file and change the layout:
...
# this is just an example layout config
# remember the rest of the configuration
# note: MM is the month; lowercase mm would render the minutes instead
log4j.appender.stdout.layout.ConversionPattern=${appName}--%d{yyyy-MM-dd HH:mm:ss,SSS} [%-5p] [%c] - %m%n
This file must be at the root of your classpath (for most build tools, that means src/main/resources), or you can instead edit <spark-home>/conf/log4j.properties on the servers in your cluster.
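The snippet above deliberately omits the rest of the configuration. For reference, a minimal complete log4j.properties built around this layout could look like the sketch below; the appender name (stdout), target and root level are assumptions for illustration, not Spark's shipped defaults:
# minimal sketch of a complete log4j.properties using the layout above
log4j.rootCategory=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
# ${appName} is resolved from the "appName" system property set by the driver
log4j.appender.stdout.layout.ConversionPattern=${appName}--%d{yyyy-MM-dd HH:mm:ss,SSS} [%-5p] [%c] - %m%n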
Then, before bootstrapping your Spark context, set a system property under the key referenced in that pattern:
// set the property referenced as ${appName} before the Spark context starts
System.setProperty("appName", "application-name");
SparkSession spark = SparkSession.builder().appName("application-name")
...
In my quick test, the code above produced something like the following on every line (tested in local mode):
application-name--2020-53-06 16:53:35,741 [INFO ] [org.apache.spark.SparkContext] - Running Spark version 2.4.4
application-name--2020-53-06 16:53:36,032 [WARN ] [org.apache.hadoop.util.NativeCodeLoader] - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
application-name--2020-53-06 16:53:36,316 [INFO ] [org.apache.spark.SparkContext] - Submitted application: JavaWordCount
application-name--2020-53-06 16:53:36,413 [INFO ] [org.apache.spark.SecurityManager] - Changing view acls to: ernest
application-name--2020-53-06 16:53:36,414 [INFO ] [org.apache.spark.SecurityManager] - Changing modify acls to: ernest
application-name--2020-53-06 16:53:36,415 [INFO ] [org.apache.spark.SecurityManager] - Changing view acls groups to:
application-name--2020-53-06 16:53:36,415 [INFO ] [org.apache.spark.SecurityManager] - Changing modify acls groups to:
application-name--2020-53-06 16:53:36,416 [INFO ] [org.apache.spark.SecurityManager] - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ernest); groups with view permissions: Set(); users with modify permissions: Set(ernest); groups with modify permissions: Set()
application-name--2020-53-06 16:53:36,904 [INFO ] [org.apache.spark.util.Utils] - Successfully started service 'sparkDriver' on port 33343.
application-name--2020-53-06 16:53:36,934 [INFO ] [org.apache.spark.SparkEnv] - Registering MapOutputTracker
...
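Putting the two steps together, a complete minimal driver could look roughly like the sketch below; the package, class name and the trivial job are assumptions for illustration, not the code used for the test above:
package com.example;  // hypothetical package name

import java.util.Arrays;

import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SparkSession;

public class AppNameLoggingDriver {  // hypothetical class name
    public static void main(String[] args) {
        // must run before the Spark context is bootstrapped, so the
        // "appName" system property is visible when log4j resolves ${appName}
        System.setProperty("appName", "application-name");

        SparkSession spark = SparkSession.builder()
                .appName("application-name")
                .master("local[*]")  // assumption: local mode, as in the quick test; drop this when submitting to a cluster
                .getOrCreate();

        // trivial action, just so the driver emits a few more log lines
        long count = spark.createDataset(Arrays.asList("a", "b", "c"), Encoders.STRING()).count();
        System.out.println("count = " + count);

        spark.stop();
    }
}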
Rather than setting the variable manually in code, you may prefer to invoke spark-submit with something like:
--conf 'spark.driver.extraJavaOptions=-DappName=application-name'
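For illustration, a full invocation could look roughly like this; the master URL, class name and jar path are placeholders:
spark-submit \
  --master spark://master-host:7077 \
  --conf 'spark.driver.extraJavaOptions=-DappName=application-name' \
  --class com.example.AppNameLoggingDriver \
  my-application.jar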
For a more permanent change, you may want to edit <spark-home>/conf/log4j.properties (copying the template if the file does not exist yet), change the layout there, and then invoke spark-submit/spark-shell, etc. with the system property.
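A sketch of that workflow, assuming a Spark 2.x distribution that ships the template in conf/:
# copy the shipped template once, then edit the ConversionPattern as shown above
cp $SPARK_HOME/conf/log4j.properties.template $SPARK_HOME/conf/log4j.properties
# pass the system property on each invocation
spark-shell --conf 'spark.driver.extraJavaOptions=-DappName=my-interactive-session'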