How to increase AM container size in spark-submit command? ERROR: container is running beyond physical memory limits

I am trying to run a Spark application over some data on AWS. I was able to process the full dataset using 20 m4.large machines. Now I am trying the same thing with c4.8xlarge machines, but I get the following error:

    AM Container for appattempt_1570270970620_0001_000001 exited with exitCode: -104
    Failing this attempt.Diagnostics: Container [pid=12140,containerID=container_1570270970620_0001_01_000001] is running beyond physical memory limits. Current usage: 1.4 GB of 1.4 GB physical memory used; 3.5 GB of 6.9 GB virtual memory used. Killing container.
    {...}
    Container killed on request. Exit code is 143
    Container exited with a non-zero exit code 143

The command I am using to submit the job to the cluster is:

    spark-submit --deploy-mode cluster --class xyzpackage.xyzclass --master yarn --jar s3://path/xyz_2.11-1.0.jar --arg s3://path_to_files/xy.csv --arg s3://output_path/newfile

When the application starts, I see the following:

    19/10/05 10:42:04 INFO RMProxy: Connecting to ResourceManager at ip-172-31-30-66.us-east-2.compute.internal/172.31.30.66:8032
    19/10/05 10:42:04 INFO Client: Requesting a new application from cluster with 20 NodeManagers
    19/10/05 10:42:04 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (53248 MB per container)
    19/10/05 10:42:04 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
    19/10/05 10:42:04 INFO Client: Setting up container launch context for our AM
    19/10/05 10:42:04 INFO Client: Setting up the launch environment for our AM container
    19/10/05 10:42:04 INFO Client: Preparing resources for our AM container

The AM container is allocated 1408 MB (~1.4 GB) of memory, hence the error. How can I increase the AM container memory? I tried the following, but it did not work:

    spark-submit --deploy-mode cluster --class xyzpackage.xyzclass --master yarn  --conf spark.yarn.executor.memoryOverhead=8000 --driver-memory=91G --jar s3://path/xyz_2.11-1.0.jar --arg s3://path_to_files/* --arg s3://output_path/newfile

How should I edit this command to increase the AM container size?
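As a sanity check, the 1408 MB figure from the log can be reproduced from Spark's documented defaults: in `cluster` deploy mode the driver runs inside the AM container, so the AM request is `spark.driver.memory` (default 1024 MB) plus the driver memory overhead, which defaults to max(384 MB, 10% of driver memory). The 1024 and 384 values below are those defaults, not numbers taken from the command above:

```shell
# AM container size in cluster mode = driver memory + driver memory overhead
# defaults: spark.driver.memory = 1024 MB, overhead = max(384, 0.10 * 1024) = 384 MB
echo $(( 1024 + 384 ))   # prints 1408
```

This is why raising executor settings alone does not help here: it is the driver-side memory (and its overhead) that determines the AM container size.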

I found the mistake in my command. The correct `--conf` settings for changing the executor and overhead memory are:

    spark-submit --deploy-mode cluster --class xyzpackage.xyzclass --master yarn  --conf spark.driver.memoryOverhead=2048 --conf spark.executor.memoryOverhead=2048 --jar s3://path/xyz_2.11-1.0.jar --arg s3://path_to_files/* --arg s3://output_path/newfile

But if you want to change the executor/driver memory directly, `spark.executor.memory`/`spark.driver.memory` is the better option; leave `memoryOverhead` at its default of 10% of the executor/driver memory.
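Putting that together, here is a sketch of a full submit command (same placeholder class, jar, and S3 paths as above; the `4g` values are illustrative, not tuned for any particular instance type). Note that with plain `spark-submit` the application jar and its arguments are normally passed positionally at the end of the command, rather than via `--jar`/`--arg`:

```shell
# Sketch: in cluster mode the driver runs inside the AM container, so
# raising spark.driver.memory (plus its ~10% overhead) raises the AM
# container size. 4g is a placeholder value.
spark-submit \
  --deploy-mode cluster \
  --master yarn \
  --class xyzpackage.xyzclass \
  --conf spark.driver.memory=4g \
  --conf spark.executor.memory=4g \
  s3://path/xyz_2.11-1.0.jar \
  s3://path_to_files/xy.csv s3://output_path/newfile
```

With these settings YARN would request roughly 4 GB + 410 MB overhead for the AM container, well above the 1.4 GB that was being exceeded.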