Dask AWS cluster error when initializing: User data is limited to 16384 bytes

I am following the guide here: https://cloudprovider.dask.org/en/latest/packer.html#ec2cluster-with-rapids

Specifically, I set up my instance with Packer and am now trying to run the final piece of code:

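from dask.distributed import Client
from dask_cloudprovider.aws import EC2Cluster

# pack_ami holds the AMI ID produced by the Packer build (defined elsewhere)
cluster = EC2Cluster(
    ami=pack_ami,  # AMI ID provided by Packer
    region="eu-west-2",
    docker_image="rapidsai/rapidsai:cuda10.1-runtime-ubuntu18.04-py3.8",
    instance_type="p3.2xlarge",
    bootstrap=False,
    filesystem_size=120,
)
cluster.scale(1)
client = Client(cluster)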

Note that I had to add the region to stop it from complaining. Unfortunately, I now get this error:

botocore.exceptions.ClientError: An error occurred (InvalidParameterValue) when calling the RunInstances operation: User data is limited to 16384 bytes

Creating scheduler instance.

Full traceback:

Creating scheduler instance
Traceback (most recent call last):
  File "tpotmodel.py", line 124, in <module>
    main()
  File "tpotmodel.py", line 83, in main
    bootstrap=False,
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/dask_cloudprovider/aws/ec2.py", line 474, in __init__
    super().__init__(**kwargs)
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/dask_cloudprovider/generic/vmcluster.py", line 284, in __init__
    super().__init__(**kwargs, security=self.security)
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/distributed/deploy/spec.py", line 281, in __init__
    self.sync(self._start)
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/distributed/deploy/cluster.py", line 189, in sync
    return sync(self.loop, func, *args, **kwargs)
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/distributed/utils.py", line 340, in sync
    raise exc.with_traceback(tb)
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/distributed/utils.py", line 324, in f
    result[0] = yield future
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/dask_cloudprovider/generic/vmcluster.py", line 324, in _start
    await super()._start()
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/distributed/deploy/spec.py", line 309, in _start
    self.scheduler = await self.scheduler
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/distributed/deploy/spec.py", line 71, in _
    await self.start()
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/dask_cloudprovider/generic/vmcluster.py", line 86, in start
    ip = await self.create_vm()
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/dask_cloudprovider/aws/ec2.py", line 139, in create_vm
    response = await client.run_instances(**vm_kwargs)
  File "/home/simon/.conda/envs/tpot-cuml/lib/python3.7/site-packages/aiobotocore/client.py", line 154, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidParameterValue) when calling the RunInstances operation: User data is limited to 16384 bytes

Everything is installed with conda, in case that makes any difference.

The Dask community is tracking this issue at github.com/dask/dask-cloudprovider/issues/249, with a potential solution in github.com/dask/distributed/pull/4465. PR 4465 should resolve the problem.
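In the meantime, if you want to see how close a given user-data payload is to the limit, a rough check like the one below may help. This is only a minimal sketch: the 16384-byte figure is taken from the error message above, the helper names `user_data_size` and `fits_in_user_data` are hypothetical, and how you get hold of the script that dask-cloudprovider actually sends is not covered here. It is a diagnostic, not the fix from the PR.

```python
# Limit in bytes, taken from the RunInstances error message in the question.
EC2_USER_DATA_LIMIT = 16384


def user_data_size(user_data: str) -> int:
    """Return the size of the user-data text in bytes (UTF-8 encoded)."""
    return len(user_data.encode("utf-8"))


def fits_in_user_data(user_data: str, limit: int = EC2_USER_DATA_LIMIT) -> bool:
    """Return True if the payload is at or below the given byte limit."""
    return user_data_size(user_data) <= limit


if __name__ == "__main__":
    # Hypothetical payload, used purely for illustration.
    sample = "#cloud-config\nruncmd:\n  - echo hello\n"
    print(user_data_size(sample), fits_in_user_data(sample))
```

The size is measured in bytes rather than characters, because non-ASCII characters take more than one byte once encoded.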