AWS Sagemaker - ClientError: Data download failed
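Question:
I'm trying to set up a model in SageMaker, but it fails while downloading the data.
Does anyone know what I'm doing wrong?
What I've done so far:
To rule out mistakes on my side, I decided to use the AWS tutorial:
tensorflow_iris_dnn_classifier_using_estimators
I made only two changes:
- I copied the dataset to my own S3 bucket. --> I tested that I can access/display the data, and it works.
- I edited the path to point to the new folder.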
%%time
import boto3
# use the region-specific sample data bucket
region = boto3.Session().region_name
#train_data_location = 's3://sagemaker-sample-data-{}/tensorflow/iris'.format(region)
train_data_location = 's3://my-s3-bucket'
iris_estimator.fit(train_data_location)
This is the error I get:
/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/IPython/core/interactiveshell.pyc in run_cell_magic(self, magic_name, line, cell)
2115 magic_arg_s = self.var_expand(line, stack_depth)
2116 with self.builtin_trap:
-> 2117 result = fn(magic_arg_s, cell)
2118 return result
2119
<decorator-gen-60> in time(self, line, cell, local_ns)
/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/IPython/core/magic.pyc in <lambda>(f, *a, **k)
186 # but it's overkill for just that one bit of state.
187 def magic_deco(arg):
--> 188 call = lambda f, *a, **k: f(*a, **k)
189
190 if callable(arg):
/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/IPython/core/magics/execution.pyc in time(self, line, cell, local_ns)
1191 else:
1192 st = clock2()
-> 1193 exec(code, glob, local_ns)
1194 end = clock2()
1195 out = None
<timed exec> in <module>()
/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/sagemaker/tensorflow/estimator.pyc in fit(self, inputs, wait, logs, job_name, run_tensorboard_locally)
314 tensorboard.join()
315 else:
--> 316 fit_super()
317
318 @classmethod
/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/sagemaker/tensorflow/estimator.pyc in fit_super()
293
294 def fit_super():
--> 295 super(TensorFlow, self).fit(inputs, wait, logs, job_name)
296
297 if run_tensorboard_locally and wait is False:
/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/sagemaker/estimator.pyc in fit(self, inputs, wait, logs, job_name)
232 self.latest_training_job = _TrainingJob.start_new(self, inputs)
233 if wait:
--> 234 self.latest_training_job.wait(logs=logs)
235
236 def _compilation_job_name(self):
/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/sagemaker/estimator.pyc in wait(self, logs)
571 def wait(self, logs=True):
572 if logs:
--> 573 self.sagemaker_session.logs_for_job(self.job_name, wait=True)
574 else:
575 self.sagemaker_session.wait_for_job(self.job_name)
/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/sagemaker/session.pyc in logs_for_job(self, job_name, wait, poll)
1126
1127 if wait:
-> 1128 self._check_job_status(job_name, description, 'TrainingJobStatus')
1129 if dot:
1130 print()
/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/sagemaker/session.pyc in _check_job_status(self, job, desc, status_key_name)
826 reason = desc.get('FailureReason', '(No reason provided)')
827 job_type = status_key_name.replace('JobStatus', ' job')
--> 828 raise ValueError('Error for {} {}: {} Reason: {}'.format(job_type, job, status, reason))
829
830 def wait_for_endpoint(self, endpoint, poll=5):
ValueError: Error for Training job sagemaker-tensorflow-2019-01-03-16-32-16-435: Failed Reason: ClientError: Data download failed:S3 key: s3://my-s3-bucket//sagemaker-tensorflow-2019-01-03-14-02-39-959/source/sourcedir.tar.gz has an illegal char sub-sequence '//' in it
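The script expects 'bucket' to be either bucket = Session().default_bucket() or a bucket of your own. Have you tried setting bucket to your own bucket?
The full error message you're getting appears to be: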
ClientError: Data download failed:S3 key: s3://my-s3-bucket//sagemaker-tensorflow-2019-01-03-14-02-39-959/source/sourcedir.tar.gz has an illegal char sub-sequence '//' in it
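Does the problem persist after you fix the key?
I had something similar. I had to change only the name of the output, with nothing in front of it; otherwise it gave me the double '//' error. So just use 'my-s3-bucket'.
No. Make sure it is just your output name and not the bucket name. Mine was 'vanias bucket/results'; I changed it to 'results' and it worked. Good luck!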
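For anyone hitting the same '//' message, here is a minimal sketch of one way to build and check S3 paths before passing them to the estimator. It assumes the stray slash comes from a trailing or leading '/' in a bucket or prefix string. The helper names join_s3_uri and has_double_slash are hypothetical and are not part of the SageMaker SDK.

```python
def join_s3_uri(*parts):
    """Join bucket and key parts into one S3 URI with no empty segments.

    Hypothetical helper: drops a leading 's3://' from any part, then
    discards the empty segments created by leading, trailing, or
    doubled slashes.
    """
    segments = []
    for part in parts:
        if part.startswith("s3://"):
            part = part[len("s3://"):]
        segments.extend(seg for seg in part.split("/") if seg)
    return "s3://" + "/".join(segments)


def has_double_slash(s3_uri):
    """Return True if the URI contains '//' anywhere after the scheme."""
    if s3_uri.startswith("s3://"):
        s3_uri = s3_uri[len("s3://"):]
    return "//" in s3_uri


# The key from the error message above fails the check.
bad = ("s3://my-s3-bucket//sagemaker-tensorflow-2019-01-03-14-02-39-959"
       "/source/sourcedir.tar.gz")
print(has_double_slash(bad))

# Rebuilding the path from its parts removes the empty segment.
good = join_s3_uri("s3://my-s3-bucket/",
                   "/sagemaker-tensorflow-2019-01-03-14-02-39-959",
                   "source/sourcedir.tar.gz")
print(good)
```

Running a check like this on every S3 string you hand to the estimator (training data location, output path, code location) should show which one introduces the empty path segment.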