Reading a file from S3 into SageMaker on AWS gives a 403 Forbidden error, but other operations work on the file
This command:
import pandas as pd
BUCKET_TO_READ = 'my-bucket'
FILE_TO_READ = 'myFile'
data_location = 's3://{}/{}'.format(BUCKET_TO_READ, FILE_TO_READ)
df = pd.read_csv(data_location)
fails with the error
ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden
and I can't figure out why. As far as I can tell, this should work.
Here are my permissions on the bucket:
"Action": [
"s3:ListMultipartUploadParts",
"s3:ListBucket",
"s3:GetObjectVersionTorrent",
"s3:GetObjectVersionTagging",
"s3:GetObjectVersionAcl",
"s3:GetObjectVersion",
"s3:GetObjectTorrent",
"s3:GetObjectTagging",
"s3:GetObjectAcl",
"s3:GetObject"
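These commands work as expected:
import boto3
from sagemaker import get_execution_role

role = get_execution_role()
region = boto3.Session().region_name
print(role)
print(region)
s3 = boto3.resource('s3')
bucket = s3.Bucket(BUCKET_TO_READ)
print(bucket.creation_date)
# Take the key of the first object listed in the bucket
for my_bucket_object in bucket.objects.all():
    print(my_bucket_object)
    FILE_TO_READ = my_bucket_object.key
    break
obj = s3.Object(BUCKET_TO_READ, FILE_TO_READ)
print(obj)
All of these print statements run fine.
I'm not sure whether it matters, but every file is inside a folder, so my FILE_TO_READ looks like folder/file.
This command, which should download the file to SageMaker, also fails with a 403: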
import boto3
s3 = boto3.resource('s3')
s3.Object(BUCKET_TO_READ, FILE_TO_READ).download_file(FILE_TO_READ)
The same thing happens when I open a terminal and run:
aws s3 cp AWSURI local_file_name
Try separating the object-level operations from the bucket-level operations in your IAM policy, like this:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObjectVersionTorrent",
                "s3:GetObjectVersionTagging",
                "s3:GetObjectVersionAcl",
                "s3:GetObjectVersion",
                "s3:GetObjectTorrent",
                "s3:GetObjectTagging",
                "s3:GetObjectAcl",
                "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::bucket-name/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListMultipartUploadParts",
                "s3:ListBucket"
            ],
            "Resource": "arn:aws:s3:::bucket-name"
        }
    ]
}
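If you would rather generate a policy of this shape than write it by hand, here is a minimal sketch. The function name build_read_policy and the bucket_level_actions parameter are made up for illustration, and which actions count as bucket-level is an assumption the caller supplies (the default only lists s3:ListBucket); nothing here calls AWS.

```python
import json

def build_read_policy(bucket_name, actions, bucket_level_actions=("s3:ListBucket",)):
    """Build an IAM policy dict with separate object-level and bucket-level statements.

    bucket_level_actions is an assumption supplied by the caller: every action
    listed there is attached to the bucket ARN, everything else to the objects.
    """
    bucket_level = set(bucket_level_actions)
    bucket_actions = sorted(a for a in actions if a in bucket_level)
    object_actions = sorted(a for a in actions if a not in bucket_level)

    statements = []
    if object_actions:
        # Object-level actions are granted on the objects: bucket-name/*
        statements.append({
            "Effect": "Allow",
            "Action": object_actions,
            "Resource": "arn:aws:s3:::{}/*".format(bucket_name),
        })
    if bucket_actions:
        # Bucket-level actions are granted on the bucket itself: bucket-name
        statements.append({
            "Effect": "Allow",
            "Action": bucket_actions,
            "Resource": "arn:aws:s3:::{}".format(bucket_name),
        })
    return {"Version": "2012-10-17", "Statement": statements}

if __name__ == "__main__":
    policy = build_read_policy("bucket-name", ["s3:GetObject", "s3:ListBucket"])
    print(json.dumps(policy, indent=4))
```

The printed JSON can be pasted into the IAM console; review it before attaching it to the SageMaker execution role.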
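The reason is that the permissions were granted on the bucket rather than on the objects in it. That grants "Resource": "arn:aws:s3:::bucket-name/" but not "Resource": "arn:aws:s3:::bucket-name/*", and object-level actions such as s3:GetObject (which is also what authorizes the failing HeadObject call) have to be allowed on the object ARNs.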
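To see why the trailing /* matters, here is a rough sketch that compares resource patterns against the ARN of an object stored under a folder. The helpers object_arn and resource_covers are hypothetical, and fnmatchcase from the standard library is only an approximation of IAM's wildcard matching (it is close enough for "*", which matches any run of characters including "/", but it also gives "[" a special meaning that IAM does not).

```python
from fnmatch import fnmatchcase

def object_arn(bucket, key):
    """Return the ARN of an object; the key keeps its folder prefix."""
    return "arn:aws:s3:::{}/{}".format(bucket, key)

def resource_covers(resource_pattern, arn):
    """Approximate check of whether a policy Resource pattern matches an ARN."""
    return fnmatchcase(arn, resource_pattern)

if __name__ == "__main__":
    arn = object_arn("bucket-name", "folder/file")
    for pattern in ("arn:aws:s3:::bucket-name",
                    "arn:aws:s3:::bucket-name/",
                    "arn:aws:s3:::bucket-name/*"):
        print(pattern, resource_covers(pattern, arn))
```

Only the last pattern covers folder/file, which fits the symptoms in the question: listing the bucket works, while reading any object in it returns 403.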