AWS uploading file into wrong bucket
I am using AWS SageMaker and trying to upload a data folder from SageMaker to S3. What I am trying to do is upload my data to the s3_train_data directory (the directory exists in S3). However, instead of uploading into that bucket, it uploads into the default bucket that was created and then creates a new folder path built from the s3_train_data variable.
The code entered in the notebook:
import os
import sagemaker
from sagemaker import get_execution_role
sagemaker_session = sagemaker.Session()
role = get_execution_role()
bucket = <bucket name>
prefix = <folders1/folders2>
key = <input>
s3_train_data = 's3://{}/{}/{}/'.format(bucket, prefix, key)
#path 'data' is the folder in the Jupyter Instance, contains all the training data
inputs = sagemaker_session.upload_data(path= 'data', key_prefix= s3_train_data)
Is the problem in my code, or in the way I created the notebook?
You can look at the sample notebooks for how to upload data to an S3 bucket. There are many ways; I am just giving you a hint toward the answer.
Also, you forgot to create a boto3 session to access the S3 bucket.
Here is one way to do it.
import os
import urllib.request
import boto3

def download(url):
    # Download the file only if it is not already present locally.
    filename = url.split("/")[-1]
    if not os.path.exists(filename):
        urllib.request.urlretrieve(url, filename)

def upload_to_s3(channel, file):
    # `bucket` is your target S3 bucket name, defined elsewhere.
    s3 = boto3.resource('s3')
    key = channel + '/' + file
    with open(file, "rb") as data:
        s3.Bucket(bucket).put_object(Key=key, Body=data)

# caltech-256
download('http://data.mxnet.io/data/caltech-256/caltech-256-60-train.rec')
upload_to_s3('train', 'caltech-256-60-train.rec')
download('http://data.mxnet.io/data/caltech-256/caltech-256-60-val.rec')
upload_to_s3('validation', 'caltech-256-60-val.rec')
Another way:
import io
import os
import boto3
import sagemaker.amazon.common as smac

bucket = '<your_s3_bucket_name_here>'  # enter your s3 bucket where you will copy data and model artifacts
prefix = 'sagemaker/breast_cancer_prediction'  # place to upload training files within the bucket
# do some processing (producing train_X, train_y, train_file), then prepare to push the data
f = io.BytesIO()
smac.write_numpy_to_dense_tensor(f, train_X.astype('float32'), train_y.astype('float32'))
f.seek(0)
boto3.Session().resource('s3').Bucket(bucket).Object(os.path.join(prefix, 'train', train_file)).upload_fileobj(f)
YouTube link: https://www.youtube.com/watch?v=-YiHPIGyFGo - how to pull data from an S3 bucket.