Upload files to Google Cloud Storage Bucket from Google Cloud Datalab using Python API
I am trying to upload files from a Datalab instance, within the notebook itself, to my Google Cloud Storage bucket using the Python API, but I can't figure it out. The code example Google provides in its documentation doesn't seem to work in Datalab. I am currently using the gsutil command, but would like to understand how to do this with the Python API.
File directory (I want to upload the Python files located in the checkpoints folder):
!ls -R
.:
checkpoints README.md tpot_model.ipynb
./checkpoints:
pipeline_2020.02.29_00-22-17.py pipeline_2020.02.29_06-33-25.py
pipeline_2020.02.29_00-58-04.py pipeline_2020.02.29_07-13-35.py
pipeline_2020.02.29_02-00-52.py pipeline_2020.02.29_08-45-23.py
pipeline_2020.02.29_02-31-57.py pipeline_2020.02.29_09-16-41.py
pipeline_2020.02.29_03-02-51.py pipeline_2020.02.29_11-13-00.py
pipeline_2020.02.29_05-01-17.py
Current code:
import google.datalab.storage as storage
from pathlib import Path

bucket = storage.Bucket('machine_learning_data_bucket')

for file in Path('').rglob('*.py'):
    # API CODE GOES HERE
Current working solution:
!gsutil cp checkpoints/*.py gs://machine_learning_data_bucket
This is the code that worked for me:
from google.cloud import storage
from pathlib import Path

storage_client = storage.Client()
bucket = storage_client.bucket('bucket')

for file in Path('/home/jupyter/folder').rglob('*.py'):
    # Create a blob named after the file and upload the local file into it
    blob = bucket.blob(file.name)
    blob.upload_from_filename(str(file))
    print("File {} uploaded to {}.".format(file.name, bucket.name))
Output:
File file1.py uploaded to bucket.
File file2.py uploaded to bucket.
File file3.py uploaded to bucket.
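
Note that file.name keeps only the final path component, so every file lands at the root of the bucket and files with identical names would overwrite each other. If you want the object names to keep their folder prefix (e.g. checkpoints/pipeline_2020.02.29_00-22-17.py), a minimal sketch, assuming the same bucket and base folder as above, is to name each blob after the file's path relative to the base folder:

from google.cloud import storage
from pathlib import Path

storage_client = storage.Client()
bucket = storage_client.bucket('bucket')

base = Path('/home/jupyter/folder')
for file in base.rglob('*.py'):
    # Name the blob after the path relative to the base folder,
    # preserving subdirectories such as 'checkpoints/' in the object name
    blob = bucket.blob(file.relative_to(base).as_posix())
    blob.upload_from_filename(str(file))
    print("File {} uploaded to {}.".format(file, bucket.name))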
EDIT
Or alternatively you can use:
import google.datalab.storage as storage
from pathlib import Path

bucket = storage.Bucket('bucket')

for file in Path('/home/jupyter/folder').rglob('*.py'):
    # Create an object in the bucket and stream the file's contents into it
    blob = bucket.object(file.name)
    blob.write_stream(file.read_text(), 'text/plain')
    print("File {} uploaded to {}.".format(file.name, bucket.name))
Output:
File file1.py uploaded to bucket.
File file2.py uploaded to bucket.
File file3.py uploaded to bucket.
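
Since write_stream is fed file.read_text() here, this variant assumes the files are plain text, which is fine for the .py checkpoints above. Either way, you can sanity-check the result afterwards by listing the bucket's contents; a minimal sketch using the google.cloud.storage client from earlier, assuming the same bucket name:

from google.cloud import storage

storage_client = storage.Client()

# List every object currently in the bucket to confirm the uploads
for blob in storage_client.list_blobs('bucket'):
    print(blob.name)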