What should I do about this gsutil "parallel composite upload" warning?
I'm running a Python script that uses the os library to execute a gsutil command I would normally run at the command prompt. I have some files on my local machine that I want to put into a Google bucket, so I do this:
import os
command = 'gsutil -m cp myfile.csv gs://my/bucket/myfile.csv'
os.system(command)
I get a message like this:
==> NOTE: You are uploading one or more large file(s), which would run significantly faster if you enable parallel composite uploads. This
feature can be enabled by editing the
"parallel_composite_upload_threshold" value in your .boto
configuration file. However, note that if you do this large files will
be uploaded as 'composite objects
https://cloud.google.com/storage/docs/composite-objects'_, which
means that any user who downloads such objects will need to have a
compiled crcmod installed (see "gsutil help crcmod"). This is because
without a compiled crcmod, computing checksums on composite objects is
so slow that gsutil disables downloads of composite objects.
I'd like to get rid of this message by suppressing it, if its advice is irrelevant to my case, but I can't find the .boto file. What should I do?
The Parallel Composite Uploads section of the gsutil documentation describes how to solve this (assuming, as the warning specifies, the content will be consumed by clients that have the crcmod module available):
gsutil -o GSUtil:parallel_composite_upload_threshold=150M cp bigfile gs://your-bucket
Doing this safely from Python looks like:
import subprocess

filename = 'myfile.csv'
gs_bucket = 'my/bucket'
parallel_threshold = '150M'  # minimum size for parallel upload; '0' to disable

subprocess.check_call([
    'gsutil',
    '-o', 'GSUtil:parallel_composite_upload_threshold=%s' % (parallel_threshold,),
    'cp', filename, 'gs://%s/%s' % (gs_bucket, filename),
])
Note that here you're explicitly providing the argument-vector boundaries rather than relying on a shell to do it for you; this prevents a malicious or buggy filename from triggering unwanted actions.
If you don't know whether the clients accessing content in this bucket will have the crcmod module available, consider setting parallel_threshold='0' above, which disables this support.
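If your own environment is representative of the clients that will download these objects, you can pick the threshold at runtime based on whether a compiled crcmod is present. A hedged sketch: it relies on crcmod's module-level _usingExtension flag, which is a private implementation detail, so verify it against the crcmod version you ship with:

```python
def pick_parallel_threshold(default='150M'):
    """Return a parallel_composite_upload_threshold value for gsutil:
    the default when a compiled crcmod extension is available,
    '0' (feature disabled) otherwise.

    _usingExtension is a private crcmod detail (an assumption here);
    it is True only when the C extension loaded successfully.
    """
    try:
        from crcmod.crcmod import _usingExtension
    except ImportError:
        _usingExtension = False
    return default if _usingExtension else '0'
```

You would then pass pick_parallel_threshold() in place of the hard-coded parallel_threshold value when building the gsutil argument list.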
Alternatively, you can put the configuration the message refers to in a file on BOTO_PATH, typically $HOME/.boto:
[GSUtil]
parallel_composite_upload_threshold = 150M
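If you'd rather not edit a shared $HOME/.boto, you can generate a config file and point BOTO_PATH at it for just this invocation (BOTO_PATH is a colon-separated list of config files gsutil reads in order). A sketch under that assumption; the temp-file approach and the 150M value are illustrative choices:

```python
import os
import tempfile

def boto_env_for_parallel_uploads(threshold='150M'):
    """Return (env, config_path): a copy of os.environ whose BOTO_PATH
    includes a generated boto config enabling parallel composite uploads.
    Pass env to subprocess when invoking gsutil."""
    fd, config_path = tempfile.mkstemp(suffix='.boto')
    with os.fdopen(fd, 'w') as f:
        f.write('[GSUtil]\nparallel_composite_upload_threshold = %s\n' % threshold)
    env = dict(os.environ)
    # Keep any existing config path (falling back to ~/.boto) and append ours,
    # so later files can override earlier ones.
    existing = env.get('BOTO_PATH', os.path.expanduser('~/.boto'))
    env['BOTO_PATH'] = existing + ':' + config_path
    return env, config_path
```

Usage would look like env, _ = boto_env_for_parallel_uploads(), then subprocess.check_call(['gsutil', 'cp', filename, 'gs://...'], env=env).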
For maximum speed, install the crcmod C library.