Google Cloud ML and GCS Bucket issues

I'm working with open-source TensorFlow implementations of research papers, for example DCGAN-tensorflow. Most of the libraries I'm using are configured to train the model locally, but I want to use Google Cloud ML to train the model since I don't have a GPU on my laptop. I'm finding it difficult to change the code to support GCS buckets. At the moment, I'm saving my logs and models to /tmp and then running a 'gsutil' command to copy the directory to gs://my-bucket at the end of training (example here). If I try to save the model directly to gs://my-bucket, it never shows up.

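As for training data, one of the TensorFlow samples copies data from GCS to /tmp for training (example here), but this only works when the dataset is small. I want to use celebA, and it is too large to copy to /tmp on every run. Is there any documentation or a guide on how to update code that trains locally so that it can use Google Cloud ML?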

The implementations run on various versions of TensorFlow, mostly 0.11 and 0.12.

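There is currently no definitive guide. The basic idea is to replace all occurrences of native Python file operations with their equivalents in the file_io module, most notably: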

open() -> file_io.FileIO()
os.path.exists() -> file_io.file_exists()
glob.glob() -> file_io.get_matching_files()

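These functions work both locally and on GCS (as well as on any other registered file system). Note, however, that there are some subtle differences between file_io and the standard file operations (for example, a different set of 'modes' is supported).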
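The three replacements listed above could look roughly like this in practice. This is a minimal sketch, not code from any of the linked samples: the helper names (load_file_io, read_config, list_files) and the path in the docstring are made up for illustration, and the file_io module is passed in as an optional argument so the helpers can be exercised without TensorFlow installed.

```python
import json


def load_file_io():
    """Import TensorFlow's file_io lazily so this module loads without TensorFlow."""
    from tensorflow.python.lib.io import file_io
    return file_io


def read_config(path, fio=None):
    """Read a JSON file from a local path or a gs:// URL."""
    fio = fio or load_file_io()
    # file_io.file_exists stands in for os.path.exists.
    if not fio.file_exists(path):
        raise IOError("no such file: %s" % path)
    # file_io.FileIO stands in for open(). Stick to a plain mode such as "r"
    # or "w", because the supported modes differ from the builtin's.
    with fio.FileIO(path, mode="r") as f:
        return json.loads(f.read())


def list_files(pattern, fio=None):
    """Expand a glob pattern, e.g. the hypothetical gs://my-bucket/celebA/*.jpg."""
    fio = fio or load_file_io()
    # file_io.get_matching_files stands in for glob.glob.
    return sorted(fio.get_matching_files(pattern))
```

Called without the fio argument, both helpers use TensorFlow's file_io, so the same call should accept either a /tmp path or a gs:// path.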

Fortunately, checkpoint and summary writing work out of the box; just be sure to pass GCS paths to tf.train.Saver.save and tf.summary.FileWriter.
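As an illustration of passing GCS paths straight to the saver and the summary writer, here is a minimal sketch. It assumes a TF 0.12/1.x-style graph API (under TensorFlow 2.x these names live under tf.compat.v1). The job_paths and train helpers, the "logs" and "model.ckpt" names, and the counter variable are placeholders, not part of DCGAN-tensorflow.

```python
import posixpath


def job_paths(job_dir):
    """Derive log and checkpoint locations under a local or gs:// directory."""
    # posixpath keeps forward slashes, which gs:// URLs need on every OS.
    return posixpath.join(job_dir, "logs"), posixpath.join(job_dir, "model.ckpt")


def train(job_dir, steps=3):
    """Toy loop that writes summaries and a checkpoint under job_dir."""
    import tensorflow as tf  # imported lazily; assumes a TF 0.12/1.x-style API

    log_dir, ckpt_prefix = job_paths(job_dir)

    counter = tf.Variable(0.0, name="counter")
    step_op = tf.assign_add(counter, 1.0)  # stand-in for a real training op
    tf.summary.scalar("counter", counter)
    merged = tf.summary.merge_all()
    saver = tf.train.Saver()

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # The writer and the saver receive the path unchanged, gs:// or local.
        writer = tf.summary.FileWriter(log_dir, sess.graph)
        for step in range(steps):
            _, summary = sess.run([step_op, merged])
            writer.add_summary(summary, step)
        saver.save(sess, ckpt_prefix, global_step=steps)
        writer.close()
```

A call such as train("gs://my-bucket/dcgan-run1") (the run name is made up) would then replace the save-to-/tmp-and-gsutil-copy step, provided the job has write access to the bucket.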

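In the sample you sent, making those replacements looks potentially painful. Consider monkey patching the Python functions to map to their TensorFlow equivalents when the program starts, so that you only have to do it once (demonstrated in the linked example).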
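One way such a start-up patch might be written is sketched below. It is not the linked demonstration. It assumes Python 3 (on Python 2 the builtins module is named __builtin__), and it makes one design choice of its own: only paths starting with gs:// are routed to file_io, so libraries that rely on the full signature of the builtin open() keep working for local files. The patch_file_ops name and the restore callback are hypothetical.

```python
import builtins
import glob
import os.path

GCS_PREFIX = "gs://"


def _is_gcs(path):
    return isinstance(path, str) and path.startswith(GCS_PREFIX)


def patch_file_ops(file_io):
    """Route gs:// paths through a file_io-like module; return an undo function."""
    orig_open, orig_exists, orig_glob = builtins.open, os.path.exists, glob.glob

    def gcs_open(name, mode="r", *args, **kwargs):
        if _is_gcs(name):
            # Extra arguments (buffering, encoding, ...) are dropped here.
            return file_io.FileIO(name, mode=mode)
        return orig_open(name, mode, *args, **kwargs)

    def gcs_exists(path):
        return file_io.file_exists(path) if _is_gcs(path) else orig_exists(path)

    def gcs_glob(pattern, *args, **kwargs):
        if _is_gcs(pattern):
            return file_io.get_matching_files(pattern)
        return orig_glob(pattern, *args, **kwargs)

    builtins.open, os.path.exists, glob.glob = gcs_open, gcs_exists, gcs_glob

    def restore():
        builtins.open, os.path.exists, glob.glob = orig_open, orig_exists, orig_glob

    return restore
```

At program start this would be called once, for example as patch_file_ops(file_io) after `from tensorflow.python.lib.io import file_io`. Code that captured the originals earlier (for instance via `from os.path import exists`) will not see the patch, so it should run before the training modules are imported.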

As a side note, all of the samples on this page show reading files from GCS.
