Google Cloud Storage for Google Colab TPU pricing
I want to use the free Google Colab TPU with a custom dataset, which is why I need to upload it to GCS. I have created a bucket in GCS and uploaded the dataset.
I have also read that there are two classes of operations on data in GCS: Class A operations and Class B operations [reference].
My questions are: does accessing the dataset from GCS in Google Colab fall under one of these operation classes? And what is the average price you pay for using GCS with a Colab TPU?
Yes, accessing the objects (files) in your GCS bucket can result in charges to your billing account, but there are other factors you need to consider. Let me explain (apologies in advance for the very long answer):
Google Cloud Platform services use APIs behind the scenes to perform actions such as showing, creating, deleting or editing resources.
Cloud Storage is no exception. As mentioned in the Cloud Storage docs, operations fall into two groups: those performed through the JSON API and those performed through the XML API.
All operations performed in the Cloud Console or through the client libraries (the ones used to interact via code in languages like Python, Java, PHP, etc.) are charged at the JSON API rates by default. Let's focus on that one.
I want you to pay attention to the names of the methods under each Operations column:
The structure can be read as follows:
service.resource.action
Since all of these methods belong to the Cloud Storage service, it is normal to see the storage prefix in all of them.
In the Class B operations column, the first method is storage.*.get. There is no other get method in the other columns, which means that retrieving information from a bucket (reading its metadata) or from the objects inside it (reading a file via code, downloading files, etc.) is counted under this method.
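To make the mapping concrete, here is a minimal sketch using the google-cloud-storage Python client (the bucket and object names are placeholders): both calls below are storage.*.get requests, i.e. Class B operations.

```python
from google.cloud import storage

client = storage.Client()                        # uses your default credentials
bucket = client.get_bucket("my-dataset-bucket")  # storage.buckets.get -> Class B
blob = bucket.blob("dataset.tfrecord")
blob.download_to_filename("dataset.tfrecord")    # storage.objects.get -> Class B
```

The download on the last line also generates network traffic, which is covered next.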
Before talking about how to calculate costs, let me add: Google Cloud Storage charges you not only for the operation itself but also for the size of the data traveling over the network. Here are the two most common scenarios:
You are interacting with the files from another GCP service. Since this uses the internal GCP network, charges are not that big. If you go this route, I would recommend using resources (App Engine, Compute Engine, Kubernetes Engine, etc.) in the same location to avoid additional charges. Please check the network egress charges within GCP.
You are interacting from an environment outside GCP. This is your scenario when working with services like Google Colab (even though it is a Google service, it runs outside the Cloud Platform). Please see the general network usage pricing for Cloud Storage.
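For reference, this is roughly what the Colab side can look like, assuming a TensorFlow input pipeline and placeholder bucket/file names; every read from Colab is a Class B operation plus general network egress, because the runtime sits outside the Cloud Platform network.

```python
from google.colab import auth
import tensorflow as tf

auth.authenticate_user()  # grant the Colab runtime access to your GCS bucket

# tf.data (and the Colab TPU runtime) can read gs:// paths directly; no local copy needed.
dataset = tf.data.TFRecordDataset("gs://my-dataset-bucket/train.tfrecord")
for record in dataset.take(1):
    print(record.numpy()[:16])  # first bytes of the first serialized example
```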
Now, let's talk about the storage classes, which also affect the objects' availability and pricing. Depending on where the bucket is created and its class, you will be charged for the amount of data stored, as described in the docs.
Even though the Nearline, Coldline and Archive classes are the cheapest in terms of storage, they charge an extra fee for retrieving data. This is because these classes are meant for data that is accessed infrequently.
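If you control the bucket, a Standard class bucket is usually the better fit for training data that is read many times. A rough sketch with the Python client (project, bucket name and location are placeholders):

```python
from google.cloud import storage

client = storage.Client(project="my-project")
bucket = client.bucket("my-dataset-bucket")
bucket.storage_class = "STANDARD"                      # no per-GB retrieval fee
client.create_bucket(bucket, location="us-central1")   # storage.buckets.insert -> Class A
```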
I think we have covered everything, so let's move on to the important question: how much will all of this cost? It depends on the size of your files, how often you interact with them and the storage class of your bucket.
Let's say that you have one Standard bucket in North America holding your 20 GB dataset and that you read it from Google Colab 10 times a day. We can calculate the following:
Standard Storage: $0.020 per GB per month
$0.020 * 20 GB = $0.40 USD per month
Class B operations (per 10,000 operations) for Standard storage: $0.004
Given that you are charged $0.004 per 10,000 operations, each operation costs
$0.0000004 USD, so 10 operations per day cost $0.000004 USD.
Egress to Worldwide Destinations (excluding Asia & Australia): $0.12 per GB
$0.12 * 20 GB (the size of the dataset) = $2.40 USD per read
Reading the dataset 10 times a day: $2.40 * 10 = $24 USD per day
Given this example, you would pay roughly $0.000004 + $24 = $24.000004 USD per day in operations and egress, plus the $0.40 USD monthly storage charge. Another example can be found in the Pricing overview section.
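The same arithmetic as a small Python sketch (the prices are the ones quoted above and may change, so treat it as a back-of-the-envelope estimate, not a billing tool):

```python
DATASET_GB = 20
READS_PER_DAY = 10

storage_per_month = 0.020 * DATASET_GB               # $0.40 per month
ops_per_day = 0.004 / 10_000 * READS_PER_DAY          # $0.000004 per day
egress_per_day = 0.12 * DATASET_GB * READS_PER_DAY    # $24.00 per day

print(f"storage: ${storage_per_month:.2f} / month")
print(f"class B: ${ops_per_day:.6f} / day")
print(f"egress:  ${egress_per_day:.2f} / day")
```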
And finally, the good news: Google Cloud Storage offers Always Free usage limits that reset every month (the full table is on that page).
This means that if, during a whole month, you store less than 5 GB in a Standard class bucket, perform fewer than 50,000 Class B operations and fewer than 5,000 Class A operations, and send less than 1 GB over the network, you won't pay a thing.
Once you pass those limits, the charges start on the excess, i.e. if you store a 15 GB dataset, you will only be charged for 10 GB.
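As a final illustration, a tiny hypothetical helper that applies the Always Free storage allowance described above (5 GB of Standard storage per month):

```python
FREE_STANDARD_GB = 5  # Always Free allowance for Standard storage, per month

def billable_storage_gb(stored_gb: float) -> float:
    """Return only the storage that exceeds the free allowance."""
    return max(0.0, stored_gb - FREE_STANDARD_GB)

print(billable_storage_gb(15))  # 10.0 -> a 15 GB dataset is billed as 10 GB
```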