R bigrquery: Exceeded rate limits
I am trying to download a BigQuery dataset from Google Cloud Platform into my R workspace so that I can analyse it, using the following code:
library(bigrquery)
library(DBI)
library(tidyverse)
library(dplyr)

con <- dbConnect(
  bigquery(),
  project = "bigquery-public-data",
  dataset = "new_york_citibike",
  billing = "maanan-bigquery-in-r"
)

bigrquery::bq_auth()

my_db_pointer <- tbl(con, "citibike_trips")
glimpse(my_db_pointer)
count(my_db_pointer)
selected <- select(my_db_pointer, everything()) %>% collect()
However, when I try to run the last line to download the data, it returns the following error:
Complete
Billed: 0 B
Downloading first chunk of data.
Received 55,308 rows in the first chunk.
Downloading the remaining 58,882,407 rows in 1420 chunks of (up to) 41,481 rows.
Downloading data [=====>--------------------------------------------------------------------------------------------------] 6% ETA: 19m
Error in `signal_reason()`:
! Exceeded rate limits: Your project:453562790213 exceeded quota for tabledata.list bytes per second per project. For more information, see https://cloud.google.com/bigquery/troubleshooting-errors [rateLimitExceeded]
ℹ Try increasing the `page_size` value of `bq_table_download()`
Run `rlang::last_error()` to see where the error occurred.
I would be grateful if someone could help me fix this error so that I can download the data. I need to analyse this dataset. Thank you in advance.
According to the documentation linked for rateLimitExceeded, it looks like you are breaking the threshold for query jobs.
Please consider the following:

- Check whether your project's BigQuery API has limits and quotas configured that you might be breaking with this operation. To see your current quotas and limits, go to IAM & Admin > Quotas > Quotas for project "projectid" > bigquery.googleapis.com.
- Since your chunks are about 55,308 rows each out of 58,882,407 rows in total, it seems you are trying to download more data than is allowed, and you may be hitting the following limits: Query/script execution-time limit, Maximum response size, Maximum row size.
- Verify that you are not hitting any table constraints, in particular the one on operations per day.
- Check the number of columns in your rows; there is a limit of 10,000 columns.
- Consider reviewing the rest of the quota limits specified for query jobs.
- Narrow your select, or reduce the chunk size. Does the table with millions of records really need everything? You can do something like the following:
library(bigrquery)

# Authenticate; use the path argument if the notebook runs outside GCP:
# bigrquery::bq_auth(path = '/Users/me/restofthepath/bigquery-credentials.json')
bq_table_download("my-project-id.dataset-id.table", page_size = 100)
For more details about this function, check bq_table_download.
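As a sketch of the "narrow your select" idea, assuming the same connection details as in the question, you can push the column selection and filtering down to BigQuery with dplyr so that only the reduced result is downloaded. The column names (`starttime`, `tripduration`, `start_station_name`) and the filter threshold are illustrative assumptions:

```r
library(bigrquery)
library(DBI)
library(dplyr)

# Same connection as in the question
con <- dbConnect(
  bigquery(),
  project = "bigquery-public-data",
  dataset = "new_york_citibike",
  billing = "maanan-bigquery-in-r"
)

trips <- tbl(con, "citibike_trips")

# select() and filter() are translated to SQL and run on BigQuery;
# nothing is downloaded until collect() is called.
long_trips <- trips %>%
  select(starttime, tripduration, start_station_name) %>%
  filter(tripduration > 600) %>%
  collect()
```

If you still need whole-table access, `bq_table_download()` also accepts an `n_max` argument to cap the number of rows downloaded, which can be combined with a smaller `page_size` to stay under the per-second rate limit.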