Load large data from BigQuery to Python
from google.cloud import bigquery as bq
import google_auth_oauthlib.flow
import pandas as pd

query = '''select ... from ...'''

bigquery_client = bq.Client()
table = bq.query.QueryResults(query=query, client=bigquery_client)
table.use_legacy_sql = False
table.run()

# transfer the BigQuery results into a pandas DataFrame
columns = [field.name for field in table.schema]
rows = table.fetch_data()
data = []
for row in rows:
    data.append(row)
df = pd.DataFrame(data=data[0], columns=columns)
I want to load more than 10 million rows into Python. This ran fine a few weeks ago, but now it only returns 100,000 rows. Does anyone know a reliable way to do this?
I just tested this code here and it brought back 3 million rows with no cap applied:
import os

# point the library at a service-account key before the client is created
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path/to/key.json'

from google.cloud.bigquery import Client

bc = Client()
query = 'your query'
job = bc.run_sync_query(query)
job.use_legacy_sql = False
job.run()

# fetch_data() returns an iterator over the result rows; list() walks all pages
data = list(job.fetch_data())
Does that work for you?
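As a side note, run_sync_query() comes from the pre-0.28 google-cloud-bigquery API. If upgrading the library is an option, a minimal sketch of the same download with the newer client, assuming a release where Client.query() and QueryJob.to_dataframe() are available and credentials are already configured, would be:

# sketch only: assumes google-cloud-bigquery >= 0.29 and
# GOOGLE_APPLICATION_CREDENTIALS set as in the snippet above
from google.cloud import bigquery

client = bigquery.Client()
sql = 'your query'  # standard SQL is the default in the newer client

# Client.query() starts the job; to_dataframe() pages through the full
# result set, so the DataFrame is not capped at a single page of rows.
df = client.query(sql).to_dataframe()
print(len(df))

to_dataframe() needs pandas installed, and pulling tens of millions of rows this way can still be slow since rows are streamed page by page; exporting the table to Cloud Storage and downloading the files is another common route for very large results.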