Read large Salesforce query into pandas quickly
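Using the simple_salesforce connector, my query returns about 150k records, and the method below for reading the data into a DataFrame takes so long that I ended up just going into SF, running a report, downloading it, and reading that into pandas. Is there a faster way? Thanks.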
import pandas as pd
from simple_salesforce import Salesforce

fields = ['field' + str(i) for i in range(1, 10)]
fields_str = ", ".join(fields)
query_str = "select {} from account".format(fields_str)
sf = Salesforce(username=myusername, password=mypwd, security_token=mytoken)
df = sf.query_all(query_str)

# Slow part: one single-row DataFrame is built and appended per record.
# (DataFrame.append was removed in pandas 2.0.)
sf_df = pd.DataFrame(columns=fields)
for account in range(df['totalSize']):
    account_dict = {}
    for field in fields:
        account_dict[field] = df['records'][account][field]
    dict_df = pd.DataFrame.from_dict([account_dict])
    sf_df = sf_df.append(dict_df, sort=False)
    del(account_dict)
You can pull the records directly using the ['records'] key.
df = sf.query_all('SELECT ID, CreatedDate FROM Account LIMIT 10')['records']
df = pd.DataFrame(df)
df
Or as a single line:
df = pd.DataFrame(sf.query_all('SELECT ID, CreatedDate FROM Account LIMIT 10')['records'])
df
If the attributes column doesn't contain data you want to see, you can remove it from the returned DataFrame with .drop(columns=['attributes']).
df = sf.query_all('SELECT ID, CreatedDate FROM Account LIMIT 10')['records']
df = pd.DataFrame(df)
df.drop(columns=['attributes'],inplace=True)
df
Or as a single line:
df = pd.DataFrame(sf.query_all('SELECT ID, CreatedDate FROM Account LIMIT 10')['records']).drop(columns=['attributes'])
df
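To try this without a Salesforce login, here is a minimal sketch that runs the same approach on a hand-made dictionary shaped like the result of sf.query_all. The fake_result data and the records_to_frame helper are made up for illustration and are not part of simple_salesforce. The speedup most likely comes from building the DataFrame once from the whole list of records, rather than appending one row at a time, which copies the frame on every iteration.

```python
import pandas as pd

# Hypothetical stand-in for the dict returned by sf.query_all(...).
# The Ids, dates, and URLs are invented; only the overall shape matters.
fake_result = {
    "totalSize": 3,
    "done": True,
    "records": [
        {"attributes": {"type": "Account", "url": "/fake/Account/001A"},
         "Id": "001A", "CreatedDate": "2020-01-01T00:00:00.000+0000"},
        {"attributes": {"type": "Account", "url": "/fake/Account/001B"},
         "Id": "001B", "CreatedDate": "2020-02-01T00:00:00.000+0000"},
        {"attributes": {"type": "Account", "url": "/fake/Account/001C"},
         "Id": "001C", "CreatedDate": "2020-03-01T00:00:00.000+0000"},
    ],
}


def records_to_frame(result):
    """Build the DataFrame in a single call from the list of record dicts."""
    df = pd.DataFrame(result["records"])
    # errors="ignore" avoids a KeyError when there is no attributes column,
    # for example when the query returned no records.
    return df.drop(columns=["attributes"], errors="ignore")


sf_df = records_to_frame(fake_result)
print(sf_df)
```

With a real connection, the same helper would be called as records_to_frame(sf.query_all(query_str)).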