Trying to get all fields in a Socrata python API call

Question

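I am using the Python API (sodapy) to fetch data from https://dev.socrata.com/foundry/data.energystar.gov/ebvx-pb7r. How can I get all of the fields shown on that page through the API? Specifically, I need the additional_model_information field, but I expect other fields are missing as well. I have tried: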

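"where additional_model_information is not null": includes the field, but I also want the rows where it is null.
"select='additional_model_information'": returns the field, but nothing else.
"select=*": does not add any fields.
"where='additional_model_information is not null or additional_model_information is null'": this seems to work.
"select=<list of all fields>": should work, but seems clunky.
Making 2 calls, one to get the * data and a second to get the other fields, would also be possible.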

I suspect I am missing something. Any help is appreciated.

import pprint
import sodapy

client = sodapy.Socrata(domain='data.energystar.gov', app_token=None)
rows = client.get('ebvx-pb7r', where='additional_model_information is not null or additional_model_information is null')
for row in rows:
    pprint.pprint(row)

Answer 1

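If you leave out the where parameter entirely, you get an unfiltered version of the dataset, which sounds like what you want. You also need to include the limit parameter to make sure you get all of the records, because the total count is 1058, which is more than the default page size of 1000: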

import pprint
import sodapy

client = sodapy.Socrata(domain='data.energystar.gov', app_token=None)
rows = client.get('ebvx-pb7r', limit=5000)
for row in rows:
    pprint.pprint(row)
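
If the row count is not known in advance, a fixed limit can be replaced by paging with limit and offset. The following is only a minimal sketch: fetch_all and get_page are hypothetical names, and get_page is assumed to be any callable that accepts limit and offset keyword arguments the way client.get does.

```python
def fetch_all(get_page, page_size=1000):
    """Collect every row by requesting pages until a short page comes back.

    get_page is a hypothetical callable that takes limit and offset keyword
    arguments and returns a list of rows, for example:
        functools.partial(client.get, 'ebvx-pb7r', order=':id')
    """
    rows = []
    offset = 0
    while True:
        page = get_page(limit=page_size, offset=offset)
        rows.extend(page)
        # A page shorter than page_size (including an empty one) is the last.
        if len(page) < page_size:
            return rows
        offset += page_size
```

Passing order=':id' in the wrapped call is meant to keep the row order stable between pages, so that rows are neither skipped nor repeated.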

I think part of the confusion may come from the fact that if a given record has no value for additional_model_information, the JSON object we return omits that field.
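
If every row needs to carry the same keys on the client side, the omitted fields can be filled in after fetching. This is just a sketch of one way to do it: fill_missing_fields is a hypothetical helper, and the model_number key in the sample rows is made up for illustration.

```python
def fill_missing_fields(rows, default=None):
    """Return copies of rows in which every row has the same set of keys."""
    # Collect field names in first-seen order across all rows.
    fields = []
    seen = set()
    for row in rows:
        for key in row:
            if key not in seen:
                seen.add(key)
                fields.append(key)
    # Rebuild each row, using default where the API omitted a field.
    return [{field: row.get(field, default) for field in fields} for row in rows]


# Illustrative rows only; 'model_number' is an assumed field name.
sample = [
    {'model_number': 'A1', 'additional_model_information': 'see notes'},
    {'model_number': 'B2'},
]
filled = fill_missing_fields(sample)
```

Here the second row in filled gains an additional_model_information key set to None. Note that a field that is empty in every fetched row still will not appear, since nothing in the response names it.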

python

socrata