Extracting data from JSON and iterating using Python

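I'm using the HubSpot API to track "deals" in a system, but there is no reliable way to search or filter the data it returns. It just dumps all of the "deals" in the system into one big JSON document and then gives you some pagination info to help you glue it together on the back end.

Here is a demo/source API URL: https://api.hubapi.com/deals/v1/deal/all?hapikey=demo

This returns JSON that basically looks like this (abbreviated):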

{  
   "deals":[  
      {  
         "portalId":62515,
         "dealId":17886969,
         "isDeleted":false,
         "associations":{
            "associatedCompanyIds":[  
               113448746
            ]
         },
         "properties":{
            "dealname":{  
               "value":"Google Website"
            },
            "amount":{  
               "value":"150000"
            },
            "hubspot_owner_id":{  
               "value":"72"
            },
            "dealstage":{  
               "value":"qualifiedtobuy"
            },
            "dealtype":{  
               "value":"existingbusiness"
            }
         },
         "imports":[  

         ]
      }
   ],
   "hasMore":false,
   "offset":28692358
}
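
The hasMore and offset fields at the end of the response are the pagination info mentioned above. A minimal sketch of gluing the pages together could look like the following. It assumes the endpoint accepts the previous response's offset as an offset query parameter, which should be checked against the HubSpot documentation; fetch_all_deals, hubspot_page, and BASE_URL are hypothetical names used only for illustration.

```python
BASE_URL = 'https://api.hubapi.com/deals/v1/deal/all'


def fetch_all_deals(fetch_page):
    """Collect the "deals" lists from every page into one list.

    fetch_page is any callable that takes an offset (None for the
    first page) and returns one decoded response dict.
    """
    deals = []
    offset = None
    while True:
        page = fetch_page(offset)
        deals.extend(page.get('deals', []))
        if not page.get('hasMore'):
            return deals
        offset = page['offset']


def hubspot_page(offset, apikey='demo'):
    """Fetch a single page from the API.

    Assumption: the offset from the previous response is passed back
    as an "offset" query parameter.
    """
    # Imported here so the pagination loop above has no third-party dependency.
    import requests

    params = {'hapikey': apikey}
    if offset is not None:
        params['offset'] = offset
    response = requests.get(BASE_URL, params=params)
    response.raise_for_status()
    return response.json()
```

Calling fetch_all_deals(hubspot_page) would then return one flat list of deal dicts. Passing the page fetcher in as an argument keeps the loop easy to try out with canned data.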

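However...

1) A lot of deals are returned, and I want to filter them by dealtype, which is a variable that can change. For example, I only want to return deals that have the value "qualifiedtobuy".

2) Then I need to run a query and "do stuff" for each deal based on its dealId. I assume this means I need to put all of the above into a dict and then iterate over it somehow? I'm not sure.

Here is what I have so far, but it really just grabs the JSON and turns it into a dict (I think).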

import requests
import json

apikey = "demo"
url = 'https://api.hubapi.com/deals/v1/deal/all?hapikey=' + apikey
response = requests.get(url)
response.raise_for_status()
jsonDeals = response.json()
dict_object = dict(jsonDeals)

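I'm not sure what the next step is to get the dealId values and then "do stuff" for each value returned.

Any help is greatly appreciated.

Update: here is the raw dictionary of properties for a single deal, not simplified: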

{u'hs_lastmodifieddate': {u'sourceId': None, u'timestamp': 1457479099306, u'versions': [{u'timestamp': 1457479099306, u'sourceVid': [], u'name': u'hs_lastmodifieddate', u'value': u'1457479099306', u'source': u'CALCULATED'}], u'value': u'1457479099306', u'source': u'CALCULATED'}, u'pipeline': {u'sourceId': None, u'timestamp': 1457479063182, u'versions': [{u'timestamp': 1457479063182, u'name': u'pipeline', u'value': u'default', u'sourceVid': []}], u'value': u'default', u'source': None}, u'num_associated_contacts': {u'sourceId': None, u'timestamp': 0, u'versions': [{u'source': u'CALCULATED', u'name': u'num_associated_contacts', u'value': u'0', u'sourceVid': []}], u'value': u'0', u'source': u'CALCULATED'}, u'dealstage': {u'sourceId': None, u'timestamp': 1457479063157, u'versions': [{u'timestamp': 1457479063157, u'sourceVid': [], u'name': u'dealstage', u'value': u'qualifiedtobuy', u'source': u'API'}], u'value': u'qualifiedtobuy', u'source': u'API'}, u'createdate': {u'sourceId': None, u'timestamp': 1457479063181, u'versions': [{u'timestamp': 1457479063181, u'name': u'createdate', u'value': u'1457479063181', u'sourceVid': []}], u'value': u'1457479063181', u'source': None}, u'hs_salesforceopportunityid': {u'sourceId': None, u'timestamp': 1457479097680, u'versions': [{u'timestamp': 1457479097680, u'sourceVid': [], u'name': u'hs_salesforceopportunityid', u'value': u'00628000007nhqFAAQ', u'source': u'SALESFORCE'}], u'value': u'00628000007nhqFAAQ', u'source': u'SALESFORCE'}, u'hubspot_owner_assigneddate': {u'sourceId': None, u'timestamp': 1457479097680, u'versions': [{u'timestamp': 1457479097680, u'sourceVid': [], u'name': u'hubspot_owner_assigneddate', u'value': u'1457479097680', u'source': u'SALESFORCE'}], u'value': u'1457479097680', u'source': u'SALESFORCE'}, u'hubspot_owner_id': {u'sourceId': None, u'timestamp': 1457479097680, u'versions': [{u'timestamp': 1457479097680, u'sourceVid': [], u'name': u'hubspot_owner_id', u'value': u'11626092', u'source': u'SALESFORCE'}], 
u'value': u'11626092', u'source': u'SALESFORCE'}, u'amount': {u'sourceId': None, u'timestamp': 1457479063157, u'versions': [{u'timestamp': 1457479063157, u'sourceVid': [], u'name': u'amount', u'value': u'150000', u'source': u'API'}], u'value': u'150000', u'source': u'API'}, u'hs_createdate': {u'sourceId': None, u'timestamp': 1457479063181, u'versions': [{u'timestamp': 1457479063181, u'name': u'hs_createdate', u'value': u'1457479063181', u'sourceVid': []}], u'value': u'1457479063181', u'source': None}, u'salesforcelastsynctime': {u'sourceId': None, u'timestamp': 1457479099298, u'versions': [{u'timestamp': 1457479099298, u'sourceVid': [], u'name': u'salesforcelastsynctime', u'value': u'1457479070904', u'source': u'SALESFORCE'}], u'value': u'1457479070904', u'source': u'SALESFORCE'}, u'closedate': {u'sourceId': None, u'timestamp': 1457479099298, u'versions': [{u'timestamp': 1457479099298, u'sourceVid': [], u'name': u'closedate', u'value': u'1461013200000', u'source': u'SALESFORCE'}], u'value': u'1461013200000', u'source': u'SALESFORCE'}, u'dealtype': {u'sourceId': None, u'timestamp': 1457479063157, u'versions': [{u'timestamp': 1457479063157, u'sourceVid': [], u'name': u'dealtype', u'value': u'existingbusiness', u'source': u'API'}], u'value': u'existingbusiness', u'source': u'API'}, u'dealname': {u'sourceId': None, u'timestamp': 1457479063157, u'versions': [{u'timestamp': 1457479063157, u'sourceVid': [], u'name': u'dealname', u'value': u'Google Website', u'source': u'API'}], u'value': u'Google Website', u'source': u'API'}}
import json
dict_object = json.loads(jsonDeals)

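Assuming jsonDeals is a JSON string, this will give you your dictionary. You don't want to simply convert the string to a dictionary yourself, because some JSON components differ from their Python counterparts, for example lowercase true in JSON versus True in Python.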
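
To illustrate that difference, here is a small sketch using a made-up JSON string (raw is a hypothetical stand-in for the raw response text, and the "note" key is invented for the example). json.loads maps JSON true, false, and null to Python True, False, and None:

```python
import json

# A made-up JSON string standing in for the raw response text.
raw = '{"deals": [{"dealId": 17886969, "isDeleted": false, "note": null}], "hasMore": true}'

parsed = json.loads(raw)
deal = parsed['deals'][0]

print(parsed['hasMore'])   # JSON true  becomes Python True
print(deal['isDeleted'])   # JSON false becomes Python False
print(deal['note'])        # JSON null  becomes Python None
```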

To interact with the items, you just index into the dictionary by key:

for deal in dict_object['deals']:
    # Each deal is itself a dict, so index it by key as well.
    print(deal['dealId'], 'for instance')

Can you confirm that jsonDeals is a JSON string? If it is what you posted at the top of your question, it should work. I have tested it:

>>> a = """
... {  
...    "deals":[  
...       {  
...          "portalId":62515,
...          "dealId":17886969,
...          "isDeleted":false,
...          "associations":{
...             "associatedCompanyIds":[  
...                113448746
...             ]
...          },
...          "properties":{
...             "dealname":{  
...                "value":"Google Website"
...             },
...             "amount":{  
...                "value":"150000"
...             },
...             "hubspot_owner_id":{  
...                "value":"72"
...             },
...             "dealstage":{  
...                "value":"qualifiedtobuy"
...             },
...             "dealtype":{  
...                "value":"existingbusiness"
...             }
...          },
...          "imports":[  
... 
...          ]
...       }
...    ],
...    "hasMore":false,
...    "offset":28692358
... }"""
>>> import json
>>> d = json.loads(a)
>>> 

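The object returned by response.json() has already been converted to a dict, so you don't need to do anything further to it. To get a list of all the qualifiedtobuy deals, try the following: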

jsonDeals = response.json()

deals = []
for deal in jsonDeals['deals']:
    properties = deal['properties']
    if ('dealstage' in properties and
        properties['dealstage']['value'] == 'qualifiedtobuy'):
        deals.append(deal)

if deals:
    print(deals[0]['dealId'])
else:
    print('found no "qualifiedtobuy" deals')
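
Since the value you filter on can change, the loop above could be wrapped in a function. The following is only a sketch: filter_deals and do_stuff are hypothetical names, the second sample deal is made up, and .get() is used so that deals missing a property are skipped rather than raising KeyError.

```python
def filter_deals(deals, prop, wanted):
    """Return the deals whose properties[prop]['value'] equals wanted."""
    matches = []
    for deal in deals:
        # Missing keys fall back to empty dicts, so the lookup never raises.
        value = deal.get('properties', {}).get(prop, {}).get('value')
        if value == wanted:
            matches.append(deal)
    return matches


def do_stuff(deal_id):
    """Placeholder for whatever needs to happen for each deal."""
    print('processing deal', deal_id)


# Sample data shaped like the abbreviated response in the question;
# the second deal is invented to show a deal without properties.
sample = {
    'deals': [
        {'dealId': 17886969,
         'properties': {'dealstage': {'value': 'qualifiedtobuy'},
                        'dealtype': {'value': 'existingbusiness'}}},
        {'dealId': 2, 'properties': {}},
    ]
}

for deal in filter_deals(sample['deals'], 'dealstage', 'qualifiedtobuy'):
    do_stuff(deal['dealId'])
```

With real data you would pass jsonDeals['deals'] instead of sample['deals'], and the same function filters by dealtype if you pass 'dealtype' as prop.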

Python is certainly an easy approach and more readable. Sometimes I'm lazy, though, so I just use jq (docs: https://stedolan.github.io/jq/).

For your example, it can be a one-liner in Bash:

curl -s 'https://api.hubapi.com/deals/v1/deal/all?hapikey=demo' | \
    jq '.deals[] | select(.properties.dealstage.value=="qualifiedtobuy")'

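This gives you the matching records, pretty-printed. Adding the --compact-output flag to jq prints each object on a single line. If you need more complex processing, you can then redirect the output to a file.

For your test URL, with that flag, it outputs just one record: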

{"portalId":62515,"dealId":17886969,"isDeleted":false,"associations":{"associatedVids":[],"associatedCompanyIds":[113448746],"associatedDealIds":[]},"proper
ties":{"pipeline":{"value":"default","timestamp":1456622756943,"source":null,"sourceId":null,"versions":[{"name":"pipeline","value":"default","timestamp":14
56622756943,"sourceVid":[]}]},"dealname":{"value":"Google Website","timestamp":1456622756908,"source":"API","sourceId":null,"versions":[{"name":"dealname","
value":"Google Website","timestamp":1456622756908,"source":"API","sourceVid":[]}]},"amount":{"value":"150000","timestamp":1456622756908,"source":"API","sour
ceId":null,"versions":[{"name":"amount","value":"150000","timestamp":1456622756908,"source":"API","sourceVid":[]}]},"closedate":{"value":"1461042000000","ti
mestamp":1456622756908,"source":"API","sourceId":null,"versions":[{"name":"closedate","value":"1461042000000","timestamp":1456622756908,"source":"API","sour
ceVid":[]}]},"hubspot_owner_id":{"value":"72","timestamp":1456622756908,"source":"API","sourceId":null,"versions":[{"name":"hubspot_owner_id","value":"72","
timestamp":1456622756908,"source":"API","sourceVid":[]}]},"hs_lastmodifieddate":{"value":"1456622756943","timestamp":1456622756943,"source":"CALCULATED","so
urceId":null,"versions":[{"name":"hs_lastmodifieddate","value":"1456622756943","timestamp":1456622756943,"source":"CALCULATED","sourceVid":[]}]},"hubspot_ow
ner_assigneddate":{"value":"1456622756908","timestamp":1456622756908,"source":"API","sourceId":null,"versions":[{"name":"hubspot_owner_assigneddate","value"
:"1456622756908","timestamp":1456622756908,"source":"API","sourceVid":[]}]},"num_associated_contacts":{"value":"0","timestamp":0,"source":"CALCULATED","sour
ceId":null,"versions":[{"name":"num_associated_contacts","value":"0","source":"CALCULATED","sourceVid":[]}]},"dealstage":{"value":"qualifiedtobuy","timestam
p":1456622756908,"source":"API","sourceId":null,"versions":[{"name":"dealstage","value":"qualifiedtobuy","timestamp":1456622756908,"source":"API","sourceVid
":[]}]},"hs_createdate":{"value":"1456622756943","timestamp":1456622756943,"source":null,"sourceId":null,"versions":[{"name":"hs_createdate","value":"145662
2756943","timestamp":1456622756943,"sourceVid":[]}]},"createdate":{"value":"1456622756943","timestamp":1456622756943,"source":null,"sourceId":null,"versions
":[{"name":"createdate","value":"1456622756943","timestamp":1456622756943,"sourceVid":[]}]},"dealtype":{"value":"existingbusiness","timestamp":1456622756908
,"source":"API","sourceId":null,"versions":[{"name":"dealtype","value":"existingbusiness","timestamp":1456622756908,"source":"API","sourceVid":[]}]}},"impor
ts":[]}

For a simple extraction of fields, you can do this:

curl -s 'https://api.hubapi.com/deals/v1/deal/all?hapikey=demo' | \
    jq --compact-output --raw-output \
    '.deals[] | select(.properties.dealstage.value=="qualifiedtobuy") | [.properties.amount.value, .properties.closedate.value, .properties.createdate.value, .properties.dealname.value] | @csv'

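This gives one line per matching deal, each containing the deal amount, close date, creation date, and deal name, comma-separated of course. The --raw-output flag removes the quote escaping to make the output cleaner. For your test URL, it gives: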

"150000","1461042000000","1456622756943","Google Website"

More matches, more lines. Redirect the output and call it a day. Less development time, better world.
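
For comparison, roughly the same CSV extraction can be sketched in Python with the standard csv module. deals_to_csv and FIELDS are hypothetical names, and the sample list only mimics the fields shown in the output above.

```python
import csv
import io

FIELDS = ['amount', 'closedate', 'createdate', 'dealname']


def deals_to_csv(deals, stage='qualifiedtobuy'):
    """Return CSV text with one row per deal in the given dealstage."""
    buffer = io.StringIO()
    # QUOTE_ALL puts every field in double quotes, like the jq output above.
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator='\n')
    for deal in deals:
        props = deal.get('properties', {})
        if props.get('dealstage', {}).get('value') != stage:
            continue
        writer.writerow([props.get(name, {}).get('value', '') for name in FIELDS])
    return buffer.getvalue()


# Sample data mimicking the single record returned for the test URL.
sample = [{
    'dealId': 17886969,
    'properties': {
        'amount': {'value': '150000'},
        'closedate': {'value': '1461042000000'},
        'createdate': {'value': '1456622756943'},
        'dealname': {'value': 'Google Website'},
        'dealstage': {'value': 'qualifiedtobuy'},
    },
}]

print(deals_to_csv(sample), end='')
```

With real data you would pass jsonDeals['deals'] and write the returned text to a file.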