JSON link from Google developer tools not working in Python (or in browser)

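I am trying to extract the data in the table at https://www.ecoregistry.io/emit-certifications/ra/10

Using Google developer tools > Network tab, I can find the JSON link where the table's data is stored: https://api-front.ecoregistry.io/api/project/10/emitcertifications

I can copy this JSON data manually and extract the information with this code I wrote: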

import json
import pandas as pd

data = '''PASTE JSON DATA HERE'''
info = json.loads(data)
columns = ['# Certificate', 'Carbon offsets destination', 'Final user', 'Taxpayer subject', 'Date', 'Tons delivered']
dat = list()
# Build one row per certificate, in the same order as `columns`
for x in info['emitcertifications']:
    dat.append([x['consecutive'], x['reasonUsingCarbonOffsets'], x['userEnd'], x['passiveSubject'], x['date'], x['quantity']])
df = pd.DataFrame(dat, columns=columns)
df.to_csv('Data.csv')

I want to automate this so that I can pull the data directly from the JSON link https://api-front.ecoregistry.io/api/project/10/emitcertifications instead of pasting the JSON data manually into:

data = '''PASTE JSON DATA HERE'''

The link does not work in Python, or even directly in the browser:

import requests
import json
url = 'https://api-front.ecoregistry.io/api/project/10/emitcertifications'
response = requests.get(url)
info = response.json()
print(json.dumps(info, indent=4))

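The error output I get is: {'status': 0, 'codeMessages': [{'codeMessage': 'ERROR_401', 'param': 'invalid', 'message': 'No autorizado'}]}

When I download the data from developer tools, this dictionary has 'status': 1, followed by all the data.

Edit: I tried adding the request headers to the request, but it still does not work: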

import requests
import json
url = ('https://api-front.ecoregistry.io/api/project/10/emitcertifications')
hdrs = {"accept": "application/json","accept-language": "en-IN,en;q=0.9,hi-IN;q=0.8,hi;q=0.7,en-GB;q=0.6,en-US;q=0.5","authorization": "Bearer null", "content-type": "application/json","if-none-match": "W/\"1326f-t9xxnBEIbEANJdito3ai64aPjqA\"", "lng": "en", "platform": "ecoregistry","sec-ch-ua": "\" Not A;Brand\";v=\"99\", \"Chromium\";v=\"100\", \"Google Chrome\";v=\"100\"", "sec-ch-ua-mobile": "?0", "sec-ch-ua-platform": "\"Windows\"", "sec-fetch-dest": "empty","sec-fetch-mode": "cors", "sec-fetch-site": "same-site" }
response = requests.get(url, headers = hdrs)
print(response)
info = response.json()
print(json.dumps(info, indent=4))

print(response) outputs '' and info = response.json() raises the traceback error 'Expecting value: line 1 column 1 (char 0)'

Can someone point me in the right direction?

Thanks in advance!

Posting my comment as an answer:

The API needs a header in order to return the data. It is platform: ecoregistry.

import requests

# The 'platform' header is what the API checks for
response = requests.get('https://api-front.ecoregistry.io/api/project/10/emitcertifications', headers={'platform': 'ecoregistry'})
data = response.json()
print(data.keys())
# dict_keys(['status', 'projectSerialYear', 'yearValidation', 'project', 'emitcertifications'])
print(data['emitcertifications'][0].keys())
# dict_keys(['id', 'auth', 'operation', 'typeRemoval', 'consecutive', 'serialInit', 'serialEnd', 'serial', 'passiveSubject', 'passiveSubjectNit', 'isPublicEndUser', 'isAccept', 'isCanceled', 'isCancelProccess', 'isUpdated', 'isKg', 'reasonUsingCarbonOffsetsId', 'reasonUsingCarbonOffsets', 'quantity', 'date', 'nitEnd', 'userEnd'])
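
To tie this back to the question, here is a minimal sketch that combines the header above with the asker's column mapping. The helper names `fetch_certifications` and `certifications_to_frame`, the `COLUMN_KEYS` mapping, and the `timeout` value are hypothetical and only for illustration. The check on `status` assumes, based on the outputs quoted in the question, that a successful payload carries 'status': 1.

```python
import pandas as pd
import requests

API_URL = "https://api-front.ecoregistry.io/api/project/{project_id}/emitcertifications"

# Output column name -> key in each 'emitcertifications' record (from the question).
COLUMN_KEYS = {
    "# Certificate": "consecutive",
    "Carbon offsets destination": "reasonUsingCarbonOffsets",
    "Final user": "userEnd",
    "Taxpayer subject": "passiveSubject",
    "Date": "date",
    "Tons delivered": "quantity",
}


def fetch_certifications(project_id, timeout=30):
    """Download the raw JSON payload for one project (hypothetical helper)."""
    response = requests.get(
        API_URL.format(project_id=project_id),
        headers={"platform": "ecoregistry"},  # header named in the answer
        timeout=timeout,
    )
    response.raise_for_status()  # fail early on HTTP-level errors
    payload = response.json()
    # Assumption: 'status' == 1 marks success, as seen in the question.
    if payload.get("status") != 1:
        raise RuntimeError(f"API reported an error: {payload.get('codeMessages')}")
    return payload


def certifications_to_frame(payload):
    """Turn the payload into a DataFrame with the question's column names."""
    rows = [
        {column: record.get(key) for column, key in COLUMN_KEYS.items()}
        for record in payload["emitcertifications"]
    ]
    return pd.DataFrame(rows, columns=list(COLUMN_KEYS))
```

Calling `certifications_to_frame(fetch_certifications(10)).to_csv('Data.csv', index=False)` should then produce the same file as the manual copy-and-paste version, provided the API still accepts that header. Keeping the download and the conversion in separate functions means the conversion can be tried on saved JSON without touching the network.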