Scrape specific JSON data to a CSV

I am trying to scrape some JSON data. The first few rows are shown below, and the remaining rows follow the same format. JSON data:

{
  "data": [
    {
      "date": "2011-10-07",
      "f(avg(output_total)/number(100000000))": 50
    },
    {
      "date": "2011-10-08",
      "f(avg(output_total)/number(100000000))": 50
    },
    {
      "date": "2011-10-12",
      "f(avg(output_total)/number(100000000))": 50
    },
    {
      "date": "2011-10-13",
      "f(avg(output_total)/number(100000000))": 54.0515120216902
    },.......]

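I would like to scrape the dates and their associated values (as in the sample above: 2011-10-07 and 50, 2011-10-08 and 50, etc.) into a CSV file with two columns (date and value).

How should I proceed? Can this be done in Python?

This is how I get the JSON data: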

import os
import requests

url='https://api.blockchair.com/litecoin/transactions?a=date,f(avg(output_total)/number(100000000))'

proxies = {}
response = requests.get(url=url, proxies=proxies)
print(response.content)

json = {
  "data": [
    {
      "date": "2011-10-07",
      "f(avg(output_total)/number(100000000))": 50
    },
    {
      "date": "2011-10-08",
      "f(avg(output_total)/number(100000000))": 50
    },
    {
      "date": "2011-10-12",
      "f(avg(output_total)/number(100000000))": 50
    },
    {
      "date": "2011-10-13",
      "f(avg(output_total)/number(100000000))": 54.0515120216902
    }]}

Step 1: Convert the JSON to a pandas DataFrame

import pandas as pd

df = pd.DataFrame(json['data'])

Step 2: Filter the DataFrame on a condition (for example, value == 50)

df_filtered = df[(df["f(avg(output_total)/number(100000000))"] == 50)]

Step 3: Save the DataFrame to a CSV file, choosing where on your computer to store it.

df_filtered.to_csv(r'C:\user\foo\output.csv', index = False)

If you want the index included in the file, just remove index = False.

pandas lets you solve this in a few lines:

import pandas as pd
# json_data is the parsed response, e.g. requests.get(url).json()
df = pd.DataFrame(json_data['data'])
df.columns = ["date", "value"]
df.to_csv("data.csv", index=False)

You can try this:

import requests
import csv
import pandas as pd

url='https://api.blockchair.com/litecoin/transactions?a=date,f(avg(output_total)/number(100000000))'
csv_name = 'res_values_1.csv'

response = requests.get(url=url).json()
res_data = response.get('data', [])

# Solution using pandas
res_df = pd.DataFrame(res_data)
res_df.rename(columns={'f(avg(output_total)/number(100000000))': 'value'}, inplace=True)

# keep only the rows whose value is >= 50
filtered_res_df = res_df[(res_df["value"] >= 50)]
filtered_res_df.to_csv(csv_name, sep=',', encoding='utf-8', index = False)

# Solution using csv
csv_name = 'res_values_2.csv'
headers = ['date', 'value']
with open(csv_name, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(headers)

    for data in res_data:
        values = list(data.values())
        if values[1] >= 50:
            writer.writerow(values)

CSV output:

date,value
2011-10-07,50.0
2011-10-08,50.0
2011-10-12,50.0
2011-10-13,54.0515120216902
.
.
.
2021-10-05,346.12752821011594
2021-10-06,293.5061907016782
2021-10-07,333.17665010641673
2021-10-08,332.2437737707938
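
The csv-based loop above takes `list(data.values())`, so it relies on every record listing the date first and the value second. As a minimal sketch, assuming the same two keys as in the question, the fields can be looked up by name instead. The function name `records_to_csv`, the `min_value` parameter, and the third sample row are made up for illustration, and the text is written to an in-memory buffer rather than a file so it is easy to inspect.

```python
import csv
import io

# Long column name returned by the API query in the question.
VALUE_KEY = "f(avg(output_total)/number(100000000))"


def records_to_csv(records, min_value=None):
    """Return CSV text with date,value rows; optionally keep values >= min_value."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["date", "value"])
    for record in records:
        value = record[VALUE_KEY]  # look up by name, not by position
        if min_value is None or value >= min_value:
            writer.writerow([record["date"], value])
    return buffer.getvalue()


sample = [
    {"date": "2011-10-07", VALUE_KEY: 50},
    {VALUE_KEY: 54.0515120216902, "date": "2011-10-13"},  # keys in reverse order
    {"date": "2011-10-14", VALUE_KEY: 49.5},  # hypothetical row, only to show filtering
]
print(records_to_csv(sample, min_value=50))
```

With the real response, `records` would be `response.get('data', [])` from the code above, and the returned text could be written to a file with a plain `open(..., 'w', newline='')`.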

You can do it like this.

Parse the JSON string, iterate over the records, extract the data you need, and write it to a CSV file.

import json
import csv
fields = ['Date', 'Value']
filename = 'test.csv'
s = """
{
   "data":[
      {
         "date":"2011-10-07",
         "f(avg(output_total)/number(100000000))":50
      },
      {
         "date":"2011-10-08",
         "f(avg(output_total)/number(100000000))":50
      },
      {
         "date":"2011-10-12",
         "f(avg(output_total)/number(100000000))":50
      },
      {
         "date":"2011-10-13",
         "f(avg(output_total)/number(100000000))":54.0515120216902
      }
   ]
}
"""
x = json.loads(s)
with open(filename, 'w', newline='') as f:
    cw = csv.writer(f)
    cw.writerow(fields)

    for i in x['data']:
        cw.writerow(i.values())

test.csv

Date        Value
07-10-11    50
08-10-11    50
12-10-11    50
13-10-11    54.05151202
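
The table above looks like the file as displayed by a spreadsheet program, which may reformat dates and round numbers; that is an assumption, since the answer does not say how the view was produced. To see what was actually written, one option is to read the file back with `csv.DictReader`. The sketch below writes two of the sample rows to a temporary file (the name `check.csv` is made up) and reads them back.

```python
import csv
import os
import tempfile

rows = [("2011-10-07", 50), ("2011-10-13", 54.0515120216902)]

# Write to a temporary file so the sketch does not touch test.csv.
path = os.path.join(tempfile.mkdtemp(), "check.csv")
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Date", "Value"])
    writer.writerows(rows)

# Read the file back; every field comes back as a string.
with open(path, newline="") as f:
    read_back = list(csv.DictReader(f))

for row in read_back:
    print(row["Date"], float(row["Value"]))
```

The dates come back in the same `YYYY-MM-DD` form they were written in, and the values keep their full precision once converted with `float`.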

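If you just want a CSV file and do not want to depend on extra Python modules such as pandas (requests is still needed for the download), it is quite simple: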

import requests
CSV = 'blockchair.csv'
url='https://api.blockchair.com/litecoin/transactions?a=date,f(avg(output_total)/number(100000000))'
with requests.Session() as session:
    response = session.get(url)
    response.raise_for_status()
    with open(CSV, 'w') as csv:
        csv.write('Date,Value\n')
        for d in response.json()['data']:
            for i, v in enumerate(d.values()):
                if i > 0:
                    csv.write(',')
                csv.write(str(v))
            csv.write('\n')
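
Writing the commas by hand as above is fine for this API, because neither the dates nor the numbers contain commas or quotes. If a field ever could, the standard library's `csv.writer` would quote it automatically. The sketch below compares the two approaches on a made-up value that contains a comma; the row is hypothetical and does not come from the API.

```python
import csv
import io

row = ["2011-10-07", "50, approx."]  # hypothetical value containing a comma

# Hand-written join: the extra comma silently creates a third column.
naive_line = ",".join(str(v) for v in row)

# csv.writer wraps the field in quotes so it stays a single column.
buffer = io.StringIO(newline="")
csv.writer(buffer, lineterminator="\n").writerow(row)
quoted_line = buffer.getvalue().rstrip("\n")

print(naive_line)
print(quoted_line)
```

Parsing `quoted_line` with `csv.reader` gives back the original two fields, while `naive_line` parses as three.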