Python Imgur API JSON output to CSV

I'm new to Python and to coding in general, but I somehow managed to string together the Imgur API so that it gives me a JSON output file.

My end goal is to get the file into Excel alongside some other data I've already collected, so I'd like to be able to convert the API output to CSV.

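So far my only solution has been to take the JSON output and paste it into an online converter. (I've tried what other people have suggested online, but I've never been able to get it to work.)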

Here is a sample of the JSON output (I'm fairly sure it doesn't have any nested parts):

{"status": 200, "data": {"in_gallery": false, "deletehash": "pfSgnqtf9eh4r2B", "layout": "blog", "description": null, "title": null, "cover_height": 177, "views": 0, "privacy": "public", "cover": "P1tTbZw", "images_count": 2, "datetime": 1468959627, "account_url": "JosephL32", "favorite": false, "cover_width": 222, "link": "http://imgur.com/a/3I3H7", "is_ad": false, "section": null, "images": [{"datetime": 1468959628, "bandwidth": 0, "nsfw": null, "vote": null, "id": "P1tTbZw", "account_id": null, "in_gallery": false, "title": null, "section": null, "width": 222, "size": 48248, "type": "image/png", "is_ad": false, "deletehash": "mGqP4DFgDtBZG8Y", "description": null, "views": 0, "link": "http://i.imgur.com/P1tTbZw.png", "height": 177, "name": "Screen Shot 2016-07-19 at 4.20.05 PM", "favorite": false, "account_url": null, "animated": false}, {"datetime": 1468959630, "bandwidth": 0, "nsfw": null, "vote": null, "id": "5zGa1go", "account_id": null, "in_gallery": false, "title": null, "section": null, "width": 221, "size": 74481, "type": "image/png", "is_ad": false, "deletehash": "LnJxl5rltxsIFl2", "description": null, "views": 0, "link": "http://i.imgur.com/5zGa1go.png", "height": 152, "name": "Screen Shot 2016-07-19 at 4.19.59 PM", "favorite": false, "account_url": null, "animated": false}], "nsfw": null, "id": "3I3H7", "account_id": 37918982}, "success": true}

In short, I'm looking for Python code that I can run after getting the JSON data to save it as a CSV file.

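This shouldn't be too hard, since both JSON and CSV file structures can be represented fairly easily in Python using dictionaries. First, though, I should point out that the JSON data is in fact nested, which becomes clear once it is formatted nicely: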

{
  "status": 200,
  "data":
  {
    "in_gallery": false,
    "deletehash": "pfSgnqtf9eh4r2B",
    "layout": "blog",
    "description": null,
    "title": null,
    "cover_height": 177,
    "views": 0,
    "privacy": "public",
    "cover": "P1tTbZw",
    "images_count": 2,
    "datetime": 1468959627,
    "account_url": "JosephL32",
    "favorite": false,
    "cover_width": 222,
    "link": "http://imgur.com/a/3I3H7",
    "is_ad": false,
    "section": null,
    "images":
    [
      {
        "datetime": 1468959628,
        "bandwidth": 0,
        "nsfw": null,
        "vote": null,
        "id": "P1tTbZw",
        "account_id": null,
        "in_gallery": false,
        "title": null,
        "section": null,
        "width": 222,
        "size": 48248,
        "type": "image/png",
        "is_ad": false,
        "deletehash": "mGqP4DFgDtBZG8Y",
        "description": null,
        "views": 0,
        "link": "http://i.imgur.com/P1tTbZw.png",
        "height": 177,
        "name": "Screen Shot 2016-07-19 at 4.20.05 PM",
        "favorite": false,
        "account_url": null,
        "animated": false
      },
      {
        "datetime": 1468959630,
        "bandwidth": 0,
        "nsfw": null,
        "vote": null,
        "id": "5zGa1go",
        "account_id": null,
        "in_gallery": false,
        "title": null,
        "section": null,
        "width": 221,
        "size": 74481,
        "type": "image/png",
        "is_ad": false,
        "deletehash": "LnJxl5rltxsIFl2",
        "description": null,
        "views": 0,
        "link": "http://i.imgur.com/5zGa1go.png",
        "height": 152,
        "name": "Screen Shot 2016-07-19 at 4.19.59 PM",
        "favorite": false,
        "account_url": null,
        "animated": false
      }
    ],
    "nsfw": null,
    "id": "3I3H7",
    "account_id": 37918982
  },
  "success": true
}

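The nested structure of this JSON makes the problem a bit trickier, because it contains two image dictionaries, each with the same set of attributes. It's not clear how you'd like the final data formatted, but I'll aim for an output CSV in which each row corresponds to one image in the album.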

The first thing we need to do is convert the JSON string into a Python dictionary:

import json

# raw_json is the JSON text returned by the Imgur API, held as a str
json_dict = json.loads(raw_json)
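
Since the question mentions that the API output ends up in a JSON file, here is a minimal sketch of reading such a file directly with json.load instead of json.loads. The helper name load_json_file and the file name album.json are made up for illustration, and the sample content is a cut-down stand-in for the real response.

```python
import json
import os
import tempfile


def load_json_file(path):
    """Parse the JSON file at `path` into Python objects (hypothetical helper)."""
    with open(path, encoding="utf-8") as json_file:
        return json.load(json_file)


# Demo only: write a tiny stand-in file, then read it back.
sample_path = os.path.join(tempfile.mkdtemp(), "album.json")  # assumed file name
with open(sample_path, "w", encoding="utf-8") as f:
    f.write('{"status": 200, "data": {"images_count": 2}, "success": true}')

loaded = load_json_file(sample_path)
```

json.load takes an open file object, while json.loads takes a string; both return the same kind of nested dictionaries and lists.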

Now we can access all of the important data in json_dict['data'], and the list of images in the album (each represented in Python as a dictionary) in json_dict['data']['images'].
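
To make that lookup concrete, here is a small sketch that uses a trimmed-down copy of the sample response (most fields are omitted for brevity, so this is not the full API output) and pulls out a few album-level and image-level values.

```python
import json

# Trimmed-down sample with the same shape as the response shown earlier
raw_json = '''{"status": 200,
 "data": {"title": null, "images_count": 2,
          "link": "http://imgur.com/a/3I3H7",
          "images": [{"id": "P1tTbZw", "link": "http://i.imgur.com/P1tTbZw.png"},
                     {"id": "5zGa1go", "link": "http://i.imgur.com/5zGa1go.png"}]},
 "success": true}'''

json_dict = json.loads(raw_json)

album_data = json_dict['data']        # album-level attributes
images_data = album_data['images']    # list with one dict per image

album_link = album_data['link']
album_title = album_data['title']     # JSON null is parsed as Python None
image_ids = [image['id'] for image in images_data]
```

Note that null values such as the album title come back as None, which the csv module writes out as an empty cell.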

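Let's say you want to output a CSV with the datetime, title, link, and name attributes for each image, as well as the datetime, title, link, and images_count attributes for the album that each image appears in. We'll start by opening a new CSV file and writing a row of column headers at the top of it, then iterate over the images and write a row for each one.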

import csv

# Open new CSV file (newline="" stops Python 3 from adding blank rows on Windows)
with open("output.csv", "w", newline="") as csv_file:
    writer = csv.writer(csv_file)

    # Write CSV headers
    writer.writerow(["datetime", "title", "link", "name", "album_datetime",
                     "album_title", "album_link", "album_images_count"])

    # Write one row per image; the values follow the same order as the headers
    album_data = json_dict['data']
    images_data = album_data['images']
    for image in images_data:
        writer.writerow([image['datetime'],
                         image['title'],
                         image['link'],
                         image['name'],
                         album_data['datetime'],
                         album_data['title'],
                         album_data['link'],
                         album_data['images_count']])
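
As an alternative sketch, csv.DictWriter ties each value to its column name, which makes it harder to get the header order and the value order out of step. The helper names album_to_rows and write_rows, the album_ prefix, and the cut-down sample_album below are assumptions for illustration; the demo writes to an in-memory buffer rather than a file.

```python
import csv
import io

IMAGE_FIELDS = ["datetime", "title", "link", "name"]
ALBUM_FIELDS = ["datetime", "title", "link", "images_count"]


def album_to_rows(album_data):
    """Build one dict per image, with album fields prefixed by 'album_'."""
    rows = []
    for image in album_data["images"]:
        row = {field: image.get(field) for field in IMAGE_FIELDS}
        for field in ALBUM_FIELDS:
            row["album_" + field] = album_data.get(field)
        rows.append(row)
    return rows


def write_rows(rows, csv_file):
    """Write a header line followed by one CSV line per row dict."""
    fieldnames = IMAGE_FIELDS + ["album_" + field for field in ALBUM_FIELDS]
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Cut-down copy of json_dict['data'] from the sample response
sample_album = {
    "datetime": 1468959627, "title": None,
    "link": "http://imgur.com/a/3I3H7", "images_count": 2,
    "images": [
        {"datetime": 1468959628, "title": None,
         "link": "http://i.imgur.com/P1tTbZw.png",
         "name": "Screen Shot 2016-07-19 at 4.20.05 PM"},
        {"datetime": 1468959630, "title": None,
         "link": "http://i.imgur.com/5zGa1go.png",
         "name": "Screen Shot 2016-07-19 at 4.19.59 PM"},
    ],
}

buffer = io.StringIO()
write_rows(album_to_rows(sample_album), buffer)
csv_text = buffer.getvalue()
```

To write a real file instead, pass a file opened with open("output.csv", "w", newline="") to write_rows in place of the buffer.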