Iterate through a nested dict inside of a list, with some missing keys

Question

I need your help understanding how to extract information from a nested dictionary inside a list. Here is the code that fetches the data:

import requests
import json
import time

all_urls = []
for x in range(5000,5010):
    url = f'https://api.jikan.moe/v4/anime/{x}/full'
    all_urls.append(url)

all_responses = []
for page_url in all_urls:
    response = requests.get(page_url)
    all_responses.append(response)
    time.sleep(1)
    print(all_responses)

data = []
for i in all_responses:
    json_data = json.loads(i.text)
    data.append(json_data)

print(data)

A sample of the extracted data looks like this:

[{'status': 404,
  'type': 'BadResponseException',
  'message': 'Resource does not exist',
  'error': '404 on https://myanimelist.net/anime/5000/'},
 {'status': 404,
  'type': 'BadResponseException',
  'message': 'Resource does not exist',
  'error': '404 on https://myanimelist.net/anime/5001/'},
 {'data': {'mal_id': 5002,
   'url': 'https://myanimelist.net/anime/5002/Bari_Bari_Densetsu',
   'images': {'jpg': {'image_url': 'https://cdn.myanimelist.net/images/anime/4/58873.jpg',
     'small_image_url': 'https://cdn.myanimelist.net/images/anime/4/58873t.jpg',
     'large_image_url': 'https://cdn.myanimelist.net/images/anime/4/58873l.jpg'},
    'webp': {'image_url': 'https://cdn.myanimelist.net/images/anime/4/58873.webp',
     'small_image_url': 'https://cdn.myanimelist.net/images/anime/4/58873t.webp',
     'large_image_url': 'https://cdn.myanimelist.net/images/anime/4/58873l.webp'}},
   'trailer': {'youtube_id': None,
    'url': None,
    'embed_url': None,
    'images': {'image_url': None,
     'small_image_url': None,
     'medium_image_url': None,
     'large_image_url': None,
     'maximum_image_url': None}},
   'title': 'Bari Bari Densetsu',
   'title_english': None,
   'title_japanese': 'バリバリ伝説',
   'title_synonyms': ['Baribari Densetsu',
......

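I need to extract the titles from the data list. Any help is appreciated! Also, any suggestions for better/simpler/cleaner code for extracting JSON data from an API would be greatly appreciated!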

Answer 1

First, there is no need to create multiple lists. You can do everything in a single loop:

import requests
import json

data = []

for x in range(5000,5010):
    page_url = f'https://api.jikan.moe/v4/anime/{x}/full'
    response = requests.get(page_url)
    json_data = json.loads(response.text)
    data.append(json_data)

print(data)
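
One thing the single loop above drops is the one-second pause between requests that the question's code had. Here is a minimal sketch that keeps it, assuming the pause was there to go easy on the API. `fetch_all` and its `get` and `delay` parameters are hypothetical names, not part of the original code. It also uses `response.json()`, which requests provides as a shortcut for `json.loads(response.text)`.

```python
import time


def fetch_all(ids, get, delay=1.0):
    """Fetch one JSON document per anime id.

    `get` is any callable that behaves like requests.get (a hypothetical
    parameter, so the function can be tried without network access).
    """
    data = []
    for anime_id in ids:
        response = get(f'https://api.jikan.moe/v4/anime/{anime_id}/full')
        # 404 responses from this API also carry a JSON body (see the
        # sample in the question), so they are kept rather than skipped.
        data.append(response.json())
        time.sleep(delay)  # pause between requests, as in the question
    return data
```

With requests installed, this would be called as `fetch_all(range(5000, 5010), requests.get)`.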

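Second, to solve your problem, you have two options. Note that in the sample the title is nested under the 'data' key, which the 404 entries lack. You can use dict.get:

for dic in data:
    # fall back to an empty dict when 'data' is missing (404 entries)
    title = dic.get('data', {}).get('title', 'no title')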

Or use the try/except pattern:

for dic in data:
    try:
        title = dic['data']['title']
    except KeyError:
        # deal with the case where the entry has no 'data' or no 'title'
        pass
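
Building on the dict.get approach, here is a minimal sketch that collects every title into a list. `extract_titles` and its `default` parameter are hypothetical names; the nesting under 'data' is taken from the sample in the question. It uses `or` instead of a default argument so that a value that is present but None is treated the same way as a missing key.

```python
def extract_titles(data, default='no title'):
    """Return one title per entry, using `default` where none exists."""
    titles = []
    for entry in data:
        # Error entries (e.g. the 404 ones) have no 'data' key at all.
        anime = entry.get('data') or {}
        # `or default` also covers a title that is present but None.
        titles.append(anime.get('title') or default)
    return titles


sample = [
    {'status': 404, 'type': 'BadResponseException'},
    {'data': {'mal_id': 5002, 'title': 'Bari Bari Densetsu'}},
]
print(extract_titles(sample))  # ['no title', 'Bari Bari Densetsu']
```

If you would rather drop the failed lookups than keep a placeholder, filter the result afterwards or skip entries without a 'data' key inside the loop.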
