How to add a data field (key-value) to JSON using language detection on a single data field

Question

I have weather alert data like this:

"alerts": [
    {
        "description": "There is a risk of frost (Level 1 of 2).\nMinimum temperature: ~ -2 \u00b0C",
        "end": 1612522800,
        "event": "frost",
        "sender_name": "DWD / Nationales Warnzentrum Offenbach",
        "start": 1612450800
    },
    {
        "description": "There is a risk of widespread icy surfaces (Level 1 of 3).\ncause: widespread ice formation or traces of snow",
        "end": 1612515600,
        "event": "widespread icy surfaces",
        "sender_name": "DWD / Nationales Warnzentrum Offenbach",
        "start": 1612450800
    },
    {
        "description": "Es treten Windb\u00f6en mit Geschwindigkeiten um 55 km/h (15m/s, 30kn, Bft 7) aus \u00f6stlicher Richtung auf. In exponierten Lagen muss mit Sturmb\u00f6en bis 65 km/h (18m/s, 35kn, Bft 8) gerechnet werden.",
        "end": 1612587600,
        "event": "WINDB\u00d6EN",
        "sender_name": "DWD / Nationales Warnzentrum Offenbach",
        "start": 1612522800
    },

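Now I want to add a key-value pair to every alert dictionary, containing the language detected from the 'description' field. I tried the following, but could not get the syntax right...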

import json
from langdetect import detect

with open("kiel.json", 'r') as f:
    data = json.loads(f.read())

data['ADDED_KEY'] = 'ADDED_VALUE'
#'ADDED_KEY' = 'lang' - should be added as a data field to EVERY alert
#'ADDED_VALUE' = 'en' or 'ger' - should be the detected language [via detect()] from data field 'description' of every alert 

with open("kiel.json", 'w') as f:
    f.write(json.dumps(data, sort_keys=True, indent=4, separators=(',', ': ')))

What this actually did was add the pair once, at the top level of the file:

{
"ADDED_KEY": "ADDED_VALUE",
"alerts": [
    {
        "description": "There is a risk of frost (Level 1 of 2).\nMinimum temperature: ~ -2 \u00b0C",
        "end": 1612522800,
        "event": "frost",
        "sender_name": "DWD / Nationales Warnzentrum Offenbach",
        "start": 1612450800
    },

Can you help me complete the code so that it accesses the right data fields?

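Follow-up:

Now the case has come up where 'alerts' is not included as a data field (e.g. when no alert data is transmitted because the weather is good). I still want to generate the JSON in every case. I tried: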

for item in data['alerts']:
    if 'alerts' not in data:
        continue
else:
    item['lang'] = detect(item['description'])

But if there is no 'alerts' data field, I get

      for item in data['alerts']:
KeyError: 'alerts'

How can I solve this? Isn't 'continue' the right statement here? Or do I have to rearrange the if and the for loop? Thanks again!

Answer 1

You just need to iterate over the list stored under the 'alerts' key and add the key-value pair to each item (each item is a dictionary).

for item in data["alerts"]:
    item["ADDED_KEY"] = "ADDED_VALUE"
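
To see why this works, here is a minimal sketch using a small in-memory dictionary instead of the file (the sample data and the "placeholder" value are made up for illustration). Each item yielded by the loop is the same dictionary object that lives inside data["alerts"], so assigning a key on it changes data itself.

```python
# Hypothetical sample shaped like the question's data (shortened)
data = {
    "alerts": [
        {"event": "frost", "start": 1612450800},
        {"event": "widespread icy surfaces", "start": 1612450800},
    ]
}

# Each item is a reference to a dict inside data["alerts"], not a copy
for item in data["alerts"]:
    item["lang"] = "placeholder"  # stand-in for the detected language

# The new key is present on every alert; nothing is added at the top level
print(data)
```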

Answer 2

The following works. Iterate over the alerts and add the key/value pair as you described.

import json
from langdetect import detect

with open("kiel.json", 'r') as f:
    data = json.loads(f.read())

# Add a 'lang' field to EVERY alert, holding the language detected
# via detect() from that alert's 'description' field (e.g. 'en' or 'de')
for item in data['alerts']:
    item['lang'] = detect(item['description'])

with open("kiel.json", 'w') as f:
    f.write(json.dumps(data, sort_keys=True, indent=4, separators=(',', ': ')))
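
Regarding the follow-up about the KeyError: in the attempt from the question, the check `if 'alerts' not in data` sits inside the loop, but `data['alerts']` is already evaluated in the `for` line, so the error is raised before the check ever runs. One possible fix, sketched below, is to use `data.get('alerts', [])`, which falls back to an empty list when the key is missing. The helper name `add_lang` and the stub `fake_detect` are assumptions made for this sketch so that it runs without the langdetect package; in the real script you would pass `detect` from langdetect instead.

```python
import json


def add_lang(data, detect_fn):
    """Add a 'lang' key to every alert; do nothing if 'alerts' is absent."""
    # .get() returns an empty list when there is no 'alerts' key,
    # so the loop body is simply skipped and no KeyError is raised
    for item in data.get("alerts", []):
        item["lang"] = detect_fn(item["description"])
    return data


def fake_detect(text):
    # Hypothetical stand-in for langdetect.detect, for illustration only
    return "de" if "Es treten" in text else "en"


with_alerts = {
    "alerts": [
        {"description": "There is a risk of frost (Level 1 of 2)."},
        {"description": "Es treten Windböen aus östlicher Richtung auf."},
    ]
}
without_alerts = {"lat": 54.3}  # good weather: no 'alerts' field at all

add_lang(with_alerts, fake_detect)
add_lang(without_alerts, fake_detect)  # left unchanged, no exception

print(json.dumps(with_alerts, ensure_ascii=False, indent=4))
print(json.dumps(without_alerts, indent=4))
```

An equivalent alternative is to keep the explicit check but move it in front of the loop (`if 'alerts' in data:` followed by the `for` loop). In both variants the file is still written afterwards, so the JSON is generated whether or not alerts were present.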
