Turn special characters into ASCII-like characters or something else without losing readability

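I am trying to format data from an ics calendar file into any output, e.g. json or even Python's print(). I am looking for a good way to replace special characters with ASCII-like characters without losing readability. Examples below. Any suggestions?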

Summary field values in the ics file

FORMULA 1 HEINEKEN GRANDE PRÉMIO DE PORTUGAL 2021 - Race
FORMULA 1 MYWORLD GROSSER PREIS VON ÖSTERREICH 2021 - Race

Summary key values in the json file

FORMULA 1 HEINEKEN GRANDE PR\u00c3\u0089MIO DE PORTUGAL 2021 - Race
FORMULA 1 MYWORLD GROSSER PREIS VON \u00c3\u0096STERREICH 2021 - Race

Sample code to reproduce the problem

import requests
import json
from icalendar import Calendar

## LOGIC HERE ##
def format_text(text):
    text = str(text)
    return text


url = "http://www.formula1.com/calendar/Formula_1_Official_Calendar.ics"
res = requests.get(url)
calendar = Calendar.from_ical(res.text)
events = [
    {
        "id": event["UID"].split("@")[-1].strip(),
        "startTime": event["DTSTART"].dt.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3],
        "summary": format_text(event["SUMMARY"])
    } for event in calendar.walk("VEVENT") if str(event["UID"]).split("@")[0].startswith("Race")]


with open("events.json", "w") as f:
    json.dump(events, f, indent=2)

Open the file with an explicit UTF-8 encoding and pass ensure_ascii=False:

with open("events.json", mode="w", encoding="utf-8") as f:
    json.dump(events, f, indent=2, ensure_ascii=False)

From the json.dump docs:

If ensure_ascii is true (the default), the output is guaranteed to have all incoming non-ASCII characters escaped. If ensure_ascii is false, these characters will be output as-is.
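
To make the effect of the flag concrete, here is a small sketch using json.dumps on a shortened summary from the question. The `event` dictionary is made up for illustration and is not part of the original code.

```python
import json

# Illustrative record; the summary is shortened from one of the question's values.
event = {"summary": "GROSSER PREIS VON ÖSTERREICH"}

escaped = json.dumps(event)                       # default: ensure_ascii=True
readable = json.dumps(event, ensure_ascii=False)  # keep non-ASCII characters

print(escaped)   # non-ASCII characters appear as \uXXXX escapes
print(readable)  # non-ASCII characters are written as-is

# Both forms are valid JSON and decode to the same data.
assert json.loads(escaped) == json.loads(readable) == event
```

Either output is correct JSON; ensure_ascii=False only matters for how readable the file is to a human.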

Use encoding="utf-8" in open, because the default encoding is platform dependent (whatever locale.getpreferredencoding() returns).
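
A minimal sketch of why the explicit encoding helps: it prints the locale's preferred encoding, which may differ from machine to machine, and then writes and reads a file as UTF-8 regardless of it. The `events` list and the temporary file path are assumptions made for the example.

```python
import json
import locale
import os
import tempfile

# The locale's preferred encoding; the result varies by platform and settings.
print(locale.getpreferredencoding(False))

events = [{"summary": "GRANDE PRÊMIO DE SÃO PAULO"}]  # illustrative data

# Work in a temporary directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "events.json")

    # An explicit encoding makes the bytes on disk independent of the locale.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(events, f, ensure_ascii=False, indent=2)

    with open(path, "rb") as f:
        raw = f.read()          # the raw UTF-8 bytes that were written

    with open(path, encoding="utf-8") as f:
        loaded = json.load(f)   # read back with the same encoding

assert loaded == events
```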

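The data of the .ics file should not be decoded first, but passed directly to .from_ical. Use res.content instead of res.text. Calendar then produces data that is correctly decoded as UTF-8 (probably part of the .ICS specification), and print displays the Unicode strings correctly. For JSON, write the file with utf8 encoding and ensure_ascii=False, as @JosefZ also suggested, so that it can be viewed correctly: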

import requests
import json
from icalendar import Calendar

url = 'http://www.formula1.com/calendar/Formula_1_Official_Calendar.ics'
res = requests.get(url)
calendar = Calendar.from_ical(res.content)
events = [
    {
        'id': event['UID'].split('@')[-1].strip(),
        'startTime': event['DTSTART'].dt.strftime('%Y-%m-%dT%H:%M:%S.%f')[:-3],
        'summary': event['SUMMARY']
    } for event in calendar.walk('VEVENT') if str(event['UID']).split('@')[0].startswith('Race')]

for event in events:
    print(event['summary'])

with open('events.json', 'w', encoding='utf8') as f:
    json.dump(events, f, ensure_ascii=False, indent=2)

print output:

FORMULA 1 GULF AIR BAHRAIN GRAND PRIX 2021 - Race
FORMULA 1 PIRELLI GRAN PREMIO DEL MADE IN ITALY E DELL'EMILIA ROMAGNA 2021 - Race
FORMULA 1 HEINEKEN GRANDE PRÉMIO DE PORTUGAL 2021 - Race
FORMULA 1 ARAMCO GRAN PREMIO DE ESPAÑA 2021 - Race
FORMULA 1 GRAND PRIX DE MONACO 2021 - Race
FORMULA 1 AZERBAIJAN GRAND PRIX 2021 - Race
FORMULA 1 HEINEKEN GRAND PRIX DU CANADA 2021 - Race
FORMULA 1 EMIRATES GRAND PRIX DE FRANCE 2021 - Race
FORMULA 1 MYWORLD GROSSER PREIS VON ÖSTERREICH 2021 - Race
FORMULA 1 PIRELLI BRITISH GRAND PRIX 2021 - Race
FORMULA 1 MAGYAR NAGYDÍJ 2021 - Race
FORMULA 1 ROLEX BELGIAN GRAND PRIX 2021 - Race
FORMULA 1 HEINEKEN DUTCH GRAND PRIX 2021 - Race
FORMULA 1 HEINEKEN GRAN PREMIO D’ITALIA 2021 - Race
FORMULA 1 VTB RUSSIAN GRAND PRIX 2021 - Race
FORMULA 1 SINGAPORE AIRLINES SINGAPORE GRAND PRIX 2021 - Race
FORMULA 1 JAPANESE GRAND PRIX 2021 - Race
FORMULA 1 ARAMCO UNITED STATES GRAND PRIX 2021 - Race
FORMULA 1 GRAN PREMIO DE LA CIUDAD DE MÉXICO 2021 - Race
FORMULA 1 HEINEKEN GRANDE PRÊMIO DE SÃO PAULO 2021 - Race
FORMULA 1 ROLEX AUSTRALIAN GRAND PRIX 2021 - Race
FORMULA 1 SAUDI ARABIAN GRAND PRIX 2021 - Race
FORMULA 1 ETIHAD AIRWAYS ABU DHABI GRAND PRIX 2021 - Race

events.json:

[
  {
    "id": "1064",
    "startTime": "2021-03-28T16:00:00.000",
    "summary": "FORMULA 1 GULF AIR BAHRAIN GRAND PRIX 2021 - Race"
  },
  {
    "id": "1065",
    "startTime": "2021-04-18T14:00:00.000",
    "summary": "FORMULA 1 PIRELLI GRAN PREMIO DEL MADE IN ITALY E DELL'EMILIA ROMAGNA 2021 - Race"
  },
  {
    "id": "1066",
    "startTime": "2021-05-02T15:00:00.000",
    "summary": "FORMULA 1 HEINEKEN GRANDE PRÉMIO DE PORTUGAL 2021 - Race"
  },
  {
    "id": "1086",
    "startTime": "2021-05-09T14:00:00.000",
    "summary": "FORMULA 1 ARAMCO GRAN PREMIO DE ESPAÑA 2021 - Race"
  },
  {
    "id": "1067",
    "startTime": "2021-05-23T14:00:00.000",
    "summary": "FORMULA 1 GRAND PRIX DE MONACO 2021 - Race"
  },
  {
    "id": "1068",
    "startTime": "2021-06-06T13:00:00.000",
    "summary": "FORMULA 1 AZERBAIJAN GRAND PRIX 2021 - Race"
  },
  {
    "id": "1069",
    "startTime": "2021-06-13T19:00:00.000",
    "summary": "FORMULA 1 HEINEKEN GRAND PRIX DU CANADA 2021 - Race"
  },
  {
    "id": "1070",
    "startTime": "2021-06-27T14:00:00.000",
    "summary": "FORMULA 1 EMIRATES GRAND PRIX DE FRANCE 2021 - Race"
  },
  {
    "id": "1071",
    "startTime": "2021-07-04T14:00:00.000",
    "summary": "FORMULA 1 MYWORLD GROSSER PREIS VON ÖSTERREICH 2021 - Race"
  },
  {
    "id": "1072",
    "startTime": "2021-07-18T15:00:00.000",
    "summary": "FORMULA 1 PIRELLI BRITISH GRAND PRIX 2021 - Race"
  },
  {
    "id": "1073",
    "startTime": "2021-08-01T14:00:00.000",
    "summary": "FORMULA 1 MAGYAR NAGYDÍJ 2021 - Race"
  },
  {
    "id": "1074",
    "startTime": "2021-08-29T14:00:00.000",
    "summary": "FORMULA 1 ROLEX BELGIAN GRAND PRIX 2021 - Race"
  },
  {
    "id": "1075",
    "startTime": "2021-09-05T14:00:00.000",
    "summary": "FORMULA 1 HEINEKEN DUTCH GRAND PRIX 2021 - Race"
  },
  {
    "id": "1076",
    "startTime": "2021-09-12T14:00:00.000",
    "summary": "FORMULA 1 HEINEKEN GRAN PREMIO D’ITALIA 2021 - Race"
  },
  {
    "id": "1077",
    "startTime": "2021-09-26T13:00:00.000",
    "summary": "FORMULA 1 VTB RUSSIAN GRAND PRIX 2021 - Race"
  },
  {
    "id": "1078",
    "startTime": "2021-10-03T13:00:00.000",
    "summary": "FORMULA 1 SINGAPORE AIRLINES SINGAPORE GRAND PRIX 2021 - Race"
  },
  {
    "id": "1079",
    "startTime": "2021-10-10T06:00:00.000",
    "summary": "FORMULA 1 JAPANESE GRAND PRIX 2021 - Race"
  },
  {
    "id": "1080",
    "startTime": "2021-10-24T20:00:00.000",
    "summary": "FORMULA 1 ARAMCO UNITED STATES GRAND PRIX 2021 - Race"
  },
  {
    "id": "1081",
    "startTime": "2021-10-31T19:00:00.000",
    "summary": "FORMULA 1 GRAN PREMIO DE LA CIUDAD DE MÉXICO 2021 - Race"
  },
  {
    "id": "1082",
    "startTime": "2021-11-07T17:00:00.000",
    "summary": "FORMULA 1 HEINEKEN GRANDE PRÊMIO DE SÃO PAULO 2021 - Race"
  },
  {
    "id": "1083",
    "startTime": "2021-11-21T06:00:00.000",
    "summary": "FORMULA 1 ROLEX AUSTRALIAN GRAND PRIX 2021 - Race"
  },
  {
    "id": "1085",
    "startTime": "2021-12-05T16:00:00.000",
    "summary": "FORMULA 1 SAUDI ARABIAN GRAND PRIX 2021 - Race"
  },
  {
    "id": "1084",
    "startTime": "2021-12-12T13:00:00.000",
    "summary": "FORMULA 1 ETIHAD AIRWAYS ABU DHABI GRAND PRIX 2021 - Race"
  }
]
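
As a footnote on why res.text gave garbled summaries: the escapes shown in the question are consistent with UTF-8 bytes having been decoded as ISO-8859-1, which is the encoding requests falls back to for text responses that declare no charset. Whether that is exactly what happened with this server is an assumption. The sketch below reproduces the effect on a shortened summary without any network access.

```python
import json

summary = "GRANDE PRÉMIO DE PORTUGAL"

raw = summary.encode("utf-8")       # bytes as the server would send them (assumed UTF-8)
garbled = raw.decode("iso-8859-1")  # what a wrong text decoding produces

print(json.dumps(garbled))  # "GRANDE PR\u00c3\u0089MIO DE PORTUGAL"

# Decoding the same bytes as UTF-8 gives the original text back,
# which is why handing the undecoded bytes to the parser works.
assert raw.decode("utf-8") == summary

# The damage is reversible as long as no bytes were lost along the way.
repaired = garbled.encode("iso-8859-1").decode("utf-8")
assert repaired == summary
```

Repairing already-garbled text this way is a fallback only; passing res.content avoids the problem at the source.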