Turn special characters into ASCII-like characters or something else without losing readability
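I am trying to format data from an ics calendar file into any output, such as JSON or even Python's print(). I am looking for a good way to replace special characters with ASCII-like characters without losing readability. Examples are below. Any suggestions?
Summary field values in the ics file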
FORMULA 1 HEINEKEN GRANDE PRÉMIO DE PORTUGAL 2021 - Race
FORMULA 1 MYWORLD GROSSER PREIS VON ÖSTERREICH 2021 - Race
Summary key values in the JSON file
FORMULA 1 HEINEKEN GRANDE PR\u00c3\u0089MIO DE PORTUGAL 2021 - Race
FORMULA 1 MYWORLD GROSSER PREIS VON \u00c3\u0096STERREICH 2021 - Race
Sample code to reproduce the problem
import requests
import json
from icalendar import Calendar

## LOGIC HERE ##
def format_text(text):
    text = str(text)
    return text

url = "http://www.formula1.com/calendar/Formula_1_Official_Calendar.ics"
res = requests.get(url)
calendar = Calendar.from_ical(res.text)
events = [
    {
        "id": event["UID"].split("@")[-1].strip(),
        "startTime": event["DTSTART"].dt.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3],
        "summary": format_text(event["SUMMARY"])
    } for event in calendar.walk("VEVENT") if str(event["UID"]).split("@")[0].startswith("Race")]
with open("events.json", "w") as f:
    json.dump(events, f, indent=2)
with open("events.json", mode="w", encoding="utf-8") as f:
    json.dump(events, f, indent=2, ensure_ascii=False)
If ensure_ascii is true (the default), the output is guaranteed to have all incoming non-ASCII characters escaped. If ensure_ascii is false, these characters will be output as-is.
Use encoding="utf-8" in open, because the default encoding is platform dependent (whatever locale.getpreferredencoding() returns).
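To illustrate the ensure_ascii behaviour quoted above, here is a minimal sketch using only the standard library. The sample string is taken from the question; the variable names are chosen for illustration only.

```python
import json

summary = "GROSSER PREIS VON ÖSTERREICH"

# Default (ensure_ascii=True): non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(summary)

# ensure_ascii=False: characters are written as-is, so the file stays readable.
readable = json.dumps(summary, ensure_ascii=False)

print(escaped)  # "GROSSER PREIS VON \u00d6STERREICH"

# Both forms are valid JSON and decode back to the same string.
assert json.loads(escaped) == json.loads(readable) == summary
```

Either form round-trips correctly through a JSON parser; ensure_ascii=False only changes how the text looks in the file.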
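The data of the .ics file should not be decoded, but passed directly to .from_ical. Use res.content instead. Calendar then produces data correctly decoded as UTF-8 (probably part of the .ICS specification), and print can print the Unicode strings correctly. For JSON, write with the utf8 encoding and ensure_ascii=False, as @JosefZ also suggested, to view it correctly: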
import requests
import json
from icalendar import Calendar

url = 'http://www.formula1.com/calendar/Formula_1_Official_Calendar.ics'
res = requests.get(url)
calendar = Calendar.from_ical(res.content)
events = [
    {
        'id': event['UID'].split('@')[-1].strip(),
        'startTime': event['DTSTART'].dt.strftime('%Y-%m-%dT%H:%M:%S.%f')[:-3],
        'summary': event['SUMMARY']
    } for event in calendar.walk('VEVENT') if str(event['UID']).split('@')[0].startswith('Race')]
for event in events:
    print(event['summary'])
with open('events.json', 'w', encoding='utf8') as f:
    json.dump(events, f, ensure_ascii=False, indent=2)
print output:
FORMULA 1 GULF AIR BAHRAIN GRAND PRIX 2021 - Race
FORMULA 1 PIRELLI GRAN PREMIO DEL MADE IN ITALY E DELL'EMILIA ROMAGNA 2021 - Race
FORMULA 1 HEINEKEN GRANDE PRÉMIO DE PORTUGAL 2021 - Race
FORMULA 1 ARAMCO GRAN PREMIO DE ESPAÑA 2021 - Race
FORMULA 1 GRAND PRIX DE MONACO 2021 - Race
FORMULA 1 AZERBAIJAN GRAND PRIX 2021 - Race
FORMULA 1 HEINEKEN GRAND PRIX DU CANADA 2021 - Race
FORMULA 1 EMIRATES GRAND PRIX DE FRANCE 2021 - Race
FORMULA 1 MYWORLD GROSSER PREIS VON ÖSTERREICH 2021 - Race
FORMULA 1 PIRELLI BRITISH GRAND PRIX 2021 - Race
FORMULA 1 MAGYAR NAGYDÍJ 2021 - Race
FORMULA 1 ROLEX BELGIAN GRAND PRIX 2021 - Race
FORMULA 1 HEINEKEN DUTCH GRAND PRIX 2021 - Race
FORMULA 1 HEINEKEN GRAN PREMIO D’ITALIA 2021 - Race
FORMULA 1 VTB RUSSIAN GRAND PRIX 2021 - Race
FORMULA 1 SINGAPORE AIRLINES SINGAPORE GRAND PRIX 2021 - Race
FORMULA 1 JAPANESE GRAND PRIX 2021 - Race
FORMULA 1 ARAMCO UNITED STATES GRAND PRIX 2021 - Race
FORMULA 1 GRAN PREMIO DE LA CIUDAD DE MÉXICO 2021 - Race
FORMULA 1 HEINEKEN GRANDE PRÊMIO DE SÃO PAULO 2021 - Race
FORMULA 1 ROLEX AUSTRALIAN GRAND PRIX 2021 - Race
FORMULA 1 SAUDI ARABIAN GRAND PRIX 2021 - Race
FORMULA 1 ETIHAD AIRWAYS ABU DHABI GRAND PRIX 2021 - Race
events.json:
[
{
"id": "1064",
"startTime": "2021-03-28T16:00:00.000",
"summary": "FORMULA 1 GULF AIR BAHRAIN GRAND PRIX 2021 - Race"
},
{
"id": "1065",
"startTime": "2021-04-18T14:00:00.000",
"summary": "FORMULA 1 PIRELLI GRAN PREMIO DEL MADE IN ITALY E DELL'EMILIA ROMAGNA 2021 - Race"
},
{
"id": "1066",
"startTime": "2021-05-02T15:00:00.000",
"summary": "FORMULA 1 HEINEKEN GRANDE PRÉMIO DE PORTUGAL 2021 - Race"
},
{
"id": "1086",
"startTime": "2021-05-09T14:00:00.000",
"summary": "FORMULA 1 ARAMCO GRAN PREMIO DE ESPAÑA 2021 - Race"
},
{
"id": "1067",
"startTime": "2021-05-23T14:00:00.000",
"summary": "FORMULA 1 GRAND PRIX DE MONACO 2021 - Race"
},
{
"id": "1068",
"startTime": "2021-06-06T13:00:00.000",
"summary": "FORMULA 1 AZERBAIJAN GRAND PRIX 2021 - Race"
},
{
"id": "1069",
"startTime": "2021-06-13T19:00:00.000",
"summary": "FORMULA 1 HEINEKEN GRAND PRIX DU CANADA 2021 - Race"
},
{
"id": "1070",
"startTime": "2021-06-27T14:00:00.000",
"summary": "FORMULA 1 EMIRATES GRAND PRIX DE FRANCE 2021 - Race"
},
{
"id": "1071",
"startTime": "2021-07-04T14:00:00.000",
"summary": "FORMULA 1 MYWORLD GROSSER PREIS VON ÖSTERREICH 2021 - Race"
},
{
"id": "1072",
"startTime": "2021-07-18T15:00:00.000",
"summary": "FORMULA 1 PIRELLI BRITISH GRAND PRIX 2021 - Race"
},
{
"id": "1073",
"startTime": "2021-08-01T14:00:00.000",
"summary": "FORMULA 1 MAGYAR NAGYDÍJ 2021 - Race"
},
{
"id": "1074",
"startTime": "2021-08-29T14:00:00.000",
"summary": "FORMULA 1 ROLEX BELGIAN GRAND PRIX 2021 - Race"
},
{
"id": "1075",
"startTime": "2021-09-05T14:00:00.000",
"summary": "FORMULA 1 HEINEKEN DUTCH GRAND PRIX 2021 - Race"
},
{
"id": "1076",
"startTime": "2021-09-12T14:00:00.000",
"summary": "FORMULA 1 HEINEKEN GRAN PREMIO D’ITALIA 2021 - Race"
},
{
"id": "1077",
"startTime": "2021-09-26T13:00:00.000",
"summary": "FORMULA 1 VTB RUSSIAN GRAND PRIX 2021 - Race"
},
{
"id": "1078",
"startTime": "2021-10-03T13:00:00.000",
"summary": "FORMULA 1 SINGAPORE AIRLINES SINGAPORE GRAND PRIX 2021 - Race"
},
{
"id": "1079",
"startTime": "2021-10-10T06:00:00.000",
"summary": "FORMULA 1 JAPANESE GRAND PRIX 2021 - Race"
},
{
"id": "1080",
"startTime": "2021-10-24T20:00:00.000",
"summary": "FORMULA 1 ARAMCO UNITED STATES GRAND PRIX 2021 - Race"
},
{
"id": "1081",
"startTime": "2021-10-31T19:00:00.000",
"summary": "FORMULA 1 GRAN PREMIO DE LA CIUDAD DE MÉXICO 2021 - Race"
},
{
"id": "1082",
"startTime": "2021-11-07T17:00:00.000",
"summary": "FORMULA 1 HEINEKEN GRANDE PRÊMIO DE SÃO PAULO 2021 - Race"
},
{
"id": "1083",
"startTime": "2021-11-21T06:00:00.000",
"summary": "FORMULA 1 ROLEX AUSTRALIAN GRAND PRIX 2021 - Race"
},
{
"id": "1085",
"startTime": "2021-12-05T16:00:00.000",
"summary": "FORMULA 1 SAUDI ARABIAN GRAND PRIX 2021 - Race"
},
{
"id": "1084",
"startTime": "2021-12-12T13:00:00.000",
"summary": "FORMULA 1 ETIHAD AIRWAYS ABU DHABI GRAND PRIX 2021 - Race"
}
]
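As a side note on why res.text produced the \u00c3\u0089 sequences shown in the question: those escapes are what the UTF-8 bytes of É look like when decoded as Latin-1. The sketch below reproduces that with the standard library only. It assumes the server sent no charset, so that requests fell back to ISO-8859-1 for res.text; that is likely here but was not verified against the actual response headers.

```python
import json

original = "PRÉMIO"

# Simulate the wrong step: UTF-8 bytes decoded with the Latin-1 codec.
garbled = original.encode("utf-8").decode("latin-1")

# With the default ensure_ascii=True this matches the escapes in the question.
print(json.dumps(garbled))  # "PR\u00c3\u0089MIO"

# Undoing the wrong decode recovers the text, which is why passing the raw
# bytes (res.content) to Calendar.from_ical avoids the problem entirely.
repaired = garbled.encode("latin-1").decode("utf-8")
assert repaired == original
```

The encode/decode round trip is only a diagnostic aid; passing res.content as in the answer above is the cleaner fix.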