File writing: display Unicode characters in finished file?
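I'm trying to create flashcards to memorize Japanese kanji characters, and for that I'm crawling Jitenon, a website with a large collection of kanji definitions, readings, and meanings. I've written a class that holds the relevant information found on each kanji page, and I'm currently trying to save my list of kanji as a json file.
For testing purposes, I'm trying to serialize a single kanji object like this: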
...
var kanji = scraper.GetKanjiDefinition(kanjiUrl);
var jsonOptions = new JsonSerializerOptions() { WriteIndented = true };
var kanjiJson = JsonSerializer.Serialize(kanji, jsonOptions);
File.WriteAllText("kanji_json.json", kanjiJson, Encoding.UTF8);
Here is the example page I scraped, with the corresponding json serialization:
{
"Character": "\u697D",
"MainRadical": "\u6728",
"Strokes": 13,
"KankenLevel": "\uFF19\u7D1A",
"Education": "\u5C0F\u5B66\u6821\uFF12\u5E74\u751F",
"Meanings": [
{
"Indices": [
"\u301C"
],
"Meaning": "\u304A\u3093\u304C\u304F\u3002",
"JapanTypical": false
},
{
"Indices": [
"\u301C"
],
"Meaning": "\u304B\u306A\u3067\u308B\u3002\u97F3\u3092\u304B\u306A\u3067\u308B\u3002\u6F14\u594F\u3059\u308B\u3002",
"JapanTypical": false
},
{
"Indices": [
"\u301C"
],
"Meaning": "\u305F\u306E\u3057\u3044\u3002\u305F\u306E\u3057\u3080\u3002\u3088\u308D\u3053\u3076\u3002",
"JapanTypical": false
},
{
"Indices": [
"\u301C"
],
"Meaning": "\u3053\u306E\u3080\u3002\u611B\u3059\u308B\u3002\u306D\u304C\u3046\u3002\u6C42\u3081\u308B\u3002",
"JapanTypical": false
},
{
"Indices": [],
"Meaning": "\u65E5\u672C\u3089\u304F\u3002\u305F\u3084\u3059\u3044\u3002\u5FC3\u8EAB\u306B\u82E6\u75DB\u304C\u306A\u304F\u3001\u306E\u3073\u306E\u3073\u3059\u308B\u3002",
"JapanTypical": true
}
],
"Readings": [
{
"Reading": "\u30AC\u30AF",
"Yomi": 0,
"MeaningIndices": [
1
],
"Education": "\u5C0F"
},
{
"Reading": "\u30E9\u30AF",
"Yomi": 0,
"MeaningIndices": [
2
],
"Education": "\u5C0F"
},
{
"Reading": "\u30AE\u30E7\u30A6",
"Yomi": 0,
"MeaningIndices": [
3
],
"Education": "\u5C0F"
},
{
"Reading": "\u30B4\u30A6",
"Yomi": 0,
"MeaningIndices": [
3
],
"Education": "\u5C0F"
},
{
"Reading": "\u305F\u306E\uFF08\u3057\u3044\uFF09",
"Yomi": 1,
"MeaningIndices": [],
"Education": "\u5C0F"
},
{
"Reading": "\u305F\u306E\uFF08\u3057\u3080\uFF09",
"Yomi": 1,
"MeaningIndices": [],
"Education": "\u5C0F"
},
{
"Reading": "\u304B\u306A\uFF08\u3067\u308B\uFF09",
"Yomi": 1,
"MeaningIndices": [],
"Education": "\u5C0F"
},
{
"Reading": "\u3053\u306E\uFF08\u3080\uFF09",
"Yomi": 1,
"MeaningIndices": [],
"Education": "\u5C0F"
}
]
}
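I'd like the json file to contain the actual Japanese text, e.g. "Character": "楽" and "MainRadical": "木" and "KankenLevel": "9級", rather than escaped Unicode characters like \u697D. How can I achieve this in .NET?
In case it makes a difference, I'm on Ubuntu 20.04 LTS, and I'm opening my json file in VS Code 1.56.2.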
Set the Encoder in your jsonOptions:
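// Needs: using System.Text.Encodings.Web; and using System.Text.Unicode;
var jsonOptions = new JsonSerializerOptions() {
    // Write characters from all Unicode ranges as-is instead of as \uXXXX escapes.
    Encoder = JavaScriptEncoder.Create(UnicodeRanges.All),
    WriteIndented = true
};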
The above allows all UnicodeRanges, so characters such as 楽 are written as-is instead of as \uXXXX escapes.
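To see the difference the Encoder setting makes, here is a minimal sketch that serializes the same object with and without it. It assumes .NET 5 or later (for top-level statements), and the anonymous object is only a hypothetical stand-in for the scraped kanji class from the question, which is not shown.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Encodings.Web;
using System.Text.Json;
using System.Text.Unicode;

// Hypothetical stand-in for the object returned by scraper.GetKanjiDefinition(...).
var kanji = new { Character = "楽", MainRadical = "木", Strokes = 13 };

// Default options: the default encoder escapes non-ASCII characters as \uXXXX.
var escapedJson = JsonSerializer.Serialize(kanji);

// Options with an encoder that allows every range in UnicodeRanges.All.
var jsonOptions = new JsonSerializerOptions
{
    Encoder = JavaScriptEncoder.Create(UnicodeRanges.All),
    WriteIndented = true
};
var readableJson = JsonSerializer.Serialize(kanji, jsonOptions);

// Write the readable version to disk, as in the question.
File.WriteAllText("kanji_json.json", readableJson, Encoding.UTF8);

Console.WriteLine(escapedJson);
Console.WriteLine(readableJson);
```

Both strings are valid json and deserialize to the same data; the encoder only changes how the characters are written in the file. System.Text.Json also ships JavaScriptEncoder.UnsafeRelaxedJsonEscaping as a more permissive built-in encoder, but check its documentation before using it for output that ends up in a web page.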