C# StreamReader Encoding.UTF8 not working

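I have a C# project in Visual Studio that downloads and parses XML files containing Korean, Chinese, and other Unicode characters. For example, for the Korean artist Taeyang, it should produce XML like this: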

<name>태양</name>

But it returns

<name>??</name>

I tried StreamReader with Encoding.Default, but the result is

<name>태양</name>

Code:

string address = String.Format("http://musicbrainz.org/ws/2/artist/{0}?inc=url-rels", mbids[ord]);
HttpWebRequest newRequest = WebRequest.Create(address) as HttpWebRequest;
newRequest.Headers["If-None-Match"] = etagProf;
newRequest.Headers[HttpRequestHeader.AcceptEncoding] = "gzip";
var response = newRequest.GetResponse();
// Reader
Stream stream = response.GetResponseStream();
StreamReader reader = new StreamReader(stream, Encoding.UTF-8);
string data = reader.ReadToEnd();

And the XML source:

<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://musicbrainz.org/ns/mmd-2.0#">
    <artist type="Person" id="d84e5667-3cbe-4556-b551-9d7e4be95d71">   
        <name>태양</name>
        <sort-name>Taeyang</sort-name><gender>Male</gender>
        <country>KR</country>
        ...........
    </artist>
</metadata>

I'm confused. Why does this happen? Any ideas, guys?

Try the UTF8 encoding:

StreamReader sr= new StreamReader(file_name, System.Text.Encoding.UTF8);
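To see why the encoding passed to StreamReader matters, here is a minimal sketch that decodes the same UTF-8 bytes twice. It is written as top-level statements (C# 9 or later). The ReadAll helper is made up for this illustration, and ISO-8859-1 is only a stand-in for whatever single-byte code page Encoding.Default resolves to on a given machine.

```csharp
using System;
using System.IO;
using System.Text;

// Encode the sample element as UTF-8, the encoding the XML declaration names.
byte[] utf8Bytes = Encoding.UTF8.GetBytes("<name>태양</name>");

// Hypothetical helper: decode a byte array through StreamReader with the given encoding.
static string ReadAll(byte[] bytes, Encoding encoding)
{
    using var reader = new StreamReader(new MemoryStream(bytes), encoding);
    return reader.ReadToEnd();
}

string correct = ReadAll(utf8Bytes, Encoding.UTF8);

// Assumption: ISO-8859-1 stands in for a single-byte "default" code page.
string garbled = ReadAll(utf8Bytes, Encoding.GetEncoding("iso-8859-1"));

Console.WriteLine(correct == "<name>태양</name>");      // the UTF-8 decode round-trips
Console.WriteLine(garbled == correct);                  // the single-byte decode does not
Console.WriteLine(garbled.Length == utf8Bytes.Length);  // one char per byte instead of one per character
```

With a single-byte encoding, each Korean syllable turns into three unrelated characters, which is the kind of garbling a wrong default encoding tends to produce.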

Using the code below (note that I commented out two of your lines)

//newRequest.Headers["If-None-Match"] = "d84e5667-3cbe-4556-b551-9d7e4be95d71";
//newRequest.Headers[HttpRequestHeader.AcceptEncoding] = "gzip";

and changed your line: StreamReader(stream, Encoding.UTF-8);

to: StreamReader(stream, Encoding.UTF8);

I got good results with the characters:

string address = String.Format("http://musicbrainz.org/ws/2/artist/{0}?inc=url-rels","d84e5667-3cbe-4556-b551-9d7e4be95d71");
HttpWebRequest newRequest = WebRequest.Create(address) as HttpWebRequest;
//newRequest.Headers["If-None-Match"] = "d84e5667-3cbe-4556-b551-9d7e4be95d71";
//newRequest.Headers[HttpRequestHeader.AcceptEncoding] = "gzip";
var response = newRequest.GetResponse();
// Reader
Stream stream = response.GetResponseStream();
StreamReader reader = new StreamReader(stream, Encoding.UTF8);
string data = reader.ReadToEnd();
MessageBox.Show(data);
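One possible reason the commented-out Accept-Encoding line matters: if a server honors the gzip request, the response body is compressed, and decoding compressed bytes directly as UTF-8 cannot yield the XML. Whether that happened in the question is not confirmed. The sketch below simulates such a body in memory instead of calling the real service, again as top-level statements (C# 9 or later).

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

string xml = "<name>태양</name>";

// Simulate a gzip-compressed response body (stands in for the real HTTP stream).
byte[] compressed;
using (var buffer = new MemoryStream())
{
    using (var compressor = new GZipStream(buffer, CompressionMode.Compress))
    {
        byte[] raw = Encoding.UTF8.GetBytes(xml);
        compressor.Write(raw, 0, raw.Length);
    }
    compressed = buffer.ToArray();
}

// Decoding the compressed bytes directly as UTF-8 does not give back the XML.
string wrong;
using (var reader = new StreamReader(new MemoryStream(compressed), Encoding.UTF8))
{
    wrong = reader.ReadToEnd();
}

// Decompress first, then decode as UTF-8.
string right;
using (var decompressor = new GZipStream(new MemoryStream(compressed), CompressionMode.Decompress))
using (var reader = new StreamReader(decompressor, Encoding.UTF8))
{
    right = reader.ReadToEnd();
}

Console.WriteLine(right == xml); // True
Console.WriteLine(wrong == xml); // False
```

With HttpWebRequest, setting newRequest.AutomaticDecompression = DecompressionMethods.GZip before calling GetResponse() asks the framework to do the decompression step, so the gzip header could stay in place.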

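I found that Console.WriteLine() cannot display Unicode properly. Korean, Chinese, and in fact every character other than a-z and 0-9 fails to print as expected, because the console window that Console.WriteLine() writes to uses a single font, Raster Fonts.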
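Because the console font can hide whether a string is actually intact, it can help to inspect the string without rendering any glyphs. The sketch below is one way to do that. CodeUnits is a hypothetical helper name, and the code is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Linq;

// Hypothetical helper: format each UTF-16 code unit as U+XXXX so the string
// can be checked without relying on the console font.
static string CodeUnits(string s) =>
    string.Join(" ", s.Select(c => $"U+{(int)c:X4}"));

string intact = "태양";
string lost = "??";

Console.WriteLine(CodeUnits(intact)); // U+D0DC U+C591
Console.WriteLine(CodeUnits(lost));   // U+003F U+003F
```

If the dump shows the Hangul code units, the data is fine and only the display is at fault. If it shows U+003F, the characters were already replaced by literal question marks earlier in the pipeline. Setting Console.OutputEncoding = Encoding.UTF8 may also help with display, though the glyphs still depend on the font the console uses.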

But the main problem was my database connection: I had forgotten to add charset=utf-8 to my connection string.
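As a rough sketch of that fix, the generic DbConnectionStringBuilder from System.Data.Common can add the option if it is missing. The server and database names below are placeholders, and the charset key and value are copied from the post. The exact keyword and accepted values depend on the database provider, so treat them as assumptions and check the provider's documentation. The code uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Data.Common;

// Hypothetical base connection string; server and database names are placeholders.
var builder = new DbConnectionStringBuilder
{
    ConnectionString = "server=localhost;database=music"
};

// Add the charset option mentioned in the post if it is not already present.
// Assumption: the provider in use recognizes this key and value.
if (!builder.ContainsKey("charset"))
{
    builder["charset"] = "utf-8";
}

Console.WriteLine(builder.ConnectionString);
```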