HtmlAgilityPack returns text with the wrong encoding
In my program I get text like this:
Ký Sinh Trùng  - (2019)
This is wrong; it should look like this:
Ký Sinh Trùng - (2019)
I tried the code below, but it had no effect:
byte[] bytes = Encoding.Default.GetBytes(nodes.InnerText);
var myString = Encoding.UTF8.GetString(bytes);
How can I fix this?
Update: full code:
HtmlWeb Webget = new HtmlWeb();
var docx = await Webget.LoadFromWebAsync(@"https://isubtitles.org/search?kwd=parasite");
var items = docx.DocumentNode.SelectNodes("//div[@class='movie-list-info']");
foreach (var node in items)
{
    var name = node?.SelectSingleNode(".//div/div[2]/h3/a");
    var xxxx = name?.InnerText;
    byte[] bytes = Encoding.UTF8.GetBytes(xxxx);
    var myString = Encoding.UTF8.GetString(bytes);
    Debug.WriteLine(myString);
    return;
}
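That is just HTML-encoded text (numeric character references such as &#253; for ý), not a character-encoding problem, so there is nothing wrong with it. Converting the string to bytes and back cannot change it, because the entities are plain ASCII characters. If you want to decode it, use: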
System.Net.WebUtility.HtmlDecode(theHtmlEncodedString)
https://docs.microsoft.com/en-us/dotnet/api/system.net.webutility.htmldecode?view=net-5.0
Or (if you have System.Web loaded):
System.Web.HttpUtility.HtmlDecode(theHtmlEncodedString)
https://docs.microsoft.com/en-us/dotnet/api/system.web.httputility.htmldecode?view=net-5.0
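As a minimal sketch of the difference, the snippet below applies both approaches to the sample string from the question. It assumes a console project targeting .NET 5 or later (top-level statements); the variable names raw, roundTripped and decoded are only illustrative, and in the real code the input would be name?.InnerText rather than a literal.

```csharp
using System;
using System.Net;
using System.Text;

// Sample text from the question. The "&#253;" and "&#249;" parts are
// HTML numeric character references made of plain ASCII characters.
string raw = "K&#253; Sinh Tr&#249;ng - (2019)";

// Encoding the string to bytes and decoding it again with the same
// encoding yields the same string, so the entities stay untouched.
string roundTripped = Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(raw));

// HtmlDecode resolves the references: &#253; is U+00FD (ý), &#249; is U+00F9 (ù).
string decoded = WebUtility.HtmlDecode(raw);

Console.WriteLine(roundTripped == raw); // the byte round trip changed nothing
Console.WriteLine(decoded);
```

In the loop from the question, this would presumably mean passing name?.InnerText through WebUtility.HtmlDecode instead of through the GetBytes/GetString pair. HtmlDecode returns null for null input, whereas Encoding.UTF8.GetBytes throws on null.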