Error when trying to load HTML with HtmlAgilityPack
I am trying to run this code:
string path = "http://warisons.rssing.com/chan1729325/all_p43.html";
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc.LoadHtml(path);
var div = htmlDoc.DocumentNode.Descendants("div");
foreach (var x in div)
{
Console.WriteLine(x.Attributes["class"].Value);
}
When I debug this code, I get the following error at htmlDoc.LoadHtml(path);
Locating source for
'd:\SVN_CHECKOUT\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs'.
Checksum: MD5 {4e 14 d3 b d5 30 6e 2c bf 84 ab 8a 96 82 4a 8f} The
file
'd:\SVN_CHECKOUT\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs'
does not exist. Looking in script documents for
'd:\SVN_CHECKOUT\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs'...
Looking in the projects for
'd:\SVN_CHECKOUT\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs'.
The file was not found in a project. Looking in directory 'C:\Program
Files (x86)\Microsoft Visual Studio 12.0\VC\crt\src\'... Looking in
directory 'C:\Program Files (x86)\Microsoft Visual Studio
12.0\VC\crt\src\vccorlib\'... Looking in directory 'C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\atlmfc\src\mfc\'... Looking in
directory 'C:\Program Files (x86)\Microsoft Visual Studio
12.0\VC\atlmfc\src\atl\'... Looking in directory 'C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\atlmfc\include'... The debug
source files settings for the active solution indicate that the
debugger will not ask the user to find the file:
d:\SVN_CHECKOUT\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs.
The debugger could not locate the source file
'd:\SVN_CHECKOUT\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs'.
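Your attempt to load the HTML document from a URI is incorrect.
The method HtmlDocument.LoadHtml loads HTML from the string it is given, so its argument must be the HTML text itself, not a URI.
To load HTML from a URI, you need something like this: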
string path = "http://warisons.rssing.com/chan1729325/all_p43.html";
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlWeb().Load(path);
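To make the difference concrete, here is a minimal sketch, assuming the HtmlAgilityPack package is referenced. The class LoadHtmlSketch and its two helpers are hypothetical names used only for illustration. CountDivs hands a string to LoadHtml, which parses it as markup and never fetches anything, while LoadFromUri lets HtmlWeb download the page before parsing.

```csharp
using System.Linq;
using HtmlAgilityPack;

static class LoadHtmlSketch
{
    // Hypothetical helper: parse a string with LoadHtml and count the <div> elements found.
    public static int CountDivs(string text)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(text); // the argument is treated as markup; nothing is downloaded
        return doc.DocumentNode.Descendants("div").Count();
    }

    // Hypothetical helper: download the page at the given URI, then parse it.
    public static HtmlDocument LoadFromUri(string uri)
    {
        return new HtmlWeb().Load(uri); // performs the HTTP request
    }
}
```

Passing the URL itself to CountDivs should find no div elements, because the URL is parsed as plain text; passing real markup such as "<div>hello</div>" finds them.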
Also note that you can get a NullReferenceException here:
x.Attributes["class"].Value
because you access the value without first checking that the class attribute exists (x.Attributes["class"] != null).
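The null check can be sketched as follows, again assuming the HtmlAgilityPack package is referenced. DivClasses.Collect is a hypothetical helper, not part of the library. It relies on the Attributes indexer returning null when a div has no class attribute, and skips those nodes instead of dereferencing them.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

static class DivClasses
{
    // Hypothetical helper: collect the class value of every <div> that has one.
    public static List<string> Collect(HtmlDocument doc)
    {
        var result = new List<string>();
        foreach (var div in doc.DocumentNode.Descendants("div"))
        {
            var attr = div.Attributes["class"]; // null when the attribute is absent
            if (attr != null)
            {
                result.Add(attr.Value);
            }
        }
        return result;
    }
}
```

In the original loop, the same guard would wrap the Console.WriteLine call, so that divs without a class attribute are skipped rather than throwing.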