AngleSharp Parsing
I can't find many examples of parsing with AngleSharp when there is no class name or ID to select on.
HTML
<span><a href="google.com" title="Google"><span class="icon icon_none"></span></a></span>
<span><a href="bing.com" title="Bing"><span class="icon icon_none"></span></a></span>
<span><a href="yahoo.com" title="Yahoo"><span class="icon icon_none"></span></a></span>
I want to get the href from any <a> tag whose title is Bing.
In Python with BeautifulSoup I would use
item_needed = a_row.find('a', {'title': 'Bing'})
and then grab the href attribute,
or in jQuery
a[title='Bing']
However, I can't work out how to do this with AngleSharp, e.g. following the example below:
https://github.com/AngleSharp/AngleSharp/wiki/Examples#getting-certain-elements
C# AngleSharp
var parser = new AngleSharp.Parser.Html.HtmlParser();
var document = parser.Parse(@"<span><a href=""google.com"" title=""Google""><span class=""icon icon_none""></span></a></span>< span >< a href = ""bing.com"" title = ""Bing"" >< span class=""icon icon_none""></span></a></span><span><a href = ""yahoo.com"" title=""Yahoo""><span class=""icon icon_none""></span></a></span>");
//Do something with LINQ
var blueListItemsLinq = document.All.Where(m => m.LocalName == "a" && //stuck);
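There appears to be a problem in your HTML markup that prevents AngleSharp from finding the target element, namely the spaces around the angle brackets. An HTML parser treats a `<` followed by a space as plain text rather than the start of a tag, so the Bing link never becomes an <a> element: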
< span >< a href = ""bing.com"" title = ""Bing"" >< span class=""icon icon_none"">
After fixing the HTML, both the LINQ query and the CSS selector successfully select the target link:
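// AngleSharp 0.10+ namespace; on 0.9.x use AngleSharp.Parser.Html.HtmlParser and parser.Parse(...) instead
var parser = new AngleSharp.Html.Parser.HtmlParser();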
var document = parser.ParseDocument(@"<span><a href=""google.com"" title=""Google""><span class=""icon icon_none""></span></a></span><span><a href = ""bing.com"" title = ""Bing""><span class=""icon icon_none""></span></a></span><span><a href = ""yahoo.com"" title=""Yahoo""><span class=""icon icon_none""></span></a></span>");
// LINQ example (requires `using System.Linq;`)
var blueListItemsLinq = document.All
    .Where(m => m.LocalName == "a" &&
                m.GetAttribute("title") == "Bing"
    );
// Equivalent CSS selector example
var blueListItemsCSS = document.QuerySelectorAll("a[title='Bing']");
// Print each href attribute value to the console
foreach (var item in blueListItemsCSS)
{
    Console.WriteLine(item.GetAttribute("href"));
}
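If only one matching link is expected, a single-element lookup may be simpler than iterating over a collection. The following is a minimal sketch, assuming AngleSharp 0.10 or later (the AngleSharp.Html.Parser namespace and ParseDocument); the Program class and the shortened HTML string are illustrative and not part of the original question.

```csharp
using System;
using AngleSharp.Html.Parser; // assumes AngleSharp 0.10+

class Program
{
    static void Main()
    {
        var parser = new HtmlParser();

        // Shortened, well-formed version of the markup from the question.
        var document = parser.ParseDocument(
            @"<span><a href=""google.com"" title=""Google""></a></span>
              <span><a href=""bing.com"" title=""Bing""></a></span>");

        // QuerySelector returns the first matching element, or null if none matches.
        var link = document.QuerySelector("a[title='Bing']");

        // GetAttribute returns the attribute text as written in the markup.
        Console.WriteLine(link?.GetAttribute("href") ?? "(no match)");
    }
}
```

The null-conditional check guards against the case from the question, where malformed markup meant no <a> element with that title existed in the parsed document.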