HtmlAgilityPack: retrieve InnerText without the children's InnerText
I have
<p class="MyClass">
<span>Value:</span>
12345
</p>
I only want to retrieve 12345, if that is possible. Thanks.
I'm not sure this is the most elegant solution, but:
using HtmlAgilityPack;
using System;
using ScrapySharp.Extensions;
using System.Linq;
using HtmlAgilityPack.CssSelectors.NetCore;
namespace Whosebug
{
    class Program
    {
        static void Main(string[] args)
        {
            var doc = new HtmlDocument();
            doc.LoadHtml(@"<p class='MyClass'>
<span>Value:</span>
12345
</p>");
            // Using ScrapySharp.Extensions
            var p = doc.DocumentNode.CssSelect("p")?.FirstOrDefault();
            var span = p.CssSelect("span")?.FirstOrDefault();
            // Remove the span's text from the paragraph's text, then trim whitespace
            Console.WriteLine(p.InnerText.Replace(span.InnerText, string.Empty)?.Trim());
            // Using HtmlAgilityPack.CssSelectors.NetCore
            // The lambda parameter is named node because p is already declared in this scope
            var results = doc.QuerySelectorAll("p")?.Select(node => node.InnerText.Replace(node.QuerySelector("span").InnerText, string.Empty)?.Trim());
            foreach (var result in results)
                Console.WriteLine(result);
        }
    }
}
P.S.: I'm used to combining ScrapySharp with HtmlAgilityPack, but I see there is also HtmlAgilityPack.CssSelectors.NetCore, which may be the more common choice these days.
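Removing the span's text with Replace works for this markup, but it can also remove matching text from the paragraph's own content if the same string happens to appear there. A possible alternative, sketched here with plain HtmlAgilityPack and no extra selector package, is to keep only the text nodes that are direct children of the p element. The class name DirectTextExample and the XPath expression are illustrative choices, not part of the original answer, and the sketch assumes the first matching paragraph is the one of interest.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class DirectTextExample
{
    static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(@"<p class='MyClass'>
<span>Value:</span>
12345
</p>");

        // Assumption: the first <p class='MyClass'> is the node we want.
        var p = doc.DocumentNode.SelectSingleNode("//p[@class='MyClass']");
        if (p == null)
        {
            Console.WriteLine("No matching <p> element found.");
            return;
        }

        // Keep only text nodes that are direct children of <p>;
        // text inside <span> belongs to the span's own child node and is skipped.
        var ownText = string.Concat(
            p.ChildNodes
             .Where(n => n.NodeType == HtmlNodeType.Text)
             .Select(n => n.InnerText)).Trim();

        Console.WriteLine(ownText);
    }
}
```

For the sample markup this should print 12345. If the text may contain HTML entities, passing the result through HtmlEntity.DeEntitize is one way to decode them.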