How to create a new HtmlAgilityPack.HtmlDocument from a List<HtmlNode> produced by a first "select/filter" run?
I am using Html Agility Pack. How can I create a new HtmlAgilityPack.HtmlDocument from a list of nodes that I filtered out of the original .html?
// Filter the original .html and collect all the nodes I want to edit later
LstAllTablesDocNodes =
    htmlDoc.DocumentNode.SelectNodes("//table[@class='pricelist']").ToList();
// Pseudocode for what I would like to do (this gives an error)
HtmlAgilityPack.HtmlDocument htmlDoc2 =
    new HtmlAgilityPack.HtmlDocument(LstAllTablesDocNodes);
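Loop over the retrieved nodes, extract their HTML (OuterHtml), and combine it into a single string. Then load that string into your new HtmlDocument. In some cases, for example with tr nodes, you may need to add the parent wrapper node (table, in the case of tr) so that the HTML parsing in the new document is not thrown off.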
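using System;
using System.Text;
using HtmlAgilityPack;

public class Program
{
    public static void Main()
    {
        // Download and parse the source page.
        var hw = new HtmlWeb();
        HtmlDocument doc = hw.Load("http://books.toscrape.com/");

        // First "select/filter" run. SelectNodes returns null when nothing matches.
        HtmlNodeCollection books = doc.DocumentNode.SelectNodes("//h3/a");
        if (books == null)
        {
            Console.WriteLine("No matching nodes found.");
            return;
        }

        // Concatenate the outer HTML of every selected node into one string.
        var output = new StringBuilder();
        foreach (HtmlNode book in books)
        {
            output.Append(book.OuterHtml);
        }

        // Parse the combined markup into a second, independent document.
        var doc2 = new HtmlDocument();
        doc2.LoadHtml(output.ToString());
        Console.WriteLine(doc2.DocumentNode.InnerHtml);
    }
}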
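For the tr case mentioned above, the same idea can be sketched with a wrapper element added around the collected rows. This is only a minimal sketch: BuildDocument and its wrapperTag parameter are hypothetical names of my own, not part of HtmlAgilityPack, and the inline sample markup stands in for the original .html.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using HtmlAgilityPack;

public static class NodeListToDocument
{
    // Hypothetical helper (not an HtmlAgilityPack API): serializes the nodes,
    // optionally wraps them in a parent element, and parses the result
    // into a new, independent HtmlDocument.
    public static HtmlDocument BuildDocument(IEnumerable<HtmlNode> nodes, string wrapperTag = "")
    {
        var html = new StringBuilder();
        if (wrapperTag.Length > 0)
        {
            html.Append('<').Append(wrapperTag).Append('>');
        }
        foreach (HtmlNode node in nodes)
        {
            html.Append(node.OuterHtml);
        }
        if (wrapperTag.Length > 0)
        {
            html.Append("</").Append(wrapperTag).Append('>');
        }

        var result = new HtmlDocument();
        result.LoadHtml(html.ToString());
        return result;
    }

    public static void Main()
    {
        // Assumed sample input; in the question this would be the original .html.
        var source = new HtmlDocument();
        source.LoadHtml(
            "<table class='pricelist'><tr><td>A</td></tr><tr><td>B</td></tr></table>" +
            "<table class='other'><tr><td>C</td></tr></table>");

        // SelectNodes returns null when nothing matches, so guard before use.
        HtmlNodeCollection rows =
            source.DocumentNode.SelectNodes("//table[@class='pricelist']//tr");
        if (rows == null)
        {
            Console.WriteLine("No matching rows.");
            return;
        }

        // Wrap the bare tr nodes in a table so the new document stays well formed.
        HtmlDocument rowsDoc = BuildDocument(rows, "table");
        Console.WriteLine(rowsDoc.DocumentNode.OuterHtml);
    }
}
```

For whole table nodes, as in the question, the wrapper can be left out: BuildDocument(LstAllTablesDocNodes) should be enough, since each table already carries its own parent element.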
References:
- HtmlAgilityPack substring of all by length
- https://www.tutorialsteacher.com/csharp/csharp-stringbuilder