HtmlAgilityPack - ensure CanOverlap and Closed at the same time

How can I define HtmlNode.ElementsFlags["div"] so that div is CanOverlap and must also be Closed?

I have this HTML (correct structure):

<p>
    <div>
        <b>text:</b> 
        <img alt="" src="#" style="BORDER: 0px solid; ">
    </div>
    <div>
        <b>text:</b> 
        <div></div>
        <div></div>
        <p>text</p>
    </div>
</p>

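I need to ensure that all tags are correctly opened and closed, and I am using HtmlAgilityPack to do that. But HtmlAgilityPack is changing my HTML because it does not assume that the tags CanOverlap.

HTML returned by HtmlAgilityPack (wrong structure):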

<p>
   <div>
      <b>text:</b>
      <img alt="" src="#" style="BORDER: 0px solid; " />
   </div>
   <div />
   <b>text:</b>
   <div />
   <div>
        <p>
            text
        </p>
   </div>
</p>

How can I solve this? How can I tell HtmlAgilityPack that the tags CanOverlap and also ensure that they are Closed?

C# code:

if (!HtmlNode.ElementsFlags.ContainsKey("p"))
    HtmlNode.ElementsFlags.Add("p", HtmlElementFlag.Closed);
else
    HtmlNode.ElementsFlags["p"] = HtmlElementFlag.Closed;

if (!HtmlNode.ElementsFlags.ContainsKey("span"))
    HtmlNode.ElementsFlags.Add("span", HtmlElementFlag.Closed);
else
    HtmlNode.ElementsFlags["span"] = HtmlElementFlag.Closed;

if (!HtmlNode.ElementsFlags.ContainsKey("div"))
    HtmlNode.ElementsFlags.Add("div", HtmlElementFlag.Closed);
else
    HtmlNode.ElementsFlags["div"] = HtmlElementFlag.Closed;

var htmlDoc = new HtmlDocument();
htmlDoc.OptionFixNestedTags = true;
htmlDoc.OptionWriteEmptyNodes = true;
htmlDoc.LoadHtml(myHtml);

var htmlError = htmlDoc.ParseErrors.SafeAny();

if (!htmlError)
    myHtml = htmlDoc.DocumentNode.InnerHtml;
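
A note on SafeAny() in the code above: it is not a method of HtmlAgilityPack or of LINQ, so it is presumably a custom extension in the asker's project. Its real implementation is not shown in the question. A minimal sketch of what such a helper might look like, assuming it simply treats a null sequence as empty (the class name EnumerableExtensions is hypothetical):

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: the question does not show the real SafeAny().
// Assumption: it returns false for a null sequence instead of throwing.
public static class EnumerableExtensions
{
    public static bool SafeAny<T>(this IEnumerable<T> source)
    {
        // Any() throws ArgumentNullException on null, so guard first.
        return source != null && source.Any();
    }
}
```

With a helper like this, htmlDoc.ParseErrors.SafeAny() is true only when the parser reported at least one error.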

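Solved! The HtmlNode.ElementsFlags entry for div should be Closed | CanOverlap. The two flags must be combined with bitwise OR; a bitwise AND of two distinct flags evaluates to 0, which sets no flag at all. Like this:

if (!HtmlNode.ElementsFlags.ContainsKey("div"))
    HtmlNode.ElementsFlags.Add("div", HtmlElementFlag.CanOverlap | HtmlElementFlag.Closed);
else
    HtmlNode.ElementsFlags["div"] = HtmlElementFlag.CanOverlap | HtmlElementFlag.Closed;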
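
One thing worth checking in the snippet above is which bitwise operator combines the two flags. The sketch below uses a stand-in enum, ElementFlagSketch, so that it runs without the library; its values are assumed to mirror HtmlAgilityPack's HtmlElementFlag (CData = 1, Empty = 2, Closed = 4, CanOverlap = 8), and the names ElementFlagSketch and FlagDemo are made up for illustration.

```csharp
using System;

// Stand-in for HtmlAgilityPack's HtmlElementFlag.
// Assumption: the values mirror the library's [Flags] enum.
[Flags]
public enum ElementFlagSketch
{
    CData = 1,
    Empty = 2,
    Closed = 4,
    CanOverlap = 8
}

public static class FlagDemo
{
    public static void Main()
    {
        // AND keeps only the bits both operands share; distinct flags share none.
        var anded = ElementFlagSketch.CanOverlap & ElementFlagSketch.Closed;

        // OR keeps the bits of either operand, so both flags are set.
        var ored = ElementFlagSketch.CanOverlap | ElementFlagSketch.Closed;

        Console.WriteLine((int)anded);                                 // 0
        Console.WriteLine((int)ored);                                  // 12
        Console.WriteLine(ored.HasFlag(ElementFlagSketch.Closed));     // True
        Console.WriteLine(ored.HasFlag(ElementFlagSketch.CanOverlap)); // True
    }
}
```

So a value built with & carries neither Closed nor CanOverlap, while a value built with | carries both.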