What are the currently supported CSS selectors available to VBA?

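Back on 19 May 2021, I wrote a Q&A about the recent (April-May 2021) suspected interface changes relating to mshtml.dll and late-bound references. This is part 2, if you will.

Previously, in earlier questions, I have remarked upon the lack of support for various CSS selectors with mshtml.dll, in particular regarding pseudo-classes. In those questions, I highlighted that nth-child() and nth-of-type() were not implemented for MSHTML.

Typically, as shown here, unsupported selector syntax leads to: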

Run-time error '-2140143604 (8070000c)': Could not complete the operation due to error 8070000c.
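
One way to check whether a given selector triggers this error on a particular machine is to trap it. The following is a minimal sketch rather than anything from the original question: the function name IsSelectorSupported and the test markup are made up for illustration, and it assumes an early-bound reference to Microsoft HTML Object Library.

```vba
Option Explicit

'' Required references:
''   Microsoft HTML Object Library

' Hypothetical helper: True if querySelector accepts the selector
' without raising a run-time error on this machine.
Public Function IsSelectorSupported(ByVal selector As String) As Boolean
    Dim html As MSHTML.HTMLDocument
    Dim node As Object

    Set html = New MSHTML.HTMLDocument
    ' Made-up markup; the selector only needs something to be parsed against
    html.body.innerHTML = "<div><p>one</p><p>two</p></div>"

    On Error Resume Next                      ' also clears any earlier Err state
    Set node = html.querySelector(selector)   ' Nothing (no error) when nothing matches
    IsSelectorSupported = (Err.Number = 0)
    On Error GoTo 0
End Function

Public Sub DemoSelectorProbe()
    Debug.Print "p", IsSelectorSupported("p")
    Debug.Print "p:nth-of-type(2)", IsSelectorSupported("p:nth-of-type(2)")
End Sub
```

What the second line prints depends on the installed mshtml.dll, which is the point of the probe.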

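I expected some things to break as support ends for various versions/platforms of Internet Explorer (IE), to which MSHTML is tied (see my earlier Q&A). What I was not expecting to find was a recent improvement in the supported CSS selectors. For example: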

Option Explicit

'' Required references:
''   Microsoft HTML Object Library

Public Sub CssTest()

    Const URL As String = "https://books.toscrape.com/"
    Dim html As MSHTML.HTMLDocument

    Set html = New MSHTML.HTMLDocument

    ' Fetch the page synchronously and load the response into the document
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", URL, False
        .send
        html.body.innerHTML = .responseText
    End With

    ' Raises a run-time error on set-ups where :nth-of-type is not implemented
    Debug.Print html.querySelector("meta:nth-of-type(2)").outerHTML

End Sub

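Prior to April-May 2021, this would have errored because of the unimplemented syntax. Now, on my set-up, where I see an update to mshtml.dll in early May (the most recent), I get the same result I would get by running it through an automated Internet Explorer instance, where it was already supported: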

<meta name="created" content="24th Jun 2016 09:29">

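So, what are the currently supported CSS selectors available to VBA?

I covered 'why do we care?' in my previous Q&A, so I won't repeat it here. I will, however, restate my set-up:

My set-up: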

OS Name Microsoft Windows 10 Pro
Version 10.0.19042 Build 19042
System Type x64-based PC
Microsoft® Excel® 2019 MSO (16.0.13929.20206) 32-bit (Microsoft Office Professional Plus)
Version 2104 Build 13929.20373
mshtml.dll file  11.00.19041.985
ieframe.dll file 11.0.19041.964

Feedback:

As with the previous Q&A, I would appreciate any feedback from set-ups that do/do not see these changes. I will feed this back in for others' reference.

tl;dr;

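There is greater support for CSS selectors and for Element.querySelector (allowing more flexible chaining of querySelector(All) calls). This greatly enhances the MSHTML class, putting it on a par with Selenium Basic in terms of CSS selectors.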

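Motivation:

For some time I have wanted to write a list of supported selectors, given the lack of VBA-related documentation and the trial-and-error nature of learning what does and doesn't work. This latest change prompted me to do so, and to cover the libraries that currently support using CSS selectors.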


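N.B.

  1. This is not exhaustive, but it is comprehensive.
  2. If you find any errors, particularly regarding Selenium Basic, which I wrote from memory, please let me know and I will edit accordingly.
  3. The recent changes, indicated by shaded cells in the summary table (JSFiddle) and marked ✔* in the simplified table below, relate to my set-up at this point in time. Your mileage may vary, e.g. CSS selectors are not supported at all below IE8.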

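Before and after:

Traditionally, the CSS selector landscape in VBA looked as follows, with respect to the libraries supporting them:

Selenium implemented by far the most CSS selectors.

Current state:

I believe the current state of implemented selectors is as follows (apologies for the image quality, even when you click to enlarge the table; see the JSFiddle for the clearest view of the table):

I have also inserted it as simplified HTML so that you can click the hyperlinks. Please click Run code snippet below the code insert, then the Full page link. Apologies, the table is large, and I haven't even covered all possible selectors, only the main ones I think are likely to be used often. Inserting a fancy table took me over the body character limit, so here we are. For the nicer table, please see the JSFiddle, where newly supported selectors are shaded.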

<!DOCTYPE html>
<html>
<head>
    <title>VBA: Valid CSS Selectors 2021-05-30</title>
</head>
<body>
    <h1>VBA: Valid CSS Selectors 2021-05-30</h1>
    <table>
        <tr>
            <td colspan="2">
                <a href="https://drafts.csswg.org/selectors-3/">Selectors Level 3 Specification</a>
            </td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
        <tr>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
        <tr>
            <td>Pattern</td>
            <td>Represents</td>
            <td>Description</td>
            <td>Level</td>
            <td>Microsoft HTML Object Library (MSHTML)</td>
            <td>Microsoft Internet Explorer Controls (SHDocVw)</td>
            <td>Selenium Type Library (Selenium)</td>
            <td>Remarks</td>
        </tr>
        <tr>
            <td>*</td>
            <td>any element</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#universal-selector">Universal selector</a>
            </td>
            <td>2</td>
            <td>✔</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E</td>
            <td>an element of type E</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#type-selectors">Type selector</a>
            </td>
            <td>1</td>
            <td>✔</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E[foo]</td>
            <td>an E element with a "foo" attribute</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
            </td>
            <td>2</td>
            <td>✔</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E[foo="bar"]</td>
            <td>an E element whose "foo" attribute value is exactly equal to "bar"</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
            </td>
            <td>2</td>
            <td>✔</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E[foo~="bar"]</td>
            <td>an E element whose "foo" attribute value is a list of whitespace-separated values, one of which is exactly equal to "bar"</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
            </td>
            <td>2</td>
            <td>✔</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E[foo^="bar"]</td>
            <td>an E element whose "foo" attribute value begins exactly with the string "bar"</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
            </td>
            <td>3</td>
            <td>✔</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E[foo$="bar"]</td>
            <td>an E element whose "foo" attribute value ends exactly with the string "bar"</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
            </td>
            <td>3</td>
            <td>✔</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E[foo*="bar"]</td>
            <td>an E element whose "foo" attribute value contains the substring "bar"</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
            </td>
            <td>3</td>
            <td>✔</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E[foo|="en"]</td>
            <td>an E element whose "foo" attribute has a hyphen-separated list of values beginning (from the left) with "en"</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
            </td>
            <td>2</td>
            <td>x</td>
            <td>x</td>
            <td>x</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E[attr operator value i]</td>
            <td>value compared case-insensitively (ASCII range).</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
            </td>
            <td>4</td>
            <td>x</td>
            <td>x</td>
            <td>?</td>
            <td>
                <a href="https://www.w3.org/TR/selectors-4/#attribute-case">i identifier</a>
            </td>
        </tr>
        <tr>
            <td>E[attr operator value s]</td>
            <td>value compared case-sensitively (ASCII range).</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#attribute-selectors">Attribute selectors</a>
            </td>
            <td>4</td>
            <td>x</td>
            <td>x</td>
            <td>x</td>
            <td>
                <a href="https://www.w3.org/TR/selectors-4/#attribute-case">s identifier</a>
            </td>
        </tr>
        <tr>
            <td>E:root</td>
            <td>an E element, root of the document</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
            </td>
            <td>3</td>
            <td>✔</td>
            <td>✔</td>
            <td>✔</td>
            <td>HTML node only</td>
        </tr>
        <tr>
            <td>E:nth-child(n)</td>
            <td>an E element, the n-th child of its parent</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
            </td>
            <td>3</td>
            <td>✔*</td>
            <td>✔</td>
            <td>✔</td>
            <td>nth-child(odd) and (even) as well as nth-child(range) also supported</td>
        </tr>
        <tr>
            <td>E:nth-last-child(n)</td>
            <td>an E element, the n-th child of its parent, counting from the last one</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
            </td>
            <td>3</td>
            <td>✔*</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E:nth-of-type(n)</td>
            <td>an E element, the n-th sibling of its type</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
            </td>
            <td>3</td>
            <td>✔*</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E:nth-last-of-type(n)</td>
            <td>an E element, the n-th sibling of its type, counting from the last one</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
            </td>
            <td>3</td>
            <td>✔*</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E:first-child</td>
            <td>an E element, first child of its parent</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
            </td>
            <td>2</td>
            <td>✔</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E:last-child</td>
            <td>an E element, last child of its parent</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
            </td>
            <td>3</td>
            <td>✔</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E:first-of-type</td>
            <td>an E element, first sibling of its type</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
            </td>
            <td>3</td>
            <td>✔*</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E:last-of-type</td>
            <td>an E element, last sibling of its type</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
            </td>
            <td>3</td>
            <td>✔*</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E:only-child</td>
            <td>an E element, only child of its parent</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
            </td>
            <td>3</td>
            <td>✔*</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E:only-of-type</td>
            <td>an E element, only sibling of its type</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
            </td>
            <td>3</td>
            <td>✔*</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E:empty</td>
            <td>an E element that has no children (including text nodes)</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#structural-pseudos">Structural pseudo-classes</a>
            </td>
            <td>3</td>
            <td>✔*</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E:link</td>
            <td rowspan="2">an E element being the source anchor of a hyperlink of which the target is not yet visited (:link) or already visited (:visited)</td>
            <td rowspan="2">
                <a href="https://drafts.csswg.org/selectors-3/#link">The link pseudo-classes</a>
            </td>
            <td>1</td>
            <td>✔*</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E:visited</td>
            <td>1</td>
            <td>✔*</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E:not(s)</td>
            <td>an E element that does not match simple selector s</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#negation">Negation pseudo-class</a>
            </td>
            <td>3</td>
            <td>✔*</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E F</td>
            <td>an F element descendant of an E element</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#descendant-combinators">Descendant combinator</a>
            </td>
            <td>1</td>
            <td>✔</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E &gt; F</td>
            <td>an F element child of an E element</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#child-combinators">Child combinator</a>
            </td>
            <td>2</td>
            <td>✔</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E + F</td>
            <td>an F element immediately preceded by an E element</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#adjacent-sibling-combinators">Next-sibling combinator</a>
            </td>
            <td>2</td>
            <td>✔</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>E ~ F</td>
            <td>an F element preceded by an E element</td>
            <td>
                <a href="https://drafts.csswg.org/selectors-3/#general-sibling-combinators">Subsequent-sibling combinator</a>
            </td>
            <td>3</td>
            <td>✔</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td>foo, bar</td>
            <td>foo, bar&nbsp;will match both&nbsp;&lt;foo&gt;&nbsp;and&nbsp;&lt;bar&gt;&nbsp;elements.</td>
            <td>
                <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/Selector_list">Selector list</a>
            </td>
            <td>1</td>
            <td>✔</td>
            <td>✔</td>
            <td>✔</td>
            <td>&nbsp;</td>
        </tr>
        <tr>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
        <tr>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
        <tr>
            <td>element.querySelector</td>
            <td>Expanded element.querySelector</td>
            <td>
                <a href="https://developer.mozilla.org/en-US/docs/Web/API/Element/querySelector">Element.querySelector</a>
            </td>
            <td>API</td>
            <td>✔</td>
            <td>✔</td>
            <td>✔</td>
            <td>Can now chain querySelector(All) calls on wider base node range</td>
        </tr>
        <tr>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
        <tr>
            <td>Lib info:</td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
        <tr>
            <td></td>
            <td>Microsoft HTML Object Library (MSHTML)</td>
            <td>MS Internet Explorer Controls (SHDocVw)</td>
            <td>Selenium Type Library (Chromedriver)</td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
        <tr>
            <td>Lib</td>
            <td>mshtml.dll</td>
            <td>ieframe.dll</td>
            <td>selenium.dll</td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
        <tr>
            <td>File Version</td>
            <td>11.00.19041.985</td>
            <td>11.0.19041.964</td>
            <td>2.0.9.0</td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
        <tr>
            <td>Date</td>
            <td>2021-05-12</td>
            <td>2021-05-12</td>
            <td>2016-03-02</td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
        <tr>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
        </tr>
    </table>
</body>
</html>


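12 newly supported pseudo-classes and an expanded Element.querySelector:

If you run the snippet above and view it full page, you will see there are now at least 12 newly supported pseudo-classes, as well as the aforementioned expanded Element.querySelector. Bam, kapow, ker-sploosh, shut the proverbial front door... welcome to the VBA CSS Canaan, the scraper's Shangri-La, the nerd's nirvana!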
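
To check the 12 selectors marked ✔* on another machine, something along these lines could be used. It is a rough sketch only: the procedure name, the HTML fragment and the exact selector strings are made up for illustration, and it assumes a reference to Microsoft HTML Object Library. It reports only whether each selector is accepted, not whether the match is correct.

```vba
Option Explicit

'' Required references:
''   Microsoft HTML Object Library

Public Sub ListNewPseudoClassSupport()
    Dim html As MSHTML.HTMLDocument
    Dim selectors As Variant
    Dim node As Object
    Dim i As Long

    Set html = New MSHTML.HTMLDocument
    ' Made-up fragment to run the selectors against
    html.body.innerHTML = "<ul><li><a href=""#"">a</a></li><li></li><li>c</li></ul>"

    ' One example per pseudo-class marked as newly supported in the table
    selectors = Array("li:nth-child(2)", "li:nth-last-child(1)", _
                      "li:nth-of-type(2)", "li:nth-last-of-type(1)", _
                      "li:first-of-type", "li:last-of-type", _
                      "a:only-child", "a:only-of-type", "li:empty", _
                      "a:link", "a:visited", "li:not(.x)")

    For i = LBound(selectors) To UBound(selectors)
        On Error Resume Next                  ' clears Err for each iteration
        Set node = html.querySelector(selectors(i))
        If Err.Number = 0 Then
            Debug.Print selectors(i), "accepted"
        Else
            Debug.Print selectors(i), "error " & Hex$(Err.Number)
        End If
        On Error GoTo 0
    Next i
End Sub
```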

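I think there may also be interesting updates to ieframe.dll; the focus here is on the recent mshtml.dll changes. You may wish to review IE support under the lifecycle announcements here and here, or search for Lifecycle FAQ - Internet Explorer and Microsoft Edge.

The benefit of the expanded Element.querySelector() was not covered in the last Q&A, so I will mention it briefly here. By expanded, I mean that the number of elements on which querySelector can be called has increased, so that you can chain .querySelector(..).querySelector(..).querySelector(..).querySelectorAll(..).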
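
As a rough illustration of that kind of chaining, here is a sketch under a few assumptions: the markup and names are invented, a reference to Microsoft HTML Object Library is set, and the machine has an mshtml.dll on which element-level querySelector is available. Intermediate nodes are held in Object variables so that each call is resolved at run time; that is a choice made for this sketch, not something prescribed above.

```vba
Option Explicit

'' Required references:
''   Microsoft HTML Object Library

Public Sub ChainedQuerySelectorSketch()
    Dim html As MSHTML.HTMLDocument
    Dim wrapper As Object, list As Object, firstItem As Object

    Set html = New MSHTML.HTMLDocument
    ' Made-up markup for illustration
    html.body.innerHTML = "<div id=""wrap""><ul class=""list""><li>one</li><li>two</li></ul></div>"

    ' Each call narrows the search to the node returned by the previous one
    Set wrapper = html.querySelector("#wrap")
    Set list = wrapper.querySelector("ul.list")
    Set firstItem = list.querySelector("li")

    If Not firstItem Is Nothing Then Debug.Print firstItem.innerText
End Sub
```

On a set-up without the expanded support, the element-level calls would be expected to fail, which makes this a quick check as well as an example.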

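Previously, this was largely not possible, as shown in this question. Typically, the workaround was to chain traditional methods onto the returned node, e.g. html.querySelector("body").getElementsByTagName("li"); this led to ugly, hard-to-follow chaining and limited the paths available to target elements. Better, IMHO, was the idea of a surrogate MSHTML.HTMLDocument variable that carries the innerHTML of the current node returned by querySelector, allowing you to call querySelector(All) again and so gain faster matching, cleaner syntax and greater versatility. There are many examples of this approach here.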
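
The surrogate-document idea can be sketched roughly as follows. The variable names (html2, items) and the markup are invented for illustration, and a reference to Microsoft HTML Object Library is assumed.

```vba
Option Explicit

'' Required references:
''   Microsoft HTML Object Library

Public Sub SurrogateDocumentSketch()
    Dim html As MSHTML.HTMLDocument
    Dim html2 As MSHTML.HTMLDocument   ' surrogate document
    Dim items As Object
    Dim i As Long

    Set html = New MSHTML.HTMLDocument
    Set html2 = New MSHTML.HTMLDocument

    ' Made-up markup: two lists, only one of them inside #wrap
    html.body.innerHTML = "<div id=""wrap""><ul><li>one</li><li>two</li></ul></div>" & _
                          "<ul><li>other</li></ul>"

    ' Copy the matched node's markup into the surrogate...
    html2.body.innerHTML = html.querySelector("#wrap").innerHTML

    ' ...so document-level querySelectorAll can be applied to just that subtree
    Set items = html2.querySelectorAll("li")

    For i = 0 To items.Length - 1
        Debug.Print items.Item(i).innerText
    Next i
End Sub
```

Only document-level calls are used here, which is why this pattern was usable before element-level querySelector became more widely available.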


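End notes:

This is a document under revision. All feedback on improvements is welcome.


Thanks:

Finally, many thanks to @SIM for running my test script to check it on a different set-up.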