Get HTML element by attribute

I am using AutoIt to parse HTML. I want to get all HTML elements that have a given attribute value. Example:

<div data-source="xxx">The div content XXX</div>
<div data-source="zzz">The div content of ZZZ</div>

The div element with the attribute-value pair data-source="xxx" should be selected.

You can try something like this with a RegExp:

#include <Array.au3>
$Data = '<div data-source="xxx">The div content XXX</div>' & @CRLF & _
'<div data-source="zzz">The div content XXX</div>'
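MsgBox(0, "", $Data)
; Flag 3 returns all matches as an array; @error is set when nothing matches.
Local $array = StringRegExp($Data, "(\s.*=\x22\w.*\x22)", 3)
If @error Then Exit
_ArrayDisplay($array)
; UBound() returns the element count, so the last valid index is UBound() - 1.
For $i = 0 To UBound($array) - 1
    MsgBox(0, "", $array[$i])
Next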

Here is another example that shows how to read the contents of an HTML file and display the extracted data:

#include <Array.au3>
#include <FileConstants.au3>
Local Const $sFilePath = "Example.html"
; Open the file for reading and store the handle in a variable.
Local $hFileOpen = FileOpen($sFilePath, $FO_READ)
; Reads the contents of the file using the handle returned by FileOpen.
Local $sFileRead = FileRead($hFileOpen)
; Closes the handle returned by FileOpen.
FileClose($hFileOpen)
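$Data = $sFileRead
; Flag 3 returns all matches as an array; @error is set when nothing matches.
Local $array = StringRegExp($Data, "(\s.*=\x22\w.*\x22)", 3)
If @error Then Exit
_ArrayDisplay($array)
; UBound() returns the element count, so the last valid index is UBound() - 1.
For $i = 0 To UBound($array) - 1
    MsgBox(0, "", $array[$i])
Next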
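
Note that the pattern above returns every attribute="value" pair it finds, not only the div with data-source="xxx". A minimal sketch that narrows the match to that one attribute value and captures the div content could look like the following. It assumes the attribute is written with double quotes and that the matching divs are not nested; the variable names are just for illustration, and a regex is not a real HTML parser.

```autoit
; Sample input taken from the question.
Local $sHtml = '<div data-source="xxx">The div content XXX</div>' & @CRLF & _
               '<div data-source="zzz">The div content of ZZZ</div>'

; (?s) lets . match newlines; (.*?) lazily captures the content up to the next </div>.
Local $sPattern = '(?s)<div[^>]*\sdata-source="xxx"[^>]*>(.*?)</div>'

; Flag 3 returns the captured group of every match as an array.
Local $aMatches = StringRegExp($sHtml, $sPattern, 3)
If @error Then
    ConsoleWrite("No div with data-source=xxx found" & @CRLF)
Else
    For $i = 0 To UBound($aMatches) - 1
        ConsoleWrite($aMatches[$i] & @CRLF) ; writes: The div content XXX
    Next
EndIf
```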

Try this?

$ohtml = ObjCreate('HTMLFILE')
$ohtml.body.innerHTML = '<div data-source="xxx">The div content XXX</div>' & @CRLF & _
                        '<div data-source="zzz">The div content of ZZZ</div>'    

Dim $selected_node
For $div in $ohtml.body.getElementsByTagName("div")
    If $div.getAttribute("data-source") = 'xxx' Then
        $selected_node = $div
        ExitLoop
    EndIf
Next

; $selected_node is only an object if a matching div was found.
If IsObj($selected_node) Then ConsoleWrite($selected_node.innerHTML & @CRLF)
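
The loop above stops at the first match. Since the question asks for all matching elements, here is a rough variation of the same HTMLFILE approach that drops ExitLoop and reports every matching div. The counter and variable names are my own, and this is a sketch rather than a tested solution.

```autoit
; Same HTMLFILE document as above, filled with the sample markup from the question.
Local $oHtml = ObjCreate("HTMLFILE")
$oHtml.body.innerHTML = '<div data-source="xxx">The div content XXX</div>' & @CRLF & _
                        '<div data-source="zzz">The div content of ZZZ</div>'

Local $iCount = 0
For $oDiv In $oHtml.body.getElementsByTagName("div")
    ; == compares strings case-sensitively; a plain = would also accept "XXX".
    If $oDiv.getAttribute("data-source") == "xxx" Then
        $iCount += 1
        ConsoleWrite($oDiv.innerHTML & @CRLF)
    EndIf
Next
ConsoleWrite("Matched " & $iCount & " element(s)" & @CRLF)
```

If you need the elements themselves rather than their content, you could store each $oDiv in an array inside the If block instead of writing it out.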