VBA .innerText with getElements*: cannot locate areas in HTML code

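I am trying to create a macro that pulls the inner text from an internal web page. I am not sure how to correctly target the place where the text sits, and I would appreciate some guidance and, if possible, some explanation of the approach.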

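I have tried several variations of getelementsby/tagname/classname, all to no avail. I am not sure I understand the logic behind targeting an area after using the Inspect feature.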

Var = ie.document.getelementClassName("sections").getElementsByTagName("table").Item(0).innerText

'also tried
Var = ie.document.getelementClassName("sections").getElementsByTagName("table").Item(1).getElementsByTagName("tr").Item(2).getElementsByTagName("td").Item(0).innerText

Var = ie.document.getelementTagName("section").getElementsByTagName("table").Item(1).getElementsByTagName("tr").Item(2).getElementsByTagName("td").Item(0).innerText


ActiveCell.Offset(0, 1).Value = Var
<html class=" js flexbox canvas canvastext webgl no-touch geolocation postmessage websqldatabase indexeddb hashchange history draganddrop websockets rgba hsla multiplebgs backgroundsize borderimage borderradius boxshadow textshadow opacity cssanimations csscolumns cssgradients cssreflections csstransforms csstransforms3d csstransitions fontface no-generatedcontent video audio localstorage sessionstorage webworkers no-applicationcache svg inlinesvg smil svgclippaths" lang="" style=""><!--<![endif]--><head>
    <meta charset="utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
    <title>NTC Tracking</title>
    <meta name="description" content="">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <link rel="apple-touch-icon" href="apple-touch-icon.png">

    <link rel="stylesheet" href="/Content/bootstrap.min.css">
    <!--        <link rel="stylesheet" href="~/Content/bootstrap-theme.min.css">-->
    <!--For Plugins external css-->
    <link rel="stylesheet" href="/Content/plugins.css">



    <!--Theme custom css -->
    <link rel="stylesheet" href="/Content/style.css">

    <!--Theme Responsive css-->
    <link rel="stylesheet" href="/Content/responsive.css">

    <script src="/Scripts/vendor/modernizr-2.8.3-respond-1.4.2.min.js"></script>


</head>
<body data-spy="scroll" data-target="#main-navbar">
    <!--[if lt IE 8]>
        <p class="browserupgrade">You are using an <strong>outdated</strong> browser. Please <a href="http://browsehappy.com/">upgrade your browser</a> to improve your experience.</p>
    <![endif]-->

    <div class="preloader" style="display: none;"><div class="loaded" style="display: none;">&nbsp;</div></div>
    <div id="menubar" class="main-menu">
        <nav class="navbar-default navbar-fixed-top" style="background-color:#ffc038; padding:20px;">
            <div class="container">
                <!-- Brand and toggle get grouped for better mobile display -->
                <div class="navbar-header">
                    <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#bs-example-navbar-collapse-1" aria-expanded="false">
                        <span class="sr-only">Toggle navigation</span>
                        <span class="icon-bar"></span>
                        <span class="icon-bar"></span>
                        <span class="icon-bar"></span>
                    </button>
                    <a class="" href="http://10.102.18.162/"><img src="/images/msjlogo.png" style="max-width:50%; margin-top:-20px;"></a>

                </div>

                <!-- Collect the nav links, forms, and other content for toggling -->
                <div class="collapse navbar-collapse" id="bs-example-navbar-collapse-1">
                    <ul class="nav navbar-nav navbar-right">
                        <li><a href="#"><span style="font-weight:100; font-size:10px;">Proxy Plus(5/22/2019 8:20:00 AM) | IQ(5/22/2019 8:31:00 AM) | Vendors(5/21/2019 2:24:00 PM) | USPS(5/22/2019 8:43:00 AM) | International(5/21/2019 2:24:00 PM)</span></a></li>



                    </ul>
                </div><!-- /.navbar-collapse -->
                <div>
                    <a class="navbar-brand" href="/"><h3>NTC Tracking</h3></a>
                </div>
                <div style="clear:both; margin-bottom:20px;"></div>
                <div>

            </div>
            </div><!-- /.container-fluid -->
        </nav>
    </div>
    <!--Home page style-->
    <header id="home" class="sections">

    </header>

    <!-- Sections -->




<header id="home">
    <div class="container">
        <h2 align="center">Search Job</h2>
        <div class="col-md-6 col-md-offset-3 col-sm-6 col-xs-12">
            <p align="center">Description of this view, testing space and top bar at the same time</p>
            <p align="center">There are some hiding fields due to the web space, if you want to see them click on export</p>
            <p align="center">Description of this view, testing space and top bar at the same time</p>
            <p align="center">Description of this view, testing space and top bar at the same time</p>
        </div>

    </div>
    <br>
</header>
<section class="sections">
    <div class="portfolio">
        <div align="center" class="portfolio-item">
            <h5 align="center">Job Number </h5><input id="PPNumber" name="PPNumber" type="text" value="P23315"><br>
            <a onclick="submitdata();" href="#" class="btn btn-primary">Search </a>
            <br><br>

                <div>

                    <p></p>
                </div>
                <div style="float:left"><h3 align="left">JOB</h3></div>
                <table class="table" style="font-size:11px;">
                    <tbody><tr>
                        <th>
                            Job #
                        </th>
                        <th width="20%">
                            Job Name
                        </th>
                        <th>
                            MeetingDate
                        </th>
                        <th>
                            DropDate
                        </th>
                        <th>
                            NTCMailDate
                        </th>
                        <th>
                            LI#
                        </th>
                        <th>
                            Total Pieces
                        </th>
                        <th width="5%">
                            Day 40 On
                        </th>
                        <th width="5%">
                            Logistics Processed
                        </th>
                        <th width="5%">
                            IQ Status
                        </th>
                        <th>
                            MustMail Comments
                        </th>
                        <th>
                            Total Batch Completed
                        </th>
                        <th>
                            Actual Status
                        </th>
                        <th>
                            Options
                        </th>

                    </tr>
                        <tr>
                            <td>
<a href="/Report/Batchdetail/P23315-010" target="_blank">P23315-010</a>                            </td>
                            <td width="20%">
                                ATLANTICA YIELD PLC      <----****I NEED THIS****               
                            </td>
                            <td>
                                6/20/2019
                            </td>
                            <td>
                                5/13/2019
                            </td>
                            <td>
                                5/13/2019
                            </td>
                            <td>
                                LI-8154090
                            </td>
                            <td>
                                2200
                            </td>
                            <td width="5%">
                                5/11/2019
                            </td>
                            <td width="5%">
                                4386
                            </td>
                            <td width="5%">
                                Mailed
                            </td>
                            <td>
                                MUST MAIL 5/14
                            </td>
                            <td>
                                11 out of 11
                            </td>
                                <td>
                                    Foreign Client
                                </td>
                                                                                <td>
                                                <a class="btn btn-default" href="/Report/Reopenjob?jobnumber=P23315&amp;jobref=P23315-010">Reopen Job</a>
                                            </td>
                        </tr>
                </tbody></table>
                <br>
                 <br>
         </div>
    </div>
</section>
<script>

function submitdata(){


    var valtext = $("#PPNumber").val();//you can do also by  getelementbyid
    window.location.href = '/Report/Search/' + valtext;

}
function ShowMessage() {
    var result = prompt("Please insert a comment if required.", "");
    if (result == null) {
        return false; //break out of the function early
    }
    document.getElementById('comments').value = result;
    return true;
}

</script>






    <div class="scroll-top">

        <div class="scrollup">
            <i class="fa fa-angle-double-up"></i>
        </div>

    </div>

    <!--Footer-->
    <footer id="footer" class="footer">
        <div class="container">

            <div class="row">


                <div class="socio-copyright">

                    <div class="social">

                    </div>

                    <p>Made by Broadridge 2017. All rights reserved.</p>
                </div>

            </div>
        </div>

    </footer>
    <script src="/Scripts/vendor/bootstrap.min.js"></script>

    <script src="/Scripts/vendor/jquery-1.11.2.min.js"></script>
    <script src="/Scripts/plugins.js"></script>
    <script src="/Scripts/main.js"></script>



</body></html>

I get error 438 (Object doesn't support this property or method).

Here is the general logic for accessing the cells of an HTML table:

Sub test()
    'Requires a reference to Microsoft HTML Object Library (VBE > Tools > References)
    Dim sht As Worksheet
    Dim doc As New HTMLDocument
    Dim targetTable As HTMLTable
    Set sht = ThisWorkbook.Worksheets("Sheet1")
    doc.body.innerHTML = sht.Range("M1") 'I just stored the html code in cell M1 as a string for the sake of demonstration

    Set targetTable = doc.getElementsByClassName("table")(0) 'Get the first element from a collection of elements whose class name is "table"
    Debug.Print targetTable.Rows(0).Cells(0).innerText 'Get the first row from the collection of rows that belong to the table and the first cell from the collection of cells that belong to this row.
End Sub

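The code above prints Job # in the Immediate window. That is the inner text of the first cell of the first row (i.e. the header of the first column). You can get the rest of the values in the same way.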

Things to keep in mind:

  1. doc.getElementsByClassName("table") is a collection of elements whose class name is "table".
  2. The same applies to .getElementsByTagName.
  3. The first item in a collection has index 0.
  4. You can use a For Each loop to iterate over all the elements in a collection.
  5. .getelementClassName is wrong; no such method exists.
  6. doc.getElementsByClassName("table")(0).getElementsByTagName("td")(0).innerText is correct.
  7. You can access an item in a collection either as doc.getElementsByClassName("Something").Item(0) or as doc.getElementsByClassName("Something")(0).
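
The For Each point above can be sketched roughly as follows. This is only an illustration: it assumes a reference to Microsoft HTML Object Library, that the page HTML is stored as a string in cell M1 of Sheet1 (as in the example above), and the procedure name ListCells is made up for this sketch.

```vba
'Sketch only: ListCells is a hypothetical name. Assumes a reference to
'Microsoft HTML Object Library and the page HTML stored in Sheet1!M1.
Sub ListCells()
    Dim doc As New HTMLDocument
    Dim cell As Object

    doc.body.innerHTML = ThisWorkbook.Worksheets("Sheet1").Range("M1").Value

    'getElementsByTagName returns a collection; For Each visits every td in it
    For Each cell In doc.getElementsByClassName("table")(0).getElementsByTagName("td")
        Debug.Print Trim$(cell.innerText)
    Next cell
End Sub
```

Each td of the first element with class "table" is printed to the Immediate window, one per line.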

The error:

The methods are

getElementsByClassName 

getElementsByTagName

These return collections, which you then index into, e.g.

ie.document.getElementsByClassName("className")(0)  'first element

Making these changes should resolve your initial error.
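
Applied to the third attempt in the question, the corrected method names might look like the sketch below. It assumes that ie is an InternetExplorer object which has finished loading the page shown in the question, that the page contains a single section element with a single table, and that the wanted Job Name sits in the second td of the second row. ReadJobName is a hypothetical name.

```vba
'Sketch only: "ie" is assumed to be an InternetExplorer object that has
'already finished loading the page shown in the question.
Sub ReadJobName(ByVal ie As Object)
    Dim sectionEl As Object, tbl As Object, cell As Object

    'Every getElementsBy* call returns a collection, so each one is indexed
    Set sectionEl = ie.document.getElementsByTagName("section").Item(0)
    Set tbl = sectionEl.getElementsByTagName("table").Item(0)

    'tr Item(1) is the data row (Item(0) is the header row);
    'td Item(1) is the second cell of that row
    Set cell = tbl.getElementsByTagName("tr").Item(1).getElementsByTagName("td").Item(1)

    ActiveCell.Offset(0, 1).Value = Trim$(cell.innerText)
End Sub
```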


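Targeting a specific row and column:

If you are automating with IE, you can use nth-of-type, i.e. tr:nth-of-type(rowNumberHere) td:nth-of-type(columnNumberHere)

I think you are after the second row, first column, so I would use a CSS selector. (In the HTML posted, the cell marked as needed, ATLANTICA YIELD PLC, is the second td of that row; td:nth-of-type(2) would target it instead.)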

ie.document.querySelector(".table tr:nth-of-type(2) td:nth-of-type(1)").innerText

Modern browsers are optimized for CSS selectors, so this should be an efficient approach.
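
The row and column numbers can also be passed in as parameters. The following is a minimal sketch, not a definitive implementation: CellText is a hypothetical helper, and doc is assumed to be an already loaded document (for example ie.document) that contains a table with class "table".

```vba
'Sketch only: CellText is a hypothetical helper. "doc" is assumed to be a
'loaded document (for example ie.document) with a table of class "table".
Function CellText(ByVal doc As Object, ByVal rowNumber As Long, ByVal columnNumber As Long) As String
    Dim selector As String, node As Object

    'nth-of-type counts from 1, unlike the 0-based collection indexes
    selector = ".table tr:nth-of-type(" & rowNumber & ") td:nth-of-type(" & columnNumber & ")"

    Set node = doc.querySelector(selector)

    'Return an empty string when no cell matches the selector
    If Not node Is Nothing Then CellText = Trim$(node.innerText)
End Function
```

With the page from the question loaded, a call such as CellText(ie.document, 2, 2) should target the Job Name cell of the data row.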


The whole table:

A simple way to copy the whole table is to use the clipboard

Option Explicit
'Requires a reference to Microsoft Internet Controls (VBE > Tools > References)
Public Sub GetInfo()
    Dim ie As New InternetExplorer, url As String, ws As Worksheet
    Dim t As Single, clipboard As Object, hTable As Object
    url = "url" 'placeholder for the page address
    Const MAX_WAIT_SEC As Long = 10

    Set ws = ThisWorkbook.Worksheets("Sheet1")
    'Late-bound MSForms DataObject, used to reach the clipboard
    Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")

    With ie
        .Visible = True
        .Navigate2 url

        While .Busy Or .readyState < 4: DoEvents: Wend

        With .document
            'Poll for the table until it appears or the timeout elapses
            t = Timer
            Do
                On Error Resume Next
                Set hTable = .querySelector(".table")
                On Error GoTo 0
                If Timer - t > MAX_WAIT_SEC Then Exit Do
            Loop While hTable Is Nothing
        End With

        If Not hTable Is Nothing Then
            clipboard.SetText hTable.outerHTML
            clipboard.PutInClipboard
            ws.Range("A1").PasteSpecial
        End If
        .Quit 'close IE whether or not the table was found
    End With
End Sub

Looping over the rows and columns of a table:

If you want to loop over the rows and columns of a table and write them out, see
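
As a rough sketch of that kind of loop (not taken from the referenced answer), the Rows and Cells collections used earlier can be walked by index. WriteTable is a hypothetical name; hTable is assumed to be an HTML table element, such as the result of ie.document.querySelector(".table"), and ws the destination worksheet.

```vba
'Sketch only: WriteTable is a hypothetical name. hTable is assumed to be an
'HTML table element and ws the worksheet that receives the values.
Sub WriteTable(ByVal hTable As Object, ByVal ws As Worksheet)
    Dim r As Long, c As Long

    'Rows and Cells are 0-based collections; worksheet cells are 1-based
    For r = 0 To hTable.Rows.Length - 1
        For c = 0 To hTable.Rows(r).Cells.Length - 1
            ws.Cells(r + 1, c + 1).Value = Trim$(hTable.Rows(r).Cells(c).innerText)
        Next c
    Next r
End Sub
```

Unlike the clipboard approach, this writes plain text only, so links and formatting from the page are not carried over.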