How to get elements that are outside the parent class

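I am trying to extract some data from a website. However, not all of the information I need is inside the parent class. I can get the information that is inside the parent class.

Question: if the data is outside the parent class, is there a way to get it? Or is there a way to set up the code below so that it extracts the data without using the parent class?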

Link

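I am using IE because it lets me search the website. I have tried several variations of the code, but the extra information is not inside the parent class I am extracting from.

I am looking for the name, the location and the social media links. The location is in a class at the top of the web page.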

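I tried using the following on the parent class shop-home, since all the other classes sit under it, but it did not work. I have never tried to get data that is not inside the parent class, so I am not 100% sure how to go about it. SIM helped with element.ParentNode.ParentNode.getElementsByClassName, because the product url comes before the parent. I have been trying to use it for all the other data outside the parent, but I cannot get it to work. If someone could explain what .ParentNode.ParentNode. is doing, which I do not fully understand, that would help my understanding, and I might be able to work out the rest myself.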

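The code below handles the first two items and works well. The code layout is the same for every item, apart from the If element.getElementsByClassName("CLASS HERE")(0) part. I have tried using ID, Tag, Span and so on, for example If element.getElementsByClassName("CLASS HERE")(0).getelementsByTagName ("Span") (0)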

        Application.ScreenUpdating = False
        Set HTML = objIE.document

''''########## Setting the Parent Class HERE ##########
       Set elements = HTML.getElementsByClassName("v2-listing-card__info") 
         
    ''''Scrolls Down the Browser 
   objIE.document.parentWindow.Scroll 0&, 9999 ' Scrolls Down the Browser
       
    ''''FOR LOOP
        For Each element In elements
''' Element 1
        If element.ParentNode.ParentNode.getElementsByClassName("listing-link")(0) Is Nothing Then 
            wsSheet.Cells(sht.Cells(sht.Rows.Count, "A").End(xlUp).Row + 1, "A").Value = "-" 
        Else
            HtmlText = element.ParentNode.ParentNode.getElementsByClassName("listing-link")(0).href 
            wsSheet.Cells(sht.Cells(sht.Rows.Count, "A").End(xlUp).Row + 1, "A").Value = HtmlText 
        End If
''' Element 2
        If element.getElementsByTagName("h3")(0) Is Nothing Then 
            wsSheet.Cells(sht.Cells(sht.Rows.Count, "B").End(xlUp).Row + 1, "B").Value = "-" 
        Else
            HtmlText = element.getElementsByTagName("h3")(0).innerText ' Get CLASS and Child Nod 'src
            wsSheet.Cells(sht.Cells(sht.Rows.Count, "B").End(xlUp).Row + 1, "B").Value = HtmlText 'return value in column
        End If
''' Element 3

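Results: the data in red is wrong or missing because it is not inside the parent class above.

The delivery cost in column H is fine because it is inside the parent; if there is no delivery information, a hyphen goes into the cell. The items for columns C, D and E are not in the parent class I am using.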

<div class="flex-grow-1">
  <div class="max-width-760px ">


  </div>

  <div class="max-width-676px">
    <div class="">
      <p class="wt-text-heading-02 wt-display-inline" data-inplace-editable-text="story_headline" data-endpoint="AboutPost" data-key="story_headline" data-placeholder="Sum up what you do in one sentence. Or just write something catchy." data-use-inplace-input="1"
        data-add-class="normal story-headline-edit-link"></p>
    </div>
    <div class="">
      <div id="about-story" class="" aria-hidden="false">
        <p class="about-story text-body-larger text-gray-lighter ">
          <span class="mt-xs-1" data-inplace-editable-text="story" data-endpoint="AboutPost" data-key="story" data-placeholder="How did you get started? What inspires you? We know each seller’s story is unique — tell yours here."></span>
        </p>

      </div>
      <div class="wt-text-center-xs">

      </div>
    </div>
  </div>

  <div class="wt-mb-xs-6 wt-mb-md-8">
    <div class="clearfix"></div>

    <div>
      <h3 class="wt-text-title-01"></h3>
      <div class="pt-xs-2 pt-lg-4">
        <div class="display-flex-md flex-wrap max-width-760px">
          <div class="mb-xs-2 text-body mr-md-6">
            <a href="https://www.facebook.com/Lucky-Plum-706715642737271/" class="text-decoration-none clearfix" title="Facebook" target="_blank" rel="nofollow noopener">
              <span class="etsy-icon"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" aria-hidden="true" focusable="false"><path d="M20,5V19a1.007,1.007,0,0,1-1,1H15V13.776h2l0.336-2.3H15V9.659a0.912,0.912,0,0,1,1-1.031h1.5V6.55a11.284,11.284,0,0,0-1.641-.109c-2.2,0-3.3,1.219-3.3,3.039v1.992h-2v2.3h2V20H5a1.007,1.007,0,0,1-1-1V5A1.007,1.007,0,0,1,5,4H19A1.007,1.007,0,0,1,20,5Z"></path></svg></span>
              <span>Facebook</span>
            </a>
          </div>
          <div class="mb-xs-2 text-body mr-md-6">
            <a href="https://www.instagram.com/luckyplumstudio/" class="text-decoration-none clearfix" title="Instagram" target="_blank" rel="nofollow noopener">
              <span class="etsy-icon"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" aria-hidden="true" focusable="false"><path d="M12,5.447c2.136,0,2.389,0.008,3.233,0.047c0.78,0.036,1.204,0.166,1.485,0.275c0.373,0.145,0.64,0.318,0.92,0.598 c0.28,0.28,0.453,0.546,0.598,0.92c0.11,0.282,0.24,0.706,0.275,1.485c0.038,0.844,0.047,1.097,0.047,3.233 s-0.008,2.389-0.047,3.233c-0.036,0.78-0.166,1.204-0.275,1.485c-0.145,0.373-0.318,0.64-0.598,0.92 c-0.28,0.28-0.546,0.453-0.92,0.598c-0.282,0.11-0.706,0.24-1.485,0.275c-0.843,0.038-1.096,0.047-3.233,0.047 s-2.389-0.008-3.233-0.047c-0.78-0.036-1.204-0.166-1.485-0.275c-0.373-0.145-0.64-0.318-0.92-0.598 c-0.28-0.28-0.453-0.546-0.598-0.92c-0.11-0.282-0.24-0.706-0.275-1.485c-0.038-0.844-0.047-1.097-0.047-3.233 S5.45,9.616,5.488,8.773c0.036-0.78,0.166-1.204,0.275-1.485c0.145-0.373,0.318-0.64,0.598-0.92c0.28-0.28,0.546-0.453,0.92-0.598 c0.282-0.11,0.706-0.24,1.485-0.275C9.611,5.455,9.864,5.447,12,5.447 M12,4.005c-2.173,0-2.445,0.009-3.298,0.048 C7.85,4.092,7.269,4.227,6.76,4.425C6.234,4.63,5.787,4.903,5.343,5.348C4.898,5.793,4.624,6.239,4.42,6.765 c-0.198,0.509-0.333,1.09-0.372,1.942C4.009,9.56,4,9.833,4,12.005c0,2.173,0.009,2.445,0.048,3.298 c0.039,0.852,0.174,1.433,0.372,1.942c0.204,0.526,0.478,0.972,0.923,1.417c0.445,0.445,0.891,0.718,1.417,0.923 c0.509,0.198,1.09,0.333,1.942,0.372c0.853,0.039,1.126,0.048,3.298,0.048s2.445-0.009,3.298-0.048 c0.852-0.039,1.433-0.174,1.942-0.372c0.526-0.204,0.972-0.478,1.417-0.923c0.445-0.445,0.718-0.891,0.923-1.417 c0.198-0.509,0.333-1.09,0.372-1.942C19.991,14.45,20,14.178,20,12.005s-0.009-2.445-0.048-3.298 c-0.039-0.852-0.174-1.433-0.372-1.942c-0.204-0.526-0.478-0.972-0.923-1.417c-0.445-0.445-0.891-0.718-1.417-0.923 c-0.509-0.198-1.09-0.333-1.942-0.372C14.445,4.014,14.173,4.005,12,4.005L12,4.005z"></path><path d="M12,7.897c-2.269,0-4.108,1.839-4.108,4.108S9.731,16.113,12,16.113s4.108-1.839,4.108-4.108S14.269,7.897,12,7.897z  
M12,14.672c-1.473,0-2.667-1.194-2.667-2.667S10.527,9.339,12,9.339s2.667,1.194,2.667,2.667S13.473,14.672,12,14.672z"></path><circle cx="16.27" cy="7.735" r="0.96"></circle></svg></span>
              <span>Instagram</span>
            </a>
          </div>
        </div>
      </div>
    </div>
  </div>

  <div class="wt-mb-xs-8 wt-mb-md-10">
    <div class="clearfix"></div>

    <div class="about-section display-flex-md flex-direction-column-md  mb-md-5 pl-xs-0 pr-xs-0" data-region="shop-members" id="shop-members">
      <div class="p-xs-0">
        <h3 class="wt-text-title-01">Shop members</h3>
      </div>
      <div class="pl-xs-0 pr-xs-0  pt-xs-2 pt-lg-4">
        <div class="max-width-760px">
          <ul class="list-unstyled block-grid-md-2" data-region="shop-member-list">
            <li class="pt-xs-2 pb-xs-2 block-grid-item" data-region="shop-member" data-member-id="22676501471" data-member-avatar-url="https://i.etsystatic.com/isc/87253d/22676501471/isc_90x90.22676501471_6w54.jpg?version=0" data-member-bio="" data-member-role="Owner"
              data-member-name="Lucky Plum Studio">
              <div class="flag">
                <div class="flag-img vertical-align-top pr-lg-3">
                  <img src="https://i.etsystatic.com/isc/87253d/22676501471/isc_90x90.22676501471_6w54.jpg?version=0" alt="" class="circle" data-region="member-avatar" width="48" height="48">
                </div>
                <div class="flag-body">
                  <h6 class="mb-xs-0 b text-transform-none text-body" data-region="member-name">Lucky Plum Studio</h6>
                  <p class="prose" data-region="member-role">Owner</p>
                  <p class="text-gray-lighter mb-xs-0" data-region="member-bio">

                  </p>
                </div>
              </div>
            </li>
          </ul>
        </div>
      </div>
    </div>
  </div>

  <div class="">

  </div>
</div>

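Thanks in advance, as always.

''######### Update today 22/3/2021, 6 pm UK time #########

In reply to Qharr's answer: I have this for the location, but it does not collect anything. Could you explain where I am going wrong? I should then be able to work out the rest.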

''' Element 4
DoEvents
          If element.getElementsByClassName("shop-location")(0).getElementsByTagName("Span")(0) Is Nothing Then ' Get CLASS and Child Nod
            wsSheet.Cells(sht.Cells(sht.Rows.Count, "D").End(xlUp).Row + 1, "D").Value = "-" 
        Else
            HtmlText = element.getElementsByClassName("shop-location")(0).getElementsByTagName("Span")(0).innerText 
            wsSheet.Cells(sht.Cells(sht.Rows.Count, "D").End(xlUp).Row + 1, "D").Value = HtmlText 
        End If

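I am not sure what to say other than to read up on HTML, the HTML document methods and CSS selectors, so that you understand the pattern you need to apply. The rest is just practice and learning which methods are the fastest and most robust.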


CSS:

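  1. Location: .shop-location span selects a span that is a descendant of an element with the class shop-location.

  2. Social media links: #about .text-decoration-none selects descendants with the class name text-decoration-none inside the element with the ID about.

  3. Name: [data-region='member-name'] selects the element that has a data-region attribute with the value member-name.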

Read about CSS selectors and descendant combinators here

Practise CSS selectors here

Learn about HTML here


VBA:

Option Explicit
' Requires a reference (Tools > References) to Microsoft Internet Controls.
Public Sub GetInfo()
    Dim ie As SHDocVw.InternetExplorer

    Set ie = New SHDocVw.InternetExplorer

    With ie

        .Visible = True
        .Navigate2 "https://www.etsy.com/uk/shop/LuckyPlumStudio"
        While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend

        With .document
        
            Debug.Print .querySelector(".shop-location span").innerText 'location
            
            Dim i As Long, socialMedias As Object
            
            Set socialMedias = .querySelectorAll("#about .text-decoration-none")
  
            For i = 0 To socialMedias.Length - 1 'media links
                Debug.Print socialMedias.Item(i).href
            Next
            
            Debug.Print .querySelector("[data-region='member-name']").innerText 'company name
            
        End With
        .Quit
    End With

End Sub
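
The Element 4 attempt in the question's update searches from element, that is, only inside the listing card, whereas the selectors above are applied to the whole document. Calling a method on the (0) item of an empty collection is also likely to raise an error before the Is Nothing test is ever reached. As a minimal sketch, assuming a reference to the Microsoft HTML Object Library has been added, a small helper can do the lookup on the document and fall back to a hyphen. TextOrDefault and its parameter names are hypothetical, not part of the code above.

```vba
Option Explicit

' Hypothetical helper (an assumption, not from the original answer).
' Assumes a reference to Microsoft HTML Object Library (MSHTML).
' Returns the innerText of the first node in doc that matches cssSelector,
' or fallback when no node matches.
Public Function TextOrDefault(ByVal doc As MSHTML.HTMLDocument, _
                              ByVal cssSelector As String, _
                              Optional ByVal fallback As String = "-") As String
    Dim node As Object

    ' querySelector searches the whole document, not just one listing card
    Set node = doc.querySelector(cssSelector)

    If node Is Nothing Then
        TextOrDefault = fallback             ' nothing matched: write the placeholder
    Else
        TextOrDefault = Trim$(node.innerText)
    End If
End Function
```

Inside the original loop this might be used as wsSheet.Cells(nextRow, "D").Value = TextOrDefault(objIE.document, ".shop-location span"), where nextRow stands for whatever row the loop is currently writing to.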

A less optimal selection method:

Option Explicit
' Requires a reference (Tools > References) to Microsoft Internet Controls.

Public Sub GetInfo()
    Dim ie As SHDocVw.InternetExplorer

    Set ie = New SHDocVw.InternetExplorer

    With ie

        .Visible = True
        .Navigate2 "https://www.etsy.com/uk/shop/LuckyPlumStudio"
        While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend

        With .document
        
            Debug.Print .getElementsByClassName("shop-location wt-display-flex-xs")(0).getElementsByTagName("span")(0).innerText 'location
            
            Dim i As Object, socialMedias As Object
            
            Set socialMedias = .getElementById("about").getElementsByClassName("text-decoration-none clearfix")
  
            For Each i In socialMedias           'media links
                Debug.Print i.href
            Next
            
            Debug.Print .getElementById("about").getElementsByClassName("flag")(0).getElementsByTagName("h6")(0).innerText 'company name
            
        End With
        .Quit
    End With

End Sub
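
On the question of what .ParentNode.ParentNode. is doing: each .parentNode moves one level up the DOM tree, so two in a row reach the grandparent of the starting element. getElementsByClassName called on that grandparent then searches everything beneath it, which is why the listing-link anchor can be found even though it is not inside v2-listing-card__info itself. The sketch below is a hypothetical diagnostic helper (ShowAncestors is not from the question or the answer) that prints what sits at each level, assuming the element passed in is at least two element levels below the html element, as a listing card is.

```vba
Option Explicit

' Hypothetical diagnostic helper (an assumption for illustration only).
' Prints the tag name and class of a node, its parent and its grandparent
' to the Immediate window (Ctrl+G in the VBA editor).
Public Sub ShowAncestors(ByVal card As Object)
    Dim parentEl As Object, grandparentEl As Object

    Set parentEl = card.parentNode            ' one level up the DOM tree
    Set grandparentEl = parentEl.parentNode   ' two levels up, same node as card.parentNode.parentNode

    Debug.Print "card:        "; card.tagName; " / "; card.className
    Debug.Print "parent:      "; parentEl.tagName; " / "; parentEl.className
    Debug.Print "grandparent: "; grandparentEl.tagName; " / "; grandparentEl.className
End Sub
```

Calling ShowAncestors element inside the For Each loop shows which ancestor wraps both the card and the link. Data that lives in a different part of the page, such as the location or the social media links, is not under any of these ancestors and is easier to reach from the document, as in the code above.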