Can't grab the first image link out of an array of image links

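I'm trying to figure out a way to grab an image from a webpage using XMLHTTP requests in VBA. After digging around, I noticed that I can reach the images through the data-lazy-srcset attribute. However, that attribute yields a comma-separated list of image links rather than a single one. What I want is to capture the first image link from that list.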

Sub GetImage()
    Const Url = "https://rasamalaysia.com/grilled-honey-cajun-shrimp/"
    Dim Http As Object, Html As HTMLDocument, oImage As Object
    
    Set Html = New HTMLDocument
    Set Http = CreateObject("MSXML2.XMLHTTP")

    With Http
        .Open "Get", Url, False
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36"
        .send
        Html.body.innerHTML = .responseText
    End With
    
    Set oImage = Html.querySelectorAll("p > img")
    Debug.Print oImage(0).getAttribute("data-lazy-srcset")
End Sub

Current output:

https://rasamalaysia.com/wp-content/uploads/2021/06/honey-cajun-grilled-shrimp3.jpg 1200w, https://rasamalaysia.com/wp-content/uploads/2021/06/honey-cajun-grilled-shrimp3-200x300.jpg 200w, https://rasamalaysia.com/wp-content/uploads/2021/06/honey-cajun-grilled-shrimp3-300x450.jpg 300w, https://rasamalaysia.com/wp-content/uploads/2021/06/honey-cajun-grilled-shrimp3-768x1152.jpg 768w, https://rasamalaysia.com/wp-content/uploads/2021/06/honey-cajun-grilled-shrimp3-1024x1536.jpg 1024w

Expected output (the first link only):

https://rasamalaysia.com/wp-content/uploads/2021/06/honey-cajun-grilled-shrimp3.jpg

How can I scrape the first image link out of an array of image links?

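You've described the problem well; on the surface, at least, it looks like a simple array-indexing problem.

Split the string on spaces into an array and take the first element.

Add this to the declarations at the top: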

Dim varArray as Variant

Then add these lines:

' Split into an array using blank spaces as delimiter
varArray = Split(oImage(0).getAttribute("data-lazy-srcset"), " ")
' This should return your first image
Debug.Print varArray(0)
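
Splitting on spaces works here because the first token of the attribute value happens to be the bare URL. If you would rather not depend on that, a slightly more defensive sketch is to split on the comma first and then drop the width descriptor. FirstSrcsetUrl is a hypothetical helper name, not something from the original code, and it assumes the URLs themselves contain no commas or spaces:

```vba
' Hypothetical helper: returns the URL of the first candidate in a
' srcset-style string such as "a.jpg 1200w, b.jpg 200w".
' Returns an empty string if the attribute is missing (Null) or blank.
Function FirstSrcsetUrl(ByVal srcset As Variant) As String
    Dim candidates() As String
    Dim firstCandidate As String

    ' getAttribute yields Null when the attribute does not exist
    If IsNull(srcset) Then Exit Function
    If Len(Trim$(CStr(srcset))) = 0 Then Exit Function

    ' Each candidate is "<url> <descriptor>", separated by commas
    candidates = Split(CStr(srcset), ",")
    firstCandidate = Trim$(candidates(0))

    ' Keep only the URL part, dropping a descriptor like "1200w"
    FirstSrcsetUrl = Split(firstCandidate, " ")(0)
End Function
```

You would then call it as Debug.Print FirstSrcsetUrl(oImage(0).getAttribute("data-lazy-srcset")).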

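There is a more efficient, faster approach. Simply select the element by its size-full class; for that element you can read the full-size image URL straight from the data-pin-media attribute, with no string splitting required: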

Option Explicit

Sub GetImage()
    Const Url = "https://rasamalaysia.com/grilled-honey-cajun-shrimp/"
    Dim Http As Object, Html As HTMLDocument

    Set Html = New HTMLDocument
    Set Http = CreateObject("MSXML2.XMLHTTP")

    With Http
        .Open "Get", Url, False
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36"
        .send
        Html.body.innerHTML = .responseText
    End With
    
    Debug.Print Html.querySelector(".size-full").getAttribute("data-pin-media")

End Sub
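
One thing to watch: if the page markup changes and nothing matches .size-full, querySelector returns Nothing and the Debug.Print line raises a runtime error. A guarded variant might look like the sketch below. GetFullSizeImageUrl is a hypothetical name, and the code assumes the same reference to the Microsoft HTML Object Library that the procedure above already needs:

```vba
' Hypothetical helper: returns the data-pin-media URL of the first
' element with class "size-full", or an empty string if the element
' or the attribute is not present.
Function GetFullSizeImageUrl(ByVal Html As HTMLDocument) As String
    Dim img As Object
    Dim media As Variant

    Set img = Html.querySelector(".size-full")
    If img Is Nothing Then Exit Function       ' no matching element

    media = img.getAttribute("data-pin-media")
    If IsNull(media) Then Exit Function        ' attribute missing

    GetFullSizeImageUrl = CStr(media)
End Function
```

Inside GetImage you would replace the Debug.Print line with Debug.Print GetFullSizeImageUrl(Html) and treat an empty result as "image not found".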