How to pull the image and title of a product from Amazon?

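I am trying to build a product list based on Amazon's unique product codes.

For example: https://www.amazon.in/gp/product/B00F2GPN36

where B00F2GPN36 is the unique code.

I want to pull the product's image and title into an Excel list, under the columns Product Image and Product Name.

I have tried html.getElementsById("productTitle") and html.getElementsByTagName.

I am also unsure what type of variable to declare to store this information; I have tried declaring it as Object and as HTMLHtmlElement.

I tried to fetch the HTML document and search it for the data.

Code: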

Enum READYSTATE
     READYSTATE_UNINITIALIZED = 0
     READYSTATE_LOADING = 1
     READYSTATE_LOADED = 2
     READYSTATE_INTERACTIVE = 3
     READYSTATE_COMPLETE = 4
End Enum

Sub parsehtml()

     Dim ie As InternetExplorer
     Dim topics As Object
     Dim html As HTMLDocument

     Set ie = New InternetExplorer
     ie.Visible = False
     ie.navigate "https://www.amazon.in/gp/product/B00F2GPN36"

     Do While ie.READYSTATE <> READYSTATE_COMPLETE
       Application.StatusBar = "Trying to go to Amazon.in...."
       DoEvents    
     Loop

     Application.StatusBar = ""
     Set html = ie.document
     Set topics = html.getElementsById("productTitle")
     Sheets(1).Cells(1, 1).Value = topics.innerText
     Set ie = Nothing

End Sub

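I expect the output in cell A1 to be:
"Milton Thermosteel Carafe Flask, 2 litres, Silver" (without the quotes). I would also like to pull the image in the same way.

But I always get some error, such as:
1. Run-time error '13': Type mismatch, when I use "Dim topics As HTMLHtmlElement"
2. Run-time error '438': Object doesn't support this property or method

Note: I have added the required libraries via Tools > References.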

There is no such thing as html.getElementsById("productTitle") in VBA. An ID is always unique, so it should be html.getElementById("productTitle"). Run the following script to get both values:

Sub ParseHtml()
    Dim IE As New InternetExplorer, elem As Object
    Dim Html As HTMLDocument, imgs As Object

    With IE
        .Visible = False
        .navigate "https://www.amazon.in/gp/product/B00F2GPN36"
        While .Busy Or .readyState < 4: DoEvents: Wend
        Set Html = .document
    End With

    Set elem = Html.getElementById("productTitle")
    Set imgs = Html.getElementById("landingImage")

    Sheets(1).Cells(1, 1) = elem.innerText
    Sheets(1).Cells(1, 1).Offset(0, 1) = imgs.getAttribute("data-old-hires")
    IE.Quit 'close the hidden browser so no orphaned process is left running
End Sub
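
On the question of which type to declare: as far as I know, HTMLHtmlElement is the class for the page's root <html> element, so assigning the title node to it is a likely cause of the Type mismatch. A minimal sketch, assuming the Microsoft HTML Object Library is referenced and an already loaded HTMLDocument is passed in; the name WriteTitle and the Nothing check are my own additions for illustration:

Sub WriteTitle(ByVal Html As HTMLDocument)
    'IHTMLElement is the generic element interface; As Object also works (late bound)
    Dim elem As IHTMLElement
    Set elem = Html.getElementById("productTitle")

    'getElementById gives Nothing when no element has that id,
    'for instance if a different page than expected was served
    If elem Is Nothing Then
        Sheets(1).Cells(1, 1) = "productTitle not found"
    Else
        Sheets(1).Cells(1, 1) = Trim$(elem.innerText) 'strip surrounding spaces
    End If
End Sub

Checking for Nothing first avoids a run-time error on .innerText when the lookup fails.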

A faster approach is to use XHR, which avoids the browser altogether, and to write the results out from an array to the sheet:

Option Explicit
Public Sub GetInfo()
    Dim html As HTMLDocument, results()
    Set html = New HTMLDocument
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://www.amazon.in/gp/product/B00F2GPN36", False
        .send
        html.body.innerHTML = .responseText
        With html
            results = Array(.querySelector("#productTitle").innerText, .querySelector("#landingImage").getAttribute("data-old-hires"))
        End With
    End With
    With ThisWorkbook.Worksheets("Sheet1")
        .Cells(1, 1) = results(0)
        Dim file As String
        file = DownloadFile("C:\Users\User\Desktop\", results(1))  'your path to download file
        'Shapes.AddPicture with LinkToFile:=msoFalse embeds the image, so the
        'temporary file can be deleted afterwards (Pictures.Insert may only link to it)
        With .Shapes.AddPicture(file, msoFalse, msoTrue, .Cells(1, 2).Left, .Cells(1, 2).Top, 75, 100)
            .Placement = 1 'xlMoveAndSize
        End With
    End With
    Kill file
End Sub
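
GetInfo calls DownloadFile, which is not shown above. Here is a minimal sketch of one possible helper, assuming it takes a folder path ending in a backslash plus the image URL and returns the full path of the saved file. The signature is inferred from the call site, and the use of WinHttp.WinHttpRequest.5.1 with ADODB.Stream is my own choice, not necessarily the original helper:

Public Function DownloadFile(ByVal downloadFolder As String, ByVal downloadURL As String) As String
    Dim http As Object, fileName As String

    'Fetch the image bytes (late bound, so no extra reference is needed)
    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.Open "GET", downloadURL, False
    http.send
    If http.Status <> 200 Then Exit Function 'returns "" on failure

    'Assumption: the last URL segment is usable as a file name
    fileName = Mid$(downloadURL, InStrRev(downloadURL, "/") + 1)

    With CreateObject("ADODB.Stream")
        .Open
        .Type = 1 'adTypeBinary
        .Write http.responseBody
        .SaveToFile downloadFolder & fileName, 2 'adSaveCreateOverWrite
        .Close
    End With

    DownloadFile = downloadFolder & fileName
End Function

If the function returns an empty string, the caller should skip the picture insert and the Kill call rather than pass an empty path to them.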