IndexError: list index out of range / AttributeError: 'list' object has no attribute 'get_attribute'

Question

I am trying to scrape likes/views from Instagram, but I get the following errors:

IndexError: list index out of range, when I try viewcth[0].get_attribute('innerHTML')

AttributeError: 'list' object has no attribute 'get_attribute', when I try viewct = viewcth.get_attribute('innerHTML')

Code:

viewcth = bdy.find_elements_by_xpath(".//*[@class='eo2As ']//*[@class='EDfFK ygqzn']//*[@class='Nm9Fw']")
if (len(viewcth) != 0):
    viewct = viewcth[0].get_attribute('innerText')
else:
    viewcth = bdy.find_elements_by_xpath(".//*[@class='eo2As ']//*[@class='HbPOm _9Ytll']//*[@class='vcOH2']")
    # viewct = viewcth.get_attribute('innerHTML')
    viewct = viewcth[0].get_attribute('innerHTML')
pagedict['viewcount'] = viewct
print("Viewct is " + viewct)

Answer 1

Note that viewcth is a Python list.

And you have stated:

AttributeError: 'list' object has no attribute 'get_attribute', when I try viewct = viewcth.get_attribute('innerHTML')

So you cannot call get_attribute() on a list. In your code, however, I see that you are not using viewcth.get_attribute('innerHTML') but rather

viewcth[0].get_attribute('innerHTML')

which I think is correct, as long as the list is not empty.
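
The point above can be sketched as a small helper that indexes the list only when it has something in it. The name first_attribute and its default parameter are made up for illustration; the only Selenium call it relies on is get_attribute() on a single element.

```python
def first_attribute(elements, name, default=None):
    """Return attribute `name` of the first element, or `default` if the list is empty.

    `elements` is expected to be the list returned by a find_elements_* call.
    first_attribute is a hypothetical helper, not part of Selenium.
    """
    if not elements:
        # An empty list means the locator matched nothing; indexing it
        # with [0] would raise IndexError.
        return default
    # get_attribute() belongs to a single web element, not to the list.
    return elements[0].get_attribute(name)
```

With this, viewct = first_attribute(viewcth, 'innerHTML') gives None instead of raising when the locator matches nothing.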

I used a counter to track the total count.

Sample code:

counter = 0
viewcth = driver.find_elements_by_xpath(".//*[@class='eo2As ']//*[@class='EDfFK ygqzn']//*[@class='Nm9Fw']")
if viewcth:
    abc = viewcth[0].get_attribute('innerHTML')
    counter = counter + 1
else:
    # Fall back to the second locator; it may also match nothing.
    viewcth2 = driver.find_elements_by_xpath(".//*[@class='eo2As ']//*[@class='HbPOm _9Ytll']//*[@class='vcOH2']")
    if viewcth2:
        cde = viewcth2[0].get_attribute('innerHTML')
        counter = counter + 1

print("total view is", counter)

Update 1:

You can try the XPath below.

To get only the number:

//textarea/ancestor::section/preceding-sibling::section[1]/descendant::span

Code:

wait = WebDriverWait(driver, 10)
print(wait.until(EC.element_to_be_clickable((By.XPATH, "//textarea/ancestor::section/preceding-sibling::section[1]/descendant::span"))).text)

Imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Answer 2

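You are using the wrong locator.
That is why viewcth is an empty list.
find_elements_by_xpath(".//*[@class='eo2As ']//*[@class='EDfFK ygqzn']//*[@class='Nm9Fw']") finds no matching elements and returns an empty list.
So when you try to get the first element of that list with viewcth[0], you get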

IndexError: list index out of range.

If you try viewct = viewcth.get_attribute('innerHTML'), you get

AttributeError: 'list' object has no attribute 'get_attribute'

because viewcth is a list. Empty, but still a list.
So you cannot apply .get_attribute('innerHTML') to a list; it is not a web element.
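
Both errors can be reproduced without a browser, since what find_elements_by_xpath returns on no match is an ordinary empty Python list. A minimal sketch, using a plain [] as a stand-in for that result (the variable names are only for illustration):

```python
# Stand-in for the result of find_elements_by_xpath when nothing matches.
elements = []

try:
    elements[0]  # there is no first item in an empty list
except IndexError as exc:
    index_error = str(exc)

try:
    elements.get_attribute("innerHTML")  # lists have no such method
except AttributeError as exc:
    attribute_error = str(exc)

print("IndexError:", index_error)
print("AttributeError:", attribute_error)
```
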
If you want to get the like count, try this.
For images:

likes = bdy.find_element_by_xpath(".//a[@class='zV_Nj']/span").text

For videos:

likes = bdy.find_element_by_xpath(".//div[@class='Nm9Fw']/a").text

or

likes = bdy.find_element_by_xpath(".//div[@class='HbPOm _9Ytll']/span").text
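
Since the right locator depends on whether the post is an image or a video, one option is to try the locators above in order and take the first that matches. This is only a sketch: first_text_from_xpaths and LIKE_XPATHS are hypothetical names, and the class names are copied from this answer, so they may no longer match Instagram's current markup.

```python
# Locators taken from the answer above; the class names may be out of date.
LIKE_XPATHS = [
    ".//a[@class='zV_Nj']/span",           # images
    ".//div[@class='Nm9Fw']/a",            # videos
    ".//div[@class='HbPOm _9Ytll']/span",  # videos, alternative layout
]


def first_text_from_xpaths(root, xpaths):
    """Return the text of the first element matched by any XPath, else None.

    `root` is a driver or web element exposing find_elements_by_xpath, the
    Selenium 3 style call used in this thread. With Selenium 4 the
    equivalent call is root.find_elements(By.XPATH, xpath).
    """
    for xpath in xpaths:
        # find_elements_* returns an empty list instead of raising.
        matches = root.find_elements_by_xpath(xpath)
        if matches:
            return matches[0].text
    return None
```

Usage would be likes = first_text_from_xpaths(bdy, LIKE_XPATHS), followed by a check for None before using the value.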

python

selenium

web-scraping

instagram

selenium-webdriver