How to scrape questions and answers from Google People Also Ask with Selenium and Python, beyond the number Google shows by default?

Question

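I found a good solution, but it only handles the number of questions and answers that Google shows by default, and I need more than that.

I am new to Python development. How can I get more questions and answers? Do I first have to implement clicks to reveal the required number of questions and only then parse them?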

Answer 1

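The following code parses the questions currently shown on the page, then asks whether you want to parse more. If you enter y, it clicks the last question so that more questions are loaded into the page. The questions are stored in the list questions and the answers in the list answers.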

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service

your_path = '...'  # path to the chromedriver executable
driver = webdriver.Chrome(service=Service(your_path))

driver.get('https://www.google.com/search?q=How%20to%20make%20bakery%3F&source=hp&ei=j0aZYYjRAvja2roPrcWcyAU&iflsig=ALs-wAMAAAAAYZlUn4NMUPjfIpQmrXSmjIDnaWjJXWIJ&ved=0ahUKEwjI1JDn0Kf0AhV4rVYBHa0iB1kQ4dUDCAc&uact=5&oq=How%20to%20make%20bakery%3F&gs_lcp=Cgdnd3Mtd2l6EAMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBNQAFgAYJMDaABwAHgAgAF-iAF-kgEDMC4xmAEAoAECoAEB&sclient=gws-wiz')

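questions, answers = [], []
while True:
    # Re-query on every pass: clicking a question makes Google append more of them.
    elements = driver.find_elements(By.CSS_SELECTOR, "div[id*='RELATED_QUESTION']")
    if not elements:
        print('No People Also Ask questions found on the page.')
        break
    for question in elements[len(questions):]:  # skip already parsed questions
        questions.append(question.text)
        # Read innerText because Selenium's .text only returns visible text.
        txt = ''
        for answer in question.find_elements(By.CSS_SELECTOR, "div[id*='WEB_ANSWERS_RESULT']"):
            txt += answer.get_attribute('innerText')
        answers.append(txt)
    inp = input(f'{len(questions)} questions parsed, continue? (y/n) ')
    if inp == 'y':
        elements[-1].click()  # expanding the last question loads more questions
        time.sleep(2)         # fixed pause while the new questions render
    else:
        break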
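
If you would rather not type y each time, the same idea can be wrapped in a function that keeps clicking until a target number of questions is reached. This is only a sketch, not a tested scraper: collect_paa, target, timeout and poll are names made up for this example, the two selectors are the ones from the code above (Google may change them at any time), and the string "css selector" is used in place of By.CSS_SELECTOR, which has that same value. Instead of a fixed time.sleep(2), it polls until the number of questions grows and gives up after timeout seconds. An answer may still be empty if Google has not filled it in by the time the question is read.

```python
import time

CSS = "css selector"  # the string value behind selenium's By.CSS_SELECTOR
QUESTION_SEL = "div[id*='RELATED_QUESTION']"   # selector taken from the answer above
ANSWER_SEL = "div[id*='WEB_ANSWERS_RESULT']"   # selector taken from the answer above


def collect_paa(driver, target, timeout=5.0, poll=0.25):
    """Collect up to `target` (question, answer) pairs from a loaded results page.

    `driver` is a Selenium WebDriver that has already opened the search page.
    Returns early if clicking the last question loads nothing new within
    `timeout` seconds.
    """
    pairs = []
    while True:
        elements = driver.find_elements(CSS, QUESTION_SEL)
        for question in elements[len(pairs):]:  # only the newly loaded questions
            answer = "".join(
                a.get_attribute("innerText") or ""
                for a in question.find_elements(CSS, ANSWER_SEL)
            )
            pairs.append((question.text, answer))
        if len(pairs) >= target or not elements:
            return pairs[:target]
        elements[-1].click()  # expanding the last question makes more appear
        deadline = time.monotonic() + timeout
        # Poll until more question blocks exist than we have already parsed.
        while len(driver.find_elements(CSS, QUESTION_SEL)) <= len(pairs):
            if time.monotonic() > deadline:
                return pairs  # nothing new was loaded, so stop here
            time.sleep(poll)
```

With the driver created above and the search page already opened with driver.get(...), calling collect_paa(driver, target=20) would return a list of (question, answer) tuples, which keeps each answer next to its question instead of in two separate lists.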

python

selenium

parsing