How to extract titles/text from Google Trends and print them with Selenium and Python
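I want to extract the different titles from each row of this website:
https://trends.google.com/trends/trendingsearches/realtime?geo=AR&category=all
I have tried several times without success. I assumed that searching for the elements by class would give me the text I need: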
from selenium import webdriver
driver = webdriver.Chrome('path to bin')
driver.get('https://trends.google.com/trends/trendingsearches/realtime?geo=AR&category=all')
hrefs = driver.find_elements_by_class_name('title')
print(hrefs)
print(len(hrefs))
driver.quit()
Thanks in advance, everyone!
Joan
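You were very close! find_elements_by_class_name returns a list of WebElement objects, so you just need to read the text of each title. Try this: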
from selenium import webdriver
driver = webdriver.Chrome('path to bin')
driver.get('https://trends.google.com/trends/trendingsearches/realtime?geo=AR&category=all')
titles = driver.find_elements_by_class_name('title')
# Each item is a WebElement; .text holds its visible text
for title in titles:
    print(title.text)
driver.quit()
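The find_elements_by_class_name call above belongs to the older Selenium API and was removed in Selenium 4.3. Below is a minimal sketch of the same loop written with find_elements(By.CLASS_NAME, ...). The function names fetch_titles and titles_from_elements are hypothetical helpers introduced for illustration, and the sketch assumes a recent Selenium 4 release that can locate chromedriver on its own (otherwise the driver must be on PATH).

```python
def titles_from_elements(elements):
    """Return the stripped, non-empty .text of each element-like object."""
    return [el.text.strip() for el in elements if el.text and el.text.strip()]


def fetch_titles(url, class_name="title"):
    """Open url in Chrome and return the text of every element with class_name."""
    # Imported lazily so titles_from_elements can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    # Assumption: chromedriver is resolvable automatically or via PATH.
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return titles_from_elements(driver.find_elements(By.CLASS_NAME, class_name))
    finally:
        # Always close the browser, even if the lookup raises.
        driver.quit()
```

With the URL from the question, `for title in fetch_titles(url): print(title)` would print one line per row, provided the page still marks its rows with the title class.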
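@PixelEinstein's answer will meet your requirement perfectly. As a best practice, however, you should maximize the browser window and use WebDriverWait to wait until the elements are visible before extracting their text, as follows:
Code block: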
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument('disable-infobars')
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get('https://trends.google.com/trends/trendingsearches/realtime?geo=AR&category=all')
# Wait up to 20 seconds for every matching title <div> to become visible
titles = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='title']")))
for title in titles:
    print(title.text)
driver.quit()
Console output:
Mauricio Macri • Cyst • Pancreas
Abortion • National Congress of Argentina • Debate
Abortion • Mayra Mendoza • Argentine Chamber of Deputies • Deputy
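Each printed row joins several related topics with a "•" separator. If the individual topics are needed, one possible sketch is shown below. split_topics and wait_for_title_rows are hypothetical helper names, not part of Selenium. The wait reuses the XPath from the answer above, which may no longer match the live page, and returning an empty list on timeout is a design choice assumed here rather than something from the original answer.

```python
def split_topics(row_text, separator="•"):
    """Split one row such as 'A • B • C' into ['A', 'B', 'C']."""
    return [part.strip() for part in row_text.split(separator) if part.strip()]


def wait_for_title_rows(driver, timeout=20):
    """Wait for the title rows on an already-open driver and split each into topics."""
    # Imported lazily so split_topics can be used without Selenium installed.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    try:
        elements = WebDriverWait(driver, timeout).until(
            EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='title']"))
        )
    except TimeoutException:
        # Nothing became visible in time; report no rows instead of raising.
        return []
    return [split_topics(el.text) for el in elements]
```

Applied to the first line of the console output above, split_topics yields the three topics Mauricio Macri, Cyst and Pancreas as separate strings.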