How can I scrape the text from a located element using Selenium and Python?
I am trying to run the following code:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import time
options = Options()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
driver = webdriver.Chrome(options=options)
driver.get('https://theunderminejournal.com/#eu/draenor/battlepet/1155')
time.sleep(20) #bypass cloudflare
price = driver.find_element_by_xpath('//*[@id="battlepet-page"]/div[1]/table/tr[3]/td/span')
print (price)
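The goal is to scrape the "Current Price" from the page. However, this XPath location does not return the text value (I also tried the "text" variant at the end, with no success).
Thanks in advance for your reply.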
You should wait for the element to be visible before getting its text. See WebDriverWait in the example below:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
driver = webdriver.Chrome(options=options)
wait = WebDriverWait(driver, 20)
driver.get('https://theunderminejournal.com/#eu/draenor/battlepet/1155')
current_price = wait.until(ec.visibility_of_element_located((By.CSS_SELECTOR, ".current-price .price"))).text
print(current_price)
First, use WebDriverWait to wait for the element instead of sleeping.
Second, your locator does not find the element.
Try this:
driver.get('https://theunderminejournal.com/#eu/draenor/battlepet/1155')
price = WebDriverWait(driver,30).until(EC.visibility_of_element_located((By.XPATH,"//div[@id='battlepet-page']/div/table/tr[@class='current-price']/td/span")))
print(price.text)
To use the wait, import the following:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
To get the value of the Current Price from the webpage, you need to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following locator strategies:
Using CSS_SELECTOR:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "tr.current-price td>span"))).text)
Using XPATH:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//th[text()='Current Price']//following::td[1]/span"))).text)
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC