Extracting multiple text values using partial href information
I am trying to extract multiple genres from the following website (I already know the URL):
https://www.discogs.com/master/1515454-Zedd-Katy-Perry-365
<div class="profile">
<h1 id="profile_title" class="hide_mobile has_action_menu">...</h1>
<div class="head">Genre:</div>
<div class="content">
<a href="/genre/electronic">Electronic</a>
", "
<a href="/genre/pop">Pop</a>
Here is my Python code:
genre = None
try:
genre = driver.find_element_by_xpath("[contains(concat(' ', @class, ' '), ' profile ')]//*[contains(@href, ' /genre/* '").text
How can I extract the genres as text (e.g. Electronic, Pop)?
To extract and print the genre values, i.e. Electronic, Pop, etc., from the website, you need to induce WebDriverWait for visibility_of_all_elements_located(), and you can use the following locator strategy:
Using XPATH:
driver.get("https://www.discogs.com/master/1515454-Zedd-Katy-Perry-365")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button#onetrust-accept-btn-handler"))).click()
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//tr/th[@scope='row' and contains(., 'Genre')]//following::td[1]//a")))])
Console output:
['Electronic', 'Pop']
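A variant closer to the partial-href idea in the question may also work. The XPath in the question fails for several reasons: it has no leading `//*`, the `*` in `' /genre/* '` is treated as a literal character inside `contains()` (as are the surrounding spaces), and the closing bracket is missing. The sketch below assumes the page has already loaded and still links genres through URLs containing `/genre/`. The helper name `extract_genres` and the constant `GENRE_LINKS_XPATH` are hypothetical. It passes the string `"xpath"` (the value of `By.XPATH`) so that the snippet needs no imports, and it uses `find_elements`, since the `find_element_by_*` helpers are deprecated in Selenium 4.

```python
# Hypothetical helper: collects the text of every link whose href
# contains "/genre/". Assumes `driver` is a Selenium WebDriver that has
# already navigated to the page and that the links are rendered.
GENRE_LINKS_XPATH = "//a[contains(@href, '/genre/')]"


def extract_genres(driver):
    """Return the visible text of all genre links, skipping empty ones."""
    # "xpath" is the literal value of selenium's By.XPATH constant.
    links = driver.find_elements("xpath", GENRE_LINKS_XPATH)
    texts = (link.text.strip() for link in links)
    return [text for text in texts if text]
```

Called as `extract_genres(driver)` after the cookie banner has been dismissed, this should return a list of genre names if the assumption about the link URLs holds. Note that `find_elements` (plural) returns every match, whereas `find_element` would stop at the first genre.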
Note: You have to add the following imports for the WebDriverWait code above:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
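The partial-href matching can also be checked offline against the markup from the question, without a browser. This is only a sketch: it swaps Selenium for the standard-library `html.parser`, uses a tidied copy of the question's snippet (`SNIPPET`), and the names `GenreLinkParser` and `genres_from_html` are hypothetical.

```python
from html.parser import HTMLParser

# Tidied copy of the markup shown in the question (closing tags added).
SNIPPET = """
<div class="profile">
  <div class="head">Genre:</div>
  <div class="content">
    <a href="/genre/electronic">Electronic</a>,
    <a href="/genre/pop">Pop</a>
  </div>
</div>
"""


class GenreLinkParser(HTMLParser):
    """Collect the text of every <a> whose href contains '/genre/'."""

    def __init__(self):
        super().__init__()
        self.genres = []
        self._in_genre_link = False

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href") or ""
        if tag == "a" and "/genre/" in href:
            self._in_genre_link = True
            self.genres.append("")  # start a new genre entry

    def handle_data(self, data):
        if self._in_genre_link:
            self.genres[-1] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_genre_link = False


def genres_from_html(html):
    """Return the stripped text of all genre links found in `html`."""
    parser = GenreLinkParser()
    parser.feed(html)
    parser.close()
    return [genre.strip() for genre in parser.genres]


print(genres_from_html(SNIPPET))  # ['Electronic', 'Pop']
```

This only confirms that matching on a partial href selects the intended links in the snippet. The live page is rendered in a browser and may use different markup, which is why the Selenium approach with an explicit wait is still needed there.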