Extracting multiple text values using partial href information

I am trying to extract multiple genres from the following website (I already know the URL): https://www.discogs.com/master/1515454-Zedd-Katy-Perry-365

<div class="profile">
  <h1 id="profile_title" class="hide_mobile has_action_menu">...</h1>
  <div class="head">Genre:</div>
  <div class="content">
    <a href="/genre/electronic">Electronic</a>
    ", "
    <a href="/genre/pop">Pop</a>


This is my Python code:

genre = None
try:
  genre = driver.find_element_by_xpath("[contains(concat(' ', @class, ' '), ' profile ')]//*[contains(@href, ' /genre/* '").text

How can I extract the genres as text (e.g. Electronic, Pop)?

To extract and print the genre values, i.e. Electronic, Pop, etc., from the website, you need to induce WebDriverWait for visibility_of_all_elements_located(). You can use the following locator strategy:

  • Using XPATH:

    driver.get("https://www.discogs.com/master/1515454-Zedd-Katy-Perry-365")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button#onetrust-accept-btn-handler"))).click()
    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//tr/th[@scope='row' and contains(., 'Genre')]//following::td[1]//a")))])
    
  • Console output:

    ['Electronic', 'Pop']
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
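
Since the question asks about matching on a partial href, here is a minimal offline sketch of that idea using only the standard library, so it can be tried without a browser. It is not part of the Selenium solution above. The class name GenreLinkParser, the sample_html string (adapted from the markup in the question), and the extra /artist/12345 link are all hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser


class GenreLinkParser(HTMLParser):
    """Collect the text of every <a> whose href contains a given fragment."""

    def __init__(self, href_fragment="/genre/"):
        super().__init__()
        self.href_fragment = href_fragment
        self.genres = []
        self._inside_match = False  # True while inside a matching <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            # Partial match: the href only needs to contain the fragment.
            self._inside_match = self.href_fragment in href

    def handle_endtag(self, tag):
        if tag == "a":
            self._inside_match = False

    def handle_data(self, data):
        # Keep non-empty text that appears inside a matching link.
        if self._inside_match and data.strip():
            self.genres.append(data.strip())


# Hypothetical sample based on the snippet in the question; the
# /artist/12345 link is made up to show that non-genre links are skipped.
sample_html = """
<div class="profile">
  <div class="head">Genre:</div>
  <div class="content">
    <a href="/genre/electronic">Electronic</a>,
    <a href="/genre/pop">Pop</a>
  </div>
  <a href="/artist/12345">Not a genre</a>
</div>
"""

parser = GenreLinkParser()
parser.feed(sample_html)
print(parser.genres)  # ['Electronic', 'Pop']
```

In Selenium, the same partial match could be written as the XPath //a[contains(@href, '/genre/')] or the CSS selector a[href*='/genre/']. Whether either one matches only the genre links on the live page is an assumption that has not been checked here.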