How can I print more links from the HTML DOM using Python Selenium?
HTML:
<div class="xxxx">
<a href="ooooo.pdf"></a>
</div>
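Python Selenium code trials:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
# Assumption: driver is an existing WebDriver instance with the page already loaded;
# the original post does not show how wait was created, so the 10 s timeout is a guess.
wait = WebDriverWait(driver, 10)
print(wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.xxxx a"))).get_attribute('href'))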
Output:
aaaaa.pdf
How can I print both ooooo.pdf and aaaaa.pdf?
I want to print more links. How do I do that?
element_to_be_clickable() returns a single WebElement, so only the href attribute of the first matching element is printed.
Solution
To extract all the href attribute values, you need to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following locator strategies:
Using CSS_SELECTOR:
print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.xxxx a")))])
Using XPATH:
print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='xxxx']//a")))])
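If you need this in more than one place, the one-liners above can be wrapped in a small function. The following is only a sketch: the names hrefs_of and wait_for_hrefs are made up for illustration, the 10-second default timeout is an assumption, and returning an empty list on timeout is one possible design choice rather than anything Selenium prescribes.

```python
def hrefs_of(elements):
    """Return the href of each element, skipping elements that have none."""
    hrefs = (element.get_attribute("href") for element in elements)
    return [href for href in hrefs if href]


def wait_for_hrefs(driver, css_selector, timeout=10):
    """Wait until all matching links are visible, then return their hrefs.

    Hypothetical helper: returns [] if the wait times out.
    """
    # Selenium is imported here so that hrefs_of() stays usable on its own.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    try:
        elements = WebDriverWait(driver, timeout).until(
            EC.visibility_of_all_elements_located((By.CSS_SELECTOR, css_selector))
        )
    except TimeoutException:
        # Nothing matched, or not every match became visible in time.
        return []
    return hrefs_of(elements)
```

With a live driver this would be called as print(wait_for_hrefs(driver, "div.xxxx a")).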
Alternative
As an alternative, you can also try:
Using CSS_SELECTOR:
print([my_elem.get_attribute("href") for my_elem in driver.find_elements(By.CSS_SELECTOR, "div.xxxx a")])
Using XPATH:
print([my_elem.get_attribute("href") for my_elem in driver.find_elements(By.XPATH, "//div[@class='xxxx']//a")])
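Note that find_elements() does not wait: it returns whatever matches at that moment, and an empty list when nothing matches. A minimal sketch of a helper built on it follows; the name collect_links is hypothetical, and dropping empty and repeated hrefs is an extra step that the one-liners above do not perform.

```python
def collect_links(driver, by, value):
    """Return the distinct, non-empty href values of all matching elements.

    Hypothetical helper: driver is any object with a Selenium-style
    find_elements(by, value) method.
    """
    elements = driver.find_elements(by, value)  # [] when nothing matches
    hrefs = [element.get_attribute("href") for element in elements]
    # dict.fromkeys() removes repeats while keeping first-seen order.
    return list(dict.fromkeys(href for href in hrefs if href))
```

With a live driver this would be called as print(collect_links(driver, By.CSS_SELECTOR, "div.xxxx a")), or with By.XPATH and "//div[@class='xxxx']//a".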