Selenium WebDriver: Firefox not getting element by tag name
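I want to run Firefox headless using Selenium WebDriver in Python.
The goal is to open a page, wait until its JavaScript has finished loading, and collect all the links on that page.
To start testing, I wrote this code: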
import time
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
options = Options()
options.add_argument("--headless")
url = "http://localhost:3000/"
driver = webdriver.Firefox(firefox_options=options)
driver.get(url)
time.sleep(5)
urls = driver.find_elements_by_tag_name('a')
print(urls)
driver.quit()
This always produces the following error:
Traceback (most recent call last):
File "sel.py", line 18, in <module>
urls = driver.find_elements_by_tag_name('a')
File "/home/.local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 545, in find_elements_by_tag_name
return self.find_elements(by=By.TAG_NAME, value=name)
File "/home/.local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 995, in find_elements
'value': value})['value'] or []
File "/home/.local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 318, in execute
response = self.command_executor.execute(driver_command, params)
File "/home/petra/.local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 472, in execute
return self._request(command_info[0], url, body=data)
File "/home/petra/.local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 496, in _request
resp = self._conn.getresponse()
File "/usr/lib/python2.7/httplib.py", line 1136, in getresponse
response.begin()
File "/usr/lib/python2.7/httplib.py", line 453, in begin
version, status, reason = self._read_status()
File "/usr/lib/python2.7/httplib.py", line 417, in _read_status
raise BadStatusLine(line)
httplib.BadStatusLine: ''
I tried removing the line time.sleep(5) because I thought it might be the problem.
Now print(urls) returns the following:
[<selenium.webdriver.firefox.webelement.FirefoxWebElement (session="27257d43-81ec-48e4-9ed2-55709a23d60f", element="e728d5ef-001f-4335-bd57-19a1f2d82683")>, <selenium.webdriver.firefox.webelement.FirefoxWebElement (session="27257d43-81ec-48e4-9ed2-55709a23d60f", element="2c59c828-8557-48cc-a79a-02ea3c9d2d65")>, <selenium.webdriver.firefox.webelement.FirefoxWebElement (session="27257d43-81ec-48e4-9ed2-55709a23d60f", element="e2058a00-9bad-4f0c-8e2d-a236a567dddd")>]
I get this same output with anything from time.sleep(0) to time.sleep(4).
Either way, this is not the output I want; I want to see all the anchors on my page.
What am I doing wrong?
Sorry, I'm new to this.
Try the code below:
from selenium.webdriver.support import ui
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
urls = ui.WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.TAG_NAME, "a")))
for url in urls:
print(url.get_attribute("href"))
# Alternatively, build a list of the href strings (actual_urls holds URLs, not elements).
actual_urls = [url.get_attribute("href") for url in urls]
print(actual_urls)
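Note that the list of FirefoxWebElement objects you printed is not an error: find_elements returns element handles, and you have to ask each one for its href. Putting the pieces together, here is a minimal sketch of the whole flow. It assumes a recent Selenium release in which webdriver.Firefox accepts the options= keyword (your code uses the older firefox_options=), and the function names hrefs_from_anchors and collect_links are made up for this example. The explicit wait replaces the fixed time.sleep, so the script continues as soon as at least one anchor is present.

```python
def hrefs_from_anchors(anchors):
    """Return the non-empty href values of the given anchors, in order, without duplicates."""
    seen = set()
    links = []
    for anchor in anchors:
        href = anchor.get_attribute("href")
        # Anchors without an href (e.g. <a name="top">) yield None and are skipped.
        if href and href not in seen:
            seen.add(href)
            links.append(href)
    return links


def collect_links(url, timeout=10):
    """Open url in headless Firefox and return the hrefs of its anchors."""
    # Imported here so hrefs_from_anchors stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless")
    # Assumption: this Selenium version accepts options= (older ones used firefox_options=).
    driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)
        # Wait up to `timeout` seconds until at least one <a> element is in the DOM.
        anchors = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.TAG_NAME, "a"))
        )
        return hrefs_from_anchors(anchors)
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

With your test server running, you would call it as print(collect_links("http://localhost:3000/")). If the page has no anchors at all, the wait raises a TimeoutException after the timeout instead of returning an empty list.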
Hope this helps!