Selenium prints text different from what was expected

I want to get the American pronunciation from the Cambridge Dictionary. My current code is:

from selenium import webdriver
from YDSData import tdata, dummydata

words = dummydata
link='https://dictionary.cambridge.org/dictionary/english'

driver = webdriver.Chrome()

for word in words:
    driver.get(link+"/"+str(word))
    try:
        result = driver.find_elements_by_xpath('//*[@id="page-content"]/div[2]/div[1]/div[2]/div/div[3]/div/div/div[1]/div[2]/span[2]/span[3]/span')
        print(result)
    except:
        driver.close()

This code is supposed to give me the American pronunciation from the Cambridge Dictionary, but it prints:

[<selenium.webdriver.remote.webelement.WebElement (session="285ac250c8925dd19eb01a7853c1f219", element="6f609e22-76ab-443d-8216-4ac90aefda20")>]
[<selenium.webdriver.remote.webelement.WebElement (session="285ac250c8925dd19eb01a7853c1f219", element="c7d8a664-d162-4d2c-8c87-dd8e10211024")>]
[<selenium.webdriver.remote.webelement.WebElement (session="285ac250c8925dd19eb01a7853c1f219", element="7f163600-d25f-4000-893f-44693736ed41")>]

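Also, it is very slow. Is there something wrong with the code?

Edit:

I rewrote the code based on the answer to solve the problem, and it now looks like this: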

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from YDSData import tdata, dummydata

driver = webdriver.Chrome()

driver.get("https://dictionary.cambridge.org/dictionary/english/dictionary")

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[text()='us']//following-sibling::span[@class='pron dpron']"))).get_attribute("innerHTML"))

It works, but the code takes 8 minutes to run and produces the following errors:

[5836:11428:0223/004955.530:ERROR:ssl_client_socket_impl.cc(995)] handshake failed; returned -1, SSL error code 1, net_error -201
[11580:7744:0223/005141.150:ERROR:gpu_init.cc(454)] Passthrough is not supported, GL is disabled, ANGLE is 
[14788:5864:0223/005241.189:ERROR:chrome_browser_main_extra_parts_metrics.cc(227)] START: ReportBluetoothAvailability(). If you don't see the END: message, this is crbug.com/1216328.
[14788:5864:0223/005241.197:ERROR:chrome_browser_main_extra_parts_metrics.cc(230)] END: ReportBluetoothAvailability()
[14788:5864:0223/005241.215:ERROR:chrome_browser_main_extra_parts_metrics.cc(235)] START: GetDefaultBrowser(). If you don't see the END: message, this is crbug.com/1216328.
[14788:5864:0223/005241.243:ERROR:chrome_browser_main_extra_parts_metrics.cc(239)] END: GetDefaultBrowser()
[14788:11300:0223/005241.291:ERROR:device_event_log_impl.cc(214)] [00:52:41.291] USB: usb_device_handle_win.cc:1049 Failed to read descriptor from node connection: A device attached to the system is not functioning. (0x1F)

The following lines are the string representations of the WebElement objects themselves:

[<selenium.webdriver.remote.webelement.WebElement (session="285ac250c8925dd19eb01a7853c1f219", element="6f609e22-76ab-443d-8216-4ac90aefda20")>]
[<selenium.webdriver.remote.webelement.WebElement (session="285ac250c8925dd19eb01a7853c1f219", element="c7d8a664-d162-4d2c-8c87-dd8e10211024")>]
[<selenium.webdriver.remote.webelement.WebElement (session="285ac250c8925dd19eb01a7853c1f219", element="7f163600-d25f-4000-893f-44693736ed41")>]

This is what you get when you print the elements, rather than their text, to the console.
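
As a minimal sketch of the difference, assuming the elements came from a `find_elements` call: each WebElement exposes its rendered text through the `.text` attribute, so collecting that attribute instead of printing the list gives the strings themselves. The helper name `element_texts` and the `FakeElement` stand-in are hypothetical and exist only so the sketch runs without a browser.

```python
def element_texts(elements):
    """Return the visible text of each element in a find_elements() result."""
    # Printing the list itself only shows each object's repr;
    # the rendered text lives in the .text attribute of every element.
    return [element.text for element in elements]


# Hypothetical stand-in for a WebElement, used only for illustration.
class FakeElement:
    def __init__(self, text):
        self.text = text


print(element_texts([FakeElement("/ˈdɪk.ʃən.er.i/")]))
```

With a real driver, the same helper would be given the list returned by `driver.find_elements(By.XPATH, ...)`.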


To print the text /ˈdɪk.ʃən.er.i/, you can use the following:

  • Using XPATH:

    driver.get("https://dictionary.cambridge.org/dictionary/english/dictionary")
    print(driver.find_element(By.XPATH, "//span[text()='us']//following-sibling::span[@class='pron dpron']").text)
    

Ideally, to extract the text /ˈdɪk.ʃən.er.i/ you need to induce WebDriverWait for visibility_of_element_located(), and you can use the following:

  • Using XPATH:

    driver.get("https://dictionary.cambridge.org/dictionary/english/dictionary")
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[text()='us']//following-sibling::span[@class='pron dpron']"))).get_attribute("innerHTML"))
    
  • Console output:

    /ˈdɪk.ʃən.er.i/
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
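
Applied to the loop from the question, the same wait could be sketched as below. This is only an illustration under assumptions: the names `build_url` and `fetch_us_pronunciations` and the sample word list are hypothetical, and the XPath is the one used in this answer, so it depends on the page layout staying the same. The Selenium imports sit inside the function so that the URL helper can be tried without a browser.

```python
from urllib.parse import quote

BASE_URL = "https://dictionary.cambridge.org/dictionary/english"
US_PRON_XPATH = "//span[text()='us']//following-sibling::span[@class='pron dpron']"


def build_url(word):
    """Build the dictionary page URL for one word (hypothetical helper)."""
    return BASE_URL + "/" + quote(str(word).strip())


def fetch_us_pronunciations(words, timeout=20):
    """Return {word: pronunciation or None}, reusing one browser session."""
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    results = {}
    driver = webdriver.Chrome()
    try:
        for word in words:
            driver.get(build_url(word))
            try:
                element = WebDriverWait(driver, timeout).until(
                    EC.visibility_of_element_located((By.XPATH, US_PRON_XPATH))
                )
                results[word] = element.text
            except TimeoutException:
                # No matching element appeared in time; record the miss and move on.
                results[word] = None
    finally:
        # quit() ends the whole session, even if a page load raised an error.
        driver.quit()
    return results


# Example call (opens a real Chrome window, so it is left commented out):
# print(fetch_us_pronunciations(["dictionary", "selenium"]))
```

Catching only `TimeoutException` keeps one missing entry from ending the run, and the single `driver.quit()` in `finally` avoids closing the browser in the middle of the loop.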
    

You can find a relevant discussion in