尽管使用了正确的 CSS selector/XPATH 我还是找不到元素,而且 html 中没有我正在抓取的 iframe。我如何获得元素?

I can't locate an element despite using correct CSS selector/XPATH and there is no iframe in html that I'm scraping. How do I get the element?

下面是我的全部代码供参考。除倒数第二行外,一切正常,这就是我的问题。在这里。

from selenium import webdriver
import os
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import xlsxwriter
from datetime import datetime
import time

trade_date_lim = "4/10/2021"


chrome_driver = os.path.abspath('C:/Users/ross/Desktop/chromedriver.exe')
browser = webdriver.Chrome(chrome_driver)
browser.get('https://finra-markets.morningstar.com/BondCenter/Default.jsp')

WebDriverWait(browser, 5).until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#TabContainer > div > div.rtq-tab-wrap > div.rtq-tab-menus-wrap > ul > li:nth-child(3) > a > span'))).click()
WebDriverWait(browser, 5).until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#firscreener-cusip'))).send_keys("STWD")
WebDriverWait(browser, 5).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-finra-advanced-search-form > div.ms-finra-advanced-search-btn > input:nth-child(2)"))).click()
WebDriverWait(browser, 5).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-agreement > input"))).click()

WebDriverWait(browser, 5).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-grid-hd > div > div:nth-child(7) > div"))).click()
time.sleep(2)
WebDriverWait(browser, 5).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-grid-hd > div > div:nth-child(7) > div"))).click()
time.sleep(2)
whole_chart = WebDriverWait(browser, 5).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll"))).text


parent = browser.find_element_by_xpath('//*[@id="ms-finra-search-results"]/div/div[3]/div[1]/div[1]/div[2]/div[2]/div')
count_divs = len(parent.find_elements_by_xpath("./div"))



for row_num in range(1):

    #gets values that I'm looking for
    symbol = WebDriverWait(browser, 5).until(EC.presence_of_element_located((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(3)"))).text
    maturity = WebDriverWait(browser, 5).until(EC.presence_of_element_located((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(7)"))).text
    moody_rating = WebDriverWait(browser, 5).until(EC.presence_of_element_located((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(8)"))).text
    sandp_rating = WebDriverWait(browser, 5).until(EC.presence_of_element_located((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(9)"))).text
    bond_yield = WebDriverWait(browser, 5).until(EC.presence_of_element_located((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(11)"))).text

    #looks to see if all values are non-empty and if moody rating and sandp rating are not equal to 'WR' and 'NR'
    if symbol.strip() and maturity.strip() and moody_rating.strip() and sandp_rating.strip() and bond_yield.strip() and moody_rating != "WR" and sandp_rating != "NR":
        WebDriverWait(browser, 5).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(2) > div > a"))).click()
        WebDriverWait(browser, 5).until(EC.frame_to_be_available_and_switch_to_it((By.ID, "ms-bond-detail-iframe")))
        WebDriverWait(browser, 5).until(EC.element_to_be_clickable((By.CSS_SELECTOR,"#tradeHistory_link"))).click()
        browser.switch_to.default_content()
        time.sleep(10)
        #bond information has everything we need. Now we check to see the last time this bond was actually traded
        last_trade_date = WebDriverWait(browser, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, '#ms-glossary > div > table > tbody > tr:nth-child(1) > td:nth-child(1) > div')))
        print(last_trade_date)

引发的错误是超时异常。

我尝试通过 CSS 选择器和 XPATH 查找。我相信我正在为每条路径使用正确的格式。我在 html 中找不到 iframe,所以我不必担心。我包含隐式等待 time.sleep(10) 只是为了确保网页已通过搜索完全加载。为了更好地衡量,我包括了 visibility_of_element_located 的显式等待。我还尝试使用 presence_of_element_locatedelement_to_be_clickable。我要疯了,有人能帮忙吗?

罗斯

有 2 个问题...

首先,改变这个:

browser.switch_to.default_content()

为此:

browser.switch_to.window(browser.window_handles[-1])

切换到 default_content 仅在使用 iFrame 时使用,此处并非如此。 browser.switch_to.window(browser.window_handles[-1]) 切换到上次打开的标签页

其次,你的最后一行应该是:

print(last_trade_date.text)

而不是:

print(last_trade_date)

打印:

1/15/2021

顺便说一句,我不认为 time.sleep(10) 是必要的,我把它完全拿出来 运行 很好

我上次对你进行了 if 块处理。问题是您在单击

时打开了一个新选项卡
WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.CSS_SELECTOR,"#tradeHistory_link"))).click()

因此您需要将网络驱动程序的焦点更改到该新选项卡:

driver.switch_to.window(new_window)

代码:

#looks to see if all values are non-empty and if moody rating and sandp rating are not equal to 'WR' and 'NR'
    if symbol.strip() and maturity.strip() and moody_rating.strip() and sandp_rating.strip() and bond_yield.strip() and moody_rating != "WR" and sandp_rating != "NR":
        WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#ms-finra-search-results > div > div.qs-resultData > div.qs-resultData-body > div.rtq-grid.rtq-grid-auto-h > div.rtq-scrollpanel > div.rtq-grid-scroll > div > div:nth-child(" + str(row_num + 1) + ") > div:nth-child(2) > div > a"))).click()
        WebDriverWait(driver, 5).until(EC.frame_to_be_available_and_switch_to_it((By.ID, "ms-bond-detail-iframe")))
        windows_before  = driver.current_window_handle
        WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.CSS_SELECTOR,"#tradeHistory_link"))).click()
        WebDriverWait(driver, 10).until(EC.number_of_windows_to_be(2))
        windows_after = driver.window_handles
        new_window = [x for x in windows_after if x != windows_before][0]
        driver.switch_to.window(new_window)
        #bond information has everything we need. Now we check to see the last time this bond was actually traded
        #new_window = [x for x in window_after if x != window_before][0]
        #driver.switch_to.window(new_window)
        sleep(5)
        last_trade_date = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#ms-glossary > div > table > tbody > tr:nth-child(1) > td:nth-child(1) > div")))
        print(last_trade_date.text)

O/P :

1/15/2021

Process finished with exit code 0

我还建议不要为每个操作创建 WebDriverWait 对象。相反,您可以这样做:

wait = WebDriverWait(driver, 30)

现在像下面这样到处使用等待:

wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#ms-glossary > div > table > tbody > tr:nth-child(1) > td:nth-child(1) > div")))

通过这种方式你可以优化你的代码。您的 space 复杂性会降低。