How to get text from a section of a website using Selenium in Python 3

Question

I would like to know how to extract text from a website using Selenium and Python 3. I don't know what the text is in advance, so I can't just search for the sentence and copy it. Here is an example screenshot: Example Problem. Now, in this scenario I am looking for the small amount of text right after the 1., but it is represented by just ::header, so I am having trouble grabbing it. Any ideas? Also, the website I am pulling from is Quia.

Thanks!

Answer 1

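It is hard to answer directly, because the page in this example sits behind a login. Broadly speaking, you can use an XPath expression, which requires information about the XML/HTML tree. In Chrome or Firefox you can see that tree in the developer tools (press F12, or choose "Inspect" from the context menu). Here is an example that gets the welcome text from the login page on the same server: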

from selenium import webdriver
from selenium.webdriver.common.by import By

def s_obj(sel_drv, xph):
    # Return every element matching the XPath (an empty list if none match).
    return sel_drv.find_elements(by=By.XPATH, value=xph)

def s_text(sel_drv, xph):
    # Join the visible text of all matches, turning line breaks into "; ".
    els = s_obj(sel_drv, xph)
    return '; '.join(el.text.replace('\n', '; ')
                     for el in els).strip(';').strip() if els else ''

test_url = "https://www.quia.com/web"

sel_drv = webdriver.Chrome()
sel_drv.get(test_url)
# XPath to the welcome heading, read from the tree shown in the developer tools.
bs_xph = '//*/table/tbody/tr/td[@colspan="5"]/h1[@class="home"]'
expected_txt = s_text(sel_drv, f"{bs_xph}[1]")
print(expected_txt)
sel_drv.quit()
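
If it helps to see how an XPath like this maps onto the tree without starting a browser, here is a minimal sketch that uses Python's standard-library xml.etree.ElementTree instead of Selenium. The markup in SAMPLE and the helper name texts_for are made up for illustration; they are not taken from the Quia page. ElementTree supports only a subset of XPath and needs well-formed markup, so treat this as a way to experiment with expressions, not as a replacement for the Selenium code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for a real page (an assumption,
# not the actual Quia HTML).
SAMPLE = """
<html><body><table><tbody><tr>
  <td colspan="5"><h1 class="home">Hello from the sample page</h1></td>
  <td><h1 class="other">Ignored heading</h1></td>
</tr></tbody></table></body></html>
"""

def texts_for(markup, xpath):
    """Return the stripped text of every element matching the XPath."""
    root = ET.fromstring(markup.strip())
    return [(el.text or "").strip() for el in root.findall(xpath)]

# Same idea as the Selenium XPath: pick the h1 by its attributes and its parent td.
matches = texts_for(SAMPLE, ".//td[@colspan='5']/h1[@class='home']")
print(matches)
```

An expression that matches nothing returns an empty list here, much like find_elements does in Selenium, so you can check for an empty result before reading any text.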

python

selenium

web-scraping