How to extract the value of the src attribute using Selenium
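I'm having trouble with something I don't even know the name of.
I'm trying to use Selenium to get the link next to src="LINK".
I have the class name tWeCl, and I think I need to use it for this, but I don't know how.
Code trials: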
from selenium import webdriver
from selenium.webdriver.common.by import By
browser = webdriver.Chrome()
browser.get("https://www.instagram.com/p/CYMHMEGBSRT/")
a = browser.find_element(by=By.CLASS_NAME, value="tWeCl")
Output:
https://instagram.fist4-1.fna.fbcdn.net/v/t50.2886-16/271301182_1082110899295465_6686845989868801609_n.mp4?efg=eyJ2ZW5jb2RlX3RhZyI6InZ0c192b2RfdXJsZ2VuLjcyMC5jbGlwcy5iYXNlbGluZSIsInFlX2dyb3VwcyI6IltcImlnX3dlYl9kZWxpdmVyeV92dHNfb3RmXCJdIn0&_nc_ht=instagram.fist4-1.fna.fbcdn.net&_nc_cat=101&_nc_ohc=6RJahYZ39DIAX_ul9E4&edm=AABBvjUBAAAA&vs=453007466354843_3354420889&_nc_vs=HBksFQAYJEdENjZLeERwbE1LVExOZ0RBRWtPNE5YLWNzeGNicV9FQUFBRhUAAsgBABUAGCRHS0ZaS1JCMldIUXZxRzhCQUJ4bFB2ZnVrbE1hYnFfRUFBQUYVAgLIAQAoABgAGwAVAAAmqN%2FFrpms4z8VAigCQzMsF0A%2BmZmZmZmaGBJkYXNoX2Jhc2VsaW5lXzFfdjERAHX%2BBwA%3D&_nc_rid=c8bb85b1e6&ccb=7-4&oe=62408107&oh=00_AT-E77LrFH7G6VVXGeh8bRFQsu95hhQlqUGZdinzFBIUYQ&_nc_sid=83d603"
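Snapshot of the HTML:
To print the value of the src attribute, you have to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following locator strategies:
Using CSS_SELECTOR: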
driver.get('https://www.instagram.com/p/CYMHMEGBSRT/')
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "video.tWeCl"))).get_attribute("src"))
Using XPATH:
driver.get('https://www.instagram.com/p/CYMHMEGBSRT/')
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//video[@class='tWeCl']"))).get_attribute("src"))
Console output:
https://instagram.fpnq13-2.fna.fbcdn.net/v/t50.2886-16/271301182_1082110899295465_6686845989868801609_n.mp4?efg=eyJ2ZW5jb2RlX3RhZyI6InZ0c192b2RfdXJsZ2VuLjcyMC5jbGlwcy5iYXNlbGluZSIsInFlX2dyb3VwcyI6IltcImlnX3dlYl9kZWxpdmVyeV92dHNfb3RmXCJdIn0&_nc_ht=instagram.fpnq13-2.fna.fbcdn.net&_nc_cat=101&_nc_ohc=6RJahYZ39DIAX-Iphu9&edm=AABBvjUBAAAA&vs=453007466354843_3354420889&_nc_vs=HBksFQAYJEdENjZLeERwbE1LVExOZ0RBRWtPNE5YLWNzeGNicV9FQUFBRhUAAsgBABUAGCRHS0ZaS1JCMldIUXZxRzhCQUJ4bFB2ZnVrbE1hYnFfRUFBQUYVAgLIAQAoABgAGwAVAAAmqN%2FFrpms4z8VAigCQzMsF0A%2BmZmZmZmaGBJkYXNoX2Jhc2VsaW5lXzFfdjERAHX%2BBwA%3D&_nc_rid=34a08d5a01&ccb=7-4&oe=62408107&oh=00_AT9sh_BH__zjeReDB7lde4t3avzYqDjimTJRnoZi6Lj-TQ&_nc_sid=83d603
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC