Getting current video tag URL with selenium
I am trying to get the current html5 video tag URL using selenium (with the Python bindings):
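from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.youtube.com/watch?v=9x6YclsLHN0')

# find_element(By.TAG_NAME, ...) replaces find_element_by_tag_name(),
# which recent Selenium 4 releases no longer provide
video = driver.find_element(By.TAG_NAME, 'video')
url = driver.execute_script("return arguments[0].currentSrc;", video)
print(url)

driver.quit()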
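The problem is that the url value is printed empty. Why is that, and how can I fix it?
I suspect this is because the script executes and returns the currentSrc value before the video tag has been initialized. I tried adding an Explicit Wait, but an empty string is still printed: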
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(driver, 5)
video = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'video')))
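This makes me think I need a different approach, perhaps listening for the media events and waiting for the video to start playing.
I am also fairly sure that currentSrc should work: if I run the code in the browser console and manually wait for the video to start, I see the video's currentSrc attribute value printed.
FYI, I also tried the Java bindings, with the same result, an empty string: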
WebDriver driver = new ChromeDriver();
driver.get("https://www.youtube.com/watch?v=9x6YclsLHN0");
WebElement video = driver.findElement(By.tagName("video"));
JavascriptExecutor js = (JavascriptExecutor) driver;
String url = (String) js.executeScript("return arguments[0].currentSrc;", video);
System.out.println(url);
According to the W3 video tag specification:
The currentSrc DOM attribute is initially the empty string. Its value
is changed by the resource selection algorithm.
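This explains the behavior described in the question. It also means that, to get the currentSrc value reliably, we need to wait until the media resource has defined it.
Subscribing to the loadstart media event through execute_async_script() worked for me: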
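driver.set_script_timeout(10)  # allow the async script up to 10 seconds to call back
url = driver.execute_async_script("""
    var video = arguments[0],
        callback = arguments[arguments.length - 1];  // Selenium passes the done-callback last

    // loadstart may already have fired; in that case currentSrc is set, so return it right away.
    if (video.currentSrc) {
        callback(video.currentSrc);
        return;
    }

    // Otherwise report currentSrc once the browser starts loading the media.
    video.addEventListener('loadstart', function listener() {
        callback(video.currentSrc);
    });
""", video)
print(url)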
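As a complement to the event listener above, here is a minimal polling sketch of the same "wait until currentSrc is defined" idea. The helper name wait_for_non_empty, its parameters, and the blob URL are assumptions made up for illustration and are not part of Selenium. A stand-in getter replaces the browser so the snippet runs on its own; with a real driver the getter would be something like lambda: driver.execute_script("return arguments[0].currentSrc;", video).

```python
import time


def wait_for_non_empty(getter, timeout=10.0, poll_interval=0.5):
    """Call getter() repeatedly until it returns a non-empty value.

    Hypothetical helper: getter is any zero-argument callable, for example
    a lambda wrapping driver.execute_script(...). Raises TimeoutError if
    the value is still empty once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = getter()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError("value was still empty after %.1f seconds" % timeout)
        time.sleep(poll_interval)


# Stand-in for the browser: currentSrc reads as empty twice, mimicking a
# video whose resource selection has not finished yet. The URL is made up.
_fake_reads = iter(["", "", "blob:https://www.youtube.com/example"])
url = wait_for_non_empty(lambda: next(_fake_reads), timeout=5.0, poll_interval=0.01)
print(url)
```

With Selenium itself, WebDriverWait(driver, 10).until(lambda d: d.execute_script("return arguments[0].currentSrc;", video)) should behave much the same way, since until() keeps calling the function and returns its first truthy result. Polling is simpler than the event listener but only notices the value on the next poll, so the listener is the better fit when the exact moment matters.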