How do I grab the name of an mp4 when using urllib?
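The link.txt file contains links that I loop over. Each link points to a page containing an mp4 file, and I am downloading those files. It works fine, except that I can't get the original name of the mp4.
Current output name of the mp4 file: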
videoname.mp4
Desired output name of the mp4 file:
V14728_full_h264_1500.mp4
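My code:
import urllib.request
from time import sleep
from selenium.webdriver.common.by import By

# driver is an already-initialized Selenium WebDriver
one = open("link.txt", "r")
for two in one.readlines():
    driver.get(two)
    sleep(2)
    vid = driver.find_element(By.TAG_NAME, "video")
    src = vid.get_attribute("src")
    driver.get(src)
    sleep(2)
    url = driver.current_url
    print(url)
    urllib.request.urlretrieve(url, 'videoname.mp4')  # NEED FIX HERE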
HTML of the page:
<html>
<head>
<meta name="viewport" content="width=device-width">
<input type="hidden" id="_w_tusk">
<script type="text/javascript" src="chrome-extension://dbjbempljhcmhlfpfacalomonjpalpko/scripts/inspector.js">
</script><script src="chrome-extension://mooikfkahbdckldjjndioackbalphokd/assets/prompt.js"></script>
</head>
<body class="vsc-initialized" style="">
<div class="vsc-controller">
</div><video controls="" autoplay="" name="media">
<source src="https://download2.[REDACTED].com/7eefd14b306c441ba17f2bd72e371586/61cfc9a7/stream/V14728/V14728_vids/V14728_full_h264_1500.mp4" type="video/mp4">
</video><span id="copylAddress" style="display: inline-block; position: absolute; left: -9999em;">
</span>
</body>
</html>
To extract the file name, just split the URL on / and take the last element of the resulting list:
src="https://download2.[REDACTED].com/7eefd14b306c441ba17f2bd72e371586/61cfc9a7/stream/V14728/V14728_vids/V14728_full_h264_1500.mp4"
src.split('/')[-1]
Output:
V14728_full_h264_1500.mp4
In your case:
urllib.request.urlretrieve(url, src.split('/')[-1])
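If the download URL might carry a query string or fragment (for example an access token), a plain split on / would keep that extra text in the file name. A minimal sketch of a slightly more defensive variant, using only the standard library, could look like this. The helper name filename_from_url, the default value, and the example.com URL are all made up for illustration and do not come from the question.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url, default="video.mp4"):
    """Return the last path segment of url, ignoring query string and fragment.

    Falls back to `default` when the path ends with a slash or is empty.
    """
    path = urlsplit(url).path                  # drops "?query" and "#fragment"
    name = posixpath.basename(unquote(path))   # decode %20 etc., keep the last segment
    return name or default


# Hypothetical URL, used only for illustration.
example = "https://example.com/stream/V14728/V14728_full_h264_1500.mp4?token=abc"
print(filename_from_url(example))  # V14728_full_h264_1500.mp4
```

In the loop from the question, the result could then be passed as the second argument, as in urllib.request.urlretrieve(url, filename_from_url(url)).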