How do I get the name of an mp4 when using urllib?

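The file link.txt contains the links I loop over. Each link points to a page that contains an mp4 file, and I am downloading those files. This works fine, except that I can't get the original name of the mp4.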

Current output name for the mp4 file:

videoname.mp4

Desired output name for the mp4 file:

V14728_full_h264_1500.mp4

My code:

one = open("link.txt", "r")
for two in one.readlines():
    driver.get(two)
    sleep(2)
    vid = driver.find_element(By.TAG_NAME, "video")
    src = vid.get_attribute("src")
    driver.get(src)
    sleep(2)
    url = driver.current_url
    print(url)
    urllib.request.urlretrieve(url, 'videoname.mp4') #NEED FIX HERE

HTML of the page:

<html>
   <head>
      <meta name="viewport" content="width=device-width">
      <input type="hidden" id="_w_tusk">
      <script type="text/javascript" src="chrome-extension://dbjbempljhcmhlfpfacalomonjpalpko/scripts/inspector.js">
      </script><script src="chrome-extension://mooikfkahbdckldjjndioackbalphokd/assets/prompt.js"></script>
   </head>
   <body class="vsc-initialized" style="">
      <div class="vsc-controller">      
      </div><video controls="" autoplay="" name="media">
         <source src="https://download2.[REDACTED].com/7eefd14b306c441ba17f2bd72e371586/61cfc9a7/stream/V14728/V14728_vids/V14728_full_h264_1500.mp4" type="video/mp4">
      </video><span id="copylAddress" style="display: inline-block; position: absolute; left: -9999em;">
      </span>
   </body>
</html>

To extract the filename, split the URL on / and take the last element of the resulting list:

src="https://download2.[REDACTED].com/7eefd14b306c441ba17f2bd72e371586/61cfc9a7/stream/V14728/V14728_vids/V14728_full_h264_1500.mp4"

src.split('/')[-1]

Output:

V14728_full_h264_1500.mp4

In your case:

urllib.request.urlretrieve(url, src.split('/')[-1])
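If the URL can carry a query string or fragment (for example a `?token=...` suffix, which is an assumption and does not appear in the question), a plain `split('/')` would leave that suffix in the filename. A slightly more defensive sketch parses the URL first. The helper name `filename_from_url` and its `default` parameter are made up for illustration:

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url, default="videoname.mp4"):
    """Return the last path segment of url, ignoring query string and fragment."""
    # urlsplit separates the path from any "?query" or "#fragment" part.
    path = urlsplit(url).path
    # unquote turns percent-escapes such as %20 back into plain characters.
    name = posixpath.basename(unquote(path))
    # Fall back to a fixed name when the path ends in "/" and has no filename.
    return name or default
```

It could then be used in the loop as `urllib.request.urlretrieve(url, filename_from_url(url))`.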