How to capture the request URL of the MP3 file on an audiobook website?

Question

Website:

https://www.ting22.com/ting/659-2.html

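I want to get some audiobooks from the website above. In other words, I want to download the audiobook's MP3 files from every page from 659-2.html through 659-1724.html.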

Using the F12 developer tools, under [Network] -> [Media], I can see the request URL of the MP3 file, but I don't know how to obtain that URL with a script.

Here are the specs of what I am using:

System: Windows 7 x64
Python: 3.7.0

Update:

For example, using the F12 tools I can see that the file's URL is "http://audio.xmcdn.com/group58/M03/8D/07/wKgLc1zNaabhA__WAEJyyPUT5k4509.mp3"

But I don't know how to get the MP3 file's URL in code. My question is about obtaining the URL, not about how to download the file.

Which library should I use?

Thanks.

Answer 1

Update

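That will be a bit complicated, because the requests package will not return the .mp3 source (the page only exposes it after it has been rendered in a browser), so you need to use Selenium. Here is a tested solution: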

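from selenium import webdriver  # pip install selenium
from selenium.webdriver.common.by import By
import urllib3
import shutil
import os


# Create the output folder in the current working directory if it does not exist yet
os.makedirs(os.path.join(os.getcwd(), 'mp3_folder'), exist_ok=True)


def downloadFile(url):
    # Use the last path segment of the URL as the local file name
    filename = url.split('/')[-1]
    c = urllib3.PoolManager()
    # Stream the response to disk instead of loading the whole MP3 into memory
    with c.request('GET', url, preload_content=False) as resp, open(os.path.join('mp3_folder', filename), 'wb') as out_file:
        shutil.copyfileobj(resp, out_file)
    resp.release_conn()


# Download chromedriver from here and place it next to the script: https://chromedriver.storage.googleapis.com/72.0.3626.7/chromedriver_win32.zip
# Note: newer Selenium 4 releases may expect the driver path via a Service object rather than a positional argument.
driver = webdriver.Chrome('chromedriver.exe')
for i in range(2, 1725):
    try:
        driver.get('https://www.ting22.com/ting/659-%s.html' % i)
        # The rendered page carries the MP3 URL in the src attribute of the element with id "mySource"
        src = driver.find_element(By.ID, 'mySource').get_attribute('src')
        downloadFile(src)
        print(src)
    except Exception as exc:
        # Report the failure and continue with the next page
        print(exc)
driver.quit()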
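
Since the question asks only for the URLs and not for the download, the same idea can be reduced to a sketch that just collects the src values. This is a minimal, untested-against-the-live-site variant: the helper names page_urls, mp3_filename and collect_mp3_urls are made up for illustration, the 10-second timeout is an arbitrary assumption, and it relies on the page still exposing the MP3 URL on the element with id "mySource" as in the code above.

```python
import posixpath
from urllib.parse import urlsplit

# Page pattern taken from the question; the chapter number is filled in per page
BASE = 'https://www.ting22.com/ting/659-%s.html'


def page_urls(first=2, last=1724):
    """Yield the page URLs from 659-<first>.html to 659-<last>.html inclusive."""
    for i in range(first, last + 1):
        yield BASE % i


def mp3_filename(url):
    """Return the file name part of an MP3 URL, ignoring any query string."""
    return posixpath.basename(urlsplit(url).path)


def collect_mp3_urls(driver, pages, timeout=10):
    """Map each page URL to the MP3 URL found on it (hypothetical helper)."""
    # Imported here so the pure helpers above work without Selenium installed
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    found = {}
    for page in pages:
        driver.get(page)
        # Retry until the element exists and its src attribute is non-empty;
        # raises TimeoutException if that does not happen within `timeout` seconds
        src = WebDriverWait(driver, timeout).until(
            lambda d: d.find_element(By.ID, 'mySource').get_attribute('src'))
        found[page] = src
    return found
```

With a driver created as in the solution above, something like collect_mp3_urls(driver, page_urls(2, 5)) should return a dict of page URL to MP3 URL for the first few pages, which you can print or write to a file before deciding how to download. The explicit wait is there because the src may only be filled in after the page's scripts have run.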


python

audio

capture

python-requests
