Is there any way to download the audio from a certain page?

I am writing a Selenium script in Python and want to download the audio from a certain page.

The page's HTML is as follows:

<html>
<head>
<meta name="viewport" content="width=device-width">
</head>
<body>
<video controls="" autoplay="" name="media">
<source src="https://website//id=47c484fc7f8f" type="audio/mp3">
</video>
</body>
</html>

My code so far:

from seleniumwire import webdriver 
import sys
from webdriver_manager.chrome import ChromeDriverManager
import time
import pyaudio
import wave
from selenium.webdriver.chrome.options import Options



chrome_options = Options()
chrome_options.add_argument("--headless")
# for linux/Ubuntu only
#chrome_options.add_argument("--no-sandbox") 


browser = webdriver.Chrome(ChromeDriverManager().install(), options=chrome_options)
browser.get("website")
search = browser.find_element_by_id("text-area")
search.clear()
    
text = input("text here : ")
search.send_keys(text)
#print(data)
time.sleep(2)
browser.find_element_by_id("btn").click()

# Look through the requests captured by selenium-wire for the audio URL
for request in browser.requests:
    if request.response and 'website//id' in request.url:
        browser.get(request.url)

I am open to using any language to achieve this.

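You don't need Selenium for this; the requests library is enough. You have to supply a unique identifier as the sessionID in your POST request, so that you can retrieve the generated file in the subsequent GET request.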

Take the following snippet as an example; it saves the generated file under the name of the sessionID you provide.

import requests

# Unique identifier that ties the POST and the GET request together
sessionID = '78aa8dd0-9529-11eb-a8b3-0242ac130003'
payload = {
    'ssmlText': '<prosody pitch="default" rate="-0%">Roses are red, violets are blue</prosody>',
    'sessionID': sessionID,
}

# Step 1: submit the text to be synthesized under this sessionID
r1 = requests.post("https://www.ibm.com/demos/live/tts-demo/api/tts/store", data=payload)
r1.raise_for_status()

print(r1.status_code, r1.reason)

# Step 2: request the audio generated for the same sessionID
tts_url = 'https://www.ibm.com/demos/live/tts-demo/api/tts/newSynthesize?voice=en-US_OliviaV3Voice&id=' + sessionID

try:
    r2 = requests.get(tts_url, timeout=10, cookies=r1.cookies)
    # Fail here on an HTTP error instead of saving an error page as .mp3
    r2.raise_for_status()
    print(r2.status_code, r2.reason)

    try:
        # Write the raw response body to <sessionID>.mp3
        with open(sessionID + '.mp3', 'wb') as f:
            f.write(r2.content)
    except IOError:
        print("IOError: could not write a file")

except requests.exceptions.Timeout:
    print("Timeout: could not get response from the server")
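The snippet above hard-codes both the text and the sessionID. If the text comes from user input, as in the question, something along these lines could build the payload instead. This is only a sketch: `build_payload` is a hypothetical helper name, it assumes the endpoint accepts any UUID string as the sessionID, and it assumes `pitch` and `rate` are trusted values. The text itself is XML-escaped, since SSML is XML and a stray `&` or `<` would otherwise break the markup.

```python
import uuid
from xml.sax.saxutils import escape


def build_payload(text, pitch="default", rate="-0%"):
    """Return the form fields for the store request, with a fresh sessionID.

    Hypothetical helper: the field names are taken from the snippet above.
    """
    # Escape &, < and > so the user's text cannot break the SSML markup.
    ssml = '<prosody pitch="{}" rate="{}">{}</prosody>'.format(
        pitch, rate, escape(text)
    )
    # uuid4() yields a random identifier (assumed to be acceptable to the server).
    return {"ssmlText": ssml, "sessionID": str(uuid.uuid4())}


payload = build_payload("Roses are red & violets are blue")
print(payload["ssmlText"])
print(payload["sessionID"])
```

The returned dictionary can be passed as `data=` to `requests.post` in place of the hard-coded `payload`, and `payload["sessionID"]` can then be used to build `tts_url` and the output file name.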