Is there any way to download the audio from a certain page?
I am writing a Selenium script in Python and want to download the audio from a certain page.
The page's HTML looks like this:
<html>
<head>
<meta name="viewport" content="width=device-width">
</head>
<body>
<video controls="" autoplay="" name="media">
<source src="https://website//id=47c484fc7f8f" type="audio/mp3">
</video>
</body>
</html>
My code so far:
from seleniumwire import webdriver
import sys
from webdriver_manager.chrome import ChromeDriverManager
import time
import pyaudio
import wave
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_argument("--headless")
# for linux/Ubuntu only
#chrome_options.add_argument("--no-sandbox")
browser = webdriver.Chrome(ChromeDriverManager().install(), chrome_options=chrome_options)
browser.get("website")
search = browser.find_element_by_id("text-area")
search.clear()
text = input("text here : ")
search.send_keys(text)
#print(data)
time.sleep(2)
browser.find_element_by_id("btn").click()
# Access and print requests via the `requests` attribute
for request in browser.requests:
    # Only follow captured requests that got a response and point at the audio URL
    if request.response and 'website//id' in request.url:
        browser.get(request.url)
I am open to using any language to achieve this.
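You don't need Selenium for this; the requests library is enough. You must supply a unique identifier as the sessionID in your POST request so that the following GET request can retrieve the generated file.
Take the snippet below as an example; it saves the generated file under the supplied sessionID as its name.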
import requests

# Unique identifier that ties the POST (store) to the later GET (download)
sessionID = '78aa8dd0-9529-11eb-a8b3-0242ac130003'
payload = {'ssmlText': '<prosody pitch=\"default\" rate=\"-0%\">Roses are red, violets are blue</prosody>', 'sessionID': sessionID}

# Step 1: submit the text to be synthesized
r1 = requests.post("https://www.ibm.com/demos/live/tts-demo/api/tts/store", data=payload)
r1.raise_for_status()
print(r1.status_code, r1.reason)

# Step 2: fetch the generated audio using the same sessionID
tts_url = 'https://www.ibm.com/demos/live/tts-demo/api/tts/newSynthesize?voice=en-US_OliviaV3Voice&id=' + sessionID
try:
    r2 = requests.get(tts_url, timeout=10, cookies=r1.cookies)
    print(r2.status_code, r2.reason)
    try:
        # Save the audio bytes as <sessionID>.mp3
        with open(sessionID + '.mp3', "w+b") as f:
            f.write(r2.content)
    except IOError:
        print("IOError: could not write a file")
except requests.exceptions.Timeout:
    print("Timeout: could not get response from the server")
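The snippet above hard-codes the sessionID and the text. If you want a fresh identifier for every request, something like the following sketch might help. The helper name build_tts_request is hypothetical, and it assumes the server accepts any UUID string as the sessionID (the example ID above looks like a UUID, but that is not confirmed). It only builds the payload and the download URL and does not touch the network.

```python
import uuid
from urllib.parse import urlencode
from xml.sax.saxutils import escape

# Endpoints taken from the snippet above
STORE_URL = "https://www.ibm.com/demos/live/tts-demo/api/tts/store"
SYNTH_URL = "https://www.ibm.com/demos/live/tts-demo/api/tts/newSynthesize"


def build_tts_request(text, voice="en-US_OliviaV3Voice", session_id=None):
    """Return (session_id, payload, download_url) for one store/download round trip.

    Hypothetical helper: assumes any UUID string is accepted as sessionID.
    """
    # Generate a fresh identifier unless the caller supplies one
    session_id = session_id or str(uuid.uuid4())
    # Escape &, < and > so user text cannot break the SSML markup
    ssml = '<prosody pitch="default" rate="-0%">' + escape(text) + '</prosody>'
    payload = {"ssmlText": ssml, "sessionID": session_id}
    # Same query parameters as the hard-coded tts_url above
    download_url = SYNTH_URL + "?" + urlencode({"voice": voice, "id": session_id})
    return session_id, payload, download_url
```

You would then pass payload to requests.post(STORE_URL, data=payload) and download_url to requests.get(...), exactly as in the snippet above, and save the response under session_id + '.mp3'.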