Pytube is giving assorted regex errors

I have a program that downloads YouTube videos with Python. The program is:

from pytube import YouTube

def Download(video_list):
    for url in video_list:
        print("url: [%s]" % url)
        youtube = YouTube(url)
        youtube.streams.get_highest_resolution().download("C:/Users/user1/Downloads")
        
        print (f'{youtube.title} downloaded.')
    
video_list = []

run = True

while run == True:
    link = str(input("Enter youtube URL, or press D to download: "))
    
    if link.find("youtu") != -1:  # str.find returns -1 when the substring is absent
        video_list.append(link)
    if link == 'd' or link == 'D':
        Download(video_list)
        run = False
    elif link.find("youtu") == -1:
        print ("Invalid youtube URL.")

I tried to download a video (https://www.youtube.com/watch?v=jikcB7_gj8A), but got this error:

url: [https://www.youtube.com/watch?v=jikcB7_gj8A]
Traceback (most recent call last):
  File "c:\Users\user1\Desktop\Mostly Python\youtube_downloader.py", line 27, in <module>
  File "c:\Users\user1\Desktop\Mostly Python\youtube_downloader.py", line 8, in Download
    youtube = YouTube(url)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\__main__.py", line 71, in __init__        
    self.video_id = extract.video_id(url)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\extract.py", line 133, in video_id        
    return regex_search(r"(?:v=|\/)([0-9A-Za-z_-]{11}).*", url, group=1)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\helpers.py", line 129, in regex_search    
    raise RegexMatchError(caller="regex_search", pattern=pattern)
pytube.exceptions.RegexMatchError: regex_search: could not find match for (?:v=|\/)([0-9A-Za-z_-]{11}).*

So I tried a suggested fix (replacing the regexes in cipher.py with r'\bc\s*&&\s*d\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('), and now I get this error:

url: [https://www.youtube.com/watch?v=jikcB7_gj8A]
Traceback (most recent call last):
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\__main__.py", line 181, in fmt_streams
    extract.apply_signature(stream_manifest, self.vid_info, self.js)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\extract.py", line 409, in apply_signature 
    cipher = Cipher(js=js)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\cipher.py", line 29, in __init__
    self.transform_plan: List[str] = get_transform_plan(js)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\cipher.py", line 186, in get_transform_plan
    return regex_search(pattern, js, group=1).split(";")
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\helpers.py", line 129, in regex_search    
    raise RegexMatchError(caller="regex_search", pattern=pattern)
pytube.exceptions.RegexMatchError: regex_search: could not find match for qra\[0\]=function\(\w\){[a-z=\.\(\"\)]*;(.*);(?:.+)}    

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\Users\user1\Desktop\Mostly Python\youtube_downloader.py", line 27, in <module>
    Download(video_list)
  File "c:\Users\user1\Desktop\Mostly Python\youtube_downloader.py", line 9, in Download
    youtube.streams.get_highest_resolution().download("C:/Users/user1/Downloads")
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\__main__.py", line 296, in streams        
    return StreamQuery(self.fmt_streams)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\__main__.py", line 188, in fmt_streams    
    extract.apply_signature(stream_manifest, self.vid_info, self.js)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\extract.py", line 409, in apply_signature 
    cipher = Cipher(js=js)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\cipher.py", line 29, in __init__
    self.transform_plan: List[str] = get_transform_plan(js)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\cipher.py", line 186, in get_transform_plan
    return regex_search(pattern, js, group=1).split(";")
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\helpers.py", line 129, in regex_search    
    raise RegexMatchError(caller="regex_search", pattern=pattern)
pytube.exceptions.RegexMatchError: regex_search: could not find match for qra\[0\]=function\(\w\){[a-z=\.\(\"\)]*;(.*);(?:.+)} 

I'm at a loss. What should I do?

Edit: I am running pytube version 12.0.0, downloaded from https://github.com/nficano/pytube

I modified your code, and it ran without any errors on my system.

import tldextract
from pytube import YouTube


def Download(video_list):
    for url in video_list:
        youtube = YouTube(url.strip())
        youtube.streams.get_highest_resolution().download()
        print(f'The video - {youtube.title} - was downloaded.')

video_list = []

run = True

while run == True:
    link = str(input("Enter a YouTube URL or press D to download the video(s): "))
    domain_name = tldextract.extract(link).domain
    if link.lower() == 'd':
        Download(video_list)
        run = False
    elif domain_name == 'youtube':
        video_list.append(link)
    else:
        print("Invalid youtube URL.")

Enter a YouTube URL or press D to download the video(s): https://www.youtube.com/watch?v=jikcB7_gj8A
Enter a YouTube URL or press D to download the video(s): D

The video - How Teachers Help You During Tests #Shorts - was downloaded.

Process finished with exit code 0

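This obviously doesn't offer a solution, but I'm posting it for your reference. (I can't comment because I don't have enough reputation.)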
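A side note on the URL check in the code above: comparing the tldextract domain against 'youtube' would appear to reject short youtu.be links, since their domain part is 'youtu'. Here is a minimal sketch of a hostname-based check that uses only the standard library. The function name looks_like_youtube_url and the YOUTUBE_HOSTS set are made up for illustration, and the host list is an assumption rather than a complete list.

```python
from urllib.parse import urlparse

# Hostnames treated as YouTube in this sketch (an assumption, not an exhaustive list).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def looks_like_youtube_url(link: str) -> bool:
    """Return True if the link's hostname is one of YOUTUBE_HOSTS."""
    parsed = urlparse(link.strip())
    # urlparse only fills in hostname when a scheme such as https:// is present.
    return parsed.hostname is not None and parsed.hostname in YOUTUBE_HOSTS


for candidate in [
    "https://www.youtube.com/watch?v=jikcB7_gj8A",
    "https://youtu.be/jikcB7_gj8A",
    "D",
]:
    print(candidate, looks_like_youtube_url(candidate))
```

A check like this only says that the link points at a YouTube host. It does not guarantee that pytube can extract the video.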

Since April 13, I have seen this problem for multiple users and on my own devices. It worked fine until April 13.

There is nothing wrong with your code or with pytube itself. YouTube eventually changed its patterns, so the old patterns that pytube relies on no longer match.
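To illustrate what "no longer match" means, here is a small sketch of a search helper that raises when a hard-coded pattern is not found, which is roughly the behaviour the tracebacks show for regex_search. The names search_or_raise and PatternNotFound are hypothetical stand-ins, not pytube's own code. The only pattern used is the video-ID one quoted in the first traceback.

```python
import re


class PatternNotFound(Exception):
    """Hypothetical stand-in for pytube's RegexMatchError."""


def search_or_raise(pattern: str, text: str, group: int) -> str:
    """Return the requested group, or raise if the pattern is absent."""
    match = re.search(pattern, text)
    if match is None:
        raise PatternNotFound(f"could not find match for {pattern}")
    return match.group(group)


# Video-ID pattern copied from the first traceback.
VIDEO_ID_PATTERN = r"(?:v=|\/)([0-9A-Za-z_-]{11}).*"

# A normal watch URL contains "v=" followed by an 11-character ID, so it matches.
print(search_or_raise(VIDEO_ID_PATTERN, "https://www.youtube.com/watch?v=jikcB7_gj8A", 1))

# Any text without that shape raises, whatever the reason for the mismatch.
try:
    search_or_raise(VIDEO_ID_PATTERN, "D", 1)
except PatternNotFound as exc:
    print("no match:", exc)
```

The cipher.py errors follow the same idea, except that the text being searched is YouTube's player JavaScript, which YouTube can change at any time.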

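There are many suggested fixes, such as the cipher.py change you tried, changing the function patterns, and using the git version, but none of them worked for me. So the only options are to look for further fixes (if any become available) or to wait for pytube to release an update. There are also other applications similar to pytube, such as yt-dlp and youtube-dl.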
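If you want to try yt-dlp from Python while waiting, a rough sketch could look like the following. It assumes yt-dlp is installed (pip install yt-dlp). The helper names download_with_yt_dlp and download_all are made up for this example, and the output template is just one possible choice. The loop collects failures instead of stopping at the first one, so a single bad video does not abort the whole batch.

```python
from typing import Callable, Iterable, List, Tuple


def download_with_yt_dlp(url: str) -> None:
    """Download one video with yt-dlp (assumes the yt-dlp package is installed)."""
    from yt_dlp import YoutubeDL  # imported lazily so the rest runs without it

    options = {"outtmpl": "%(title)s.%(ext)s"}  # output file name template
    with YoutubeDL(options) as ydl:
        ydl.download([url])


def download_all(
    urls: Iterable[str], download_one: Callable[[str], None]
) -> List[Tuple[str, str]]:
    """Try every URL; return (url, error message) pairs for the ones that failed."""
    failures = []
    for url in urls:
        try:
            download_one(url)
        except Exception as exc:  # broad on purpose: keep going after a failure
            failures.append((url, str(exc)))
    return failures
```

With the list from the question, this would be called as download_all(video_list, download_with_yt_dlp). Any other one-argument downloader, including a pytube-based one, could be passed in the same way.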

In cipher.py, change the regular expression on line 273. Note that {3} is now {2}. This should work, at least until it changes again.

Line 273: r'\([a-z]\s*=\s*([a-zA-Z0-9$]{2})(\[\d+\])?\([a-z]\)',

Also edit line 288.

Line 288: nfunc=re.escape(function_match.group(1))),
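To see why the quantifier matters, here is a small sketch that applies the line-273 fragment, with its parentheses and brackets escaped as regex literals, to two made-up JavaScript snippets. The function names Bpa and Xy and the snippets themselves are hypothetical. Only the pattern fragment comes from the fix above, and the helper name_pattern exists just for this illustration.

```python
import re


def name_pattern(length: int) -> str:
    """Build the line-273 fragment with the given function-name length."""
    return r'\([a-z]\s*=\s*([a-zA-Z0-9$]{%d})(\[\d+\])?\([a-z]\)' % length


# Made-up snippets: one with a three-character name, one with a two-character name.
js_three = '(b=Bpa[0](b),a.set("n",b)'
js_two = '(b=Xy[0](b),a.set("n",b)'

for label, js in [("three-char name", js_three), ("two-char name", js_two)]:
    for length in (3, 2):
        match = re.search(name_pattern(length), js)
        found = match.group(1) if match else None
        print(f"{label}, {{{length}}}: {found}")
```

Each quantifier matches only names of exactly that length, which is presumably why the fix works only until the name length changes again.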