How to extract upload dates, titles, URLs and durations from YouTube videos in a playlist with youtube-dl?

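I am trying to extract the upload dates, titles, URLs and durations of all YouTube videos in a specific playlist with youtube-dl. I don't need the videos, only the data above.

So far I have tested the following two approaches, as suggested here: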

Youtube-dl's GitHub Doc Used as reference

The Playlist URL used for testing

APPROACH #1

youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD > example.json

APPROACH #2

youtube-dl --get-upload_date https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD > example.txt

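APPROACH #1 outputs the entire JSON dump, about 3000 lines per video, which is very inconvenient for playlists with many YouTube videos, but it does contain the 4 required pieces of data.

APPROACH #2 returns the following error: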

youtube-dl: error: no such option: --get-upload_date

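I would prefer APPROACH #2, in order to limit the output to only the required data (upload dates, titles, URLs, durations). I tried it after checking that upload_date is listed as a valid youtube-dl field here: Youtube-dl's GitHub Doc Used as reference.

Why is the upload_date option not recognized?

Is there any way to limit the data?

Many thanks for any helpful suggestions.

Here is the JSON dump file: example.json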


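EDIT (thanks to @PIERPY's great guidance; the full process is documented here in case it helps others):


I successfully installed Chocolatey NuGet from an Admin CMD in order to install jq 1.5 with chocolatey install jq, as required by Download jq - Windows

My Chocolatey NuGet installation output: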

    Microsoft Windows [Version 10.0.19042.867]
(c) 2020 Microsoft Corporation. All rights reserved.
C:\WINDOWS\system32>@"%SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe" -NoProfile -InputFormat None -ExecutionPolicy Bypass -Command "iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))" && SET "PATH=%PATH%;%ALLUSERSPROFILE%\chocolatey\bin"                                                         
Forcing web requests to allow TLS v1.2 (Required for requests to Chocolatey.org)                                        
Getting latest version of the Chocolatey package for download.                                                          
Not using proxy.
Getting Chocolatey from https://community.chocolatey.org/api/v2/package/chocolatey/0.10.15.
Downloading https://community.chocolatey.org/api/v2/package/chocolatey/0.10.15 to C:\Users\###\AppData\Local\Temp\chocolatey\chocoInstall\chocolatey.zip
Not using proxy.
Extracting C:\Users\###\AppData\Local\Temp\chocolatey\chocoInstall\chocolatey.zip to C:\Users\###\AppData\Local\Temp\chocolatey\chocoInstall
Installing Chocolatey on the local machine
Creating ChocolateyInstall as an environment variable (targeting 'Machine')
  Setting ChocolateyInstall to 'C:\ProgramData\chocolatey'
WARNING: It's very likely you will need to close and reopen your shell
  before you can use choco.
Restricting write permissions to Administrators
We are setting up the Chocolatey package repository.
The packages themselves go to 'C:\ProgramData\chocolatey\lib'
  (i.e. C:\ProgramData\chocolatey\lib\yourPackageName).
A shim file for the command line goes to 'C:\ProgramData\chocolatey\bin'
  and points to an executable in 'C:\ProgramData\chocolatey\lib\yourPackageName'.

Creating Chocolatey folders if they do not already exist.

WARNING: You can safely ignore errors related to missing log files when
  upgrading from a version of Chocolatey less than 0.9.9.
  'Batch file could not be found' is also safe to ignore.
  'The system cannot find the file specified' - also safe.
chocolatey.nupkg file not installed in lib.
 Attempting to locate it from bootstrapper.
PATH environment variable does not have C:\ProgramData\chocolatey\bin in it. Adding...
WARNING: Not setting tab completion: Profile file does not exist at 'C:\Users\###\Documents\WindowsPowerShell\Microsoft.PowerShell_profile.ps1'.
Chocolatey (choco.exe) is now ready.
You can call choco from anywhere, command line or powershell by typing choco.
Run choco /? for a list of functions.
You may need to shut down and restart powershell and/or consoles
 first prior to using choco.
Ensuring Chocolatey commands are on the path
Ensuring chocolatey.nupkg is in the lib folder

C:\WINDOWS\system32>

Then I ran chocolatey install jq, which installed successfully:

My jq installation output:

    C:\WINDOWS\system32>chocolatey install jq
Chocolatey v0.10.15
Installing the following packages:
jq
By installing you accept licenses for the packages.
Progress: Downloading jq 1.6... 100%

jq v1.6 [Approved]
jq package files install completed. Performing other installation steps.
The package jq wants to run 'chocolateyinstall.ps1'.
Note: If you don't run this script, the installation will fail.
Note: To confirm automatically next time, use '-y' or consider:
choco feature enable -n allowGlobalConfirmation
Do you want to run the script?([Y]es/[A]ll - yes to all/[N]o/[P]rint): Y

Downloading jq 64 bit
  from 'https://github.com/stedolan/jq/releases/download/jq-1.6/jq-win64.exe'
Progress: 100% - Completed download of C:\ProgramData\chocolatey\lib\jq\tools\jq.exe (3.36 MB).
Download of jq.exe (3.36 MB) completed.
Hashes match.
C:\ProgramData\chocolatey\lib\jq\tools\jq.exe
 ShimGen has successfully created a shim for jq.exe
 The install of jq was successful.
  Software install location not explicitly set, could be in package or
  default install location if installer.

Chocolatey installed 1/1 packages.
 See the log for details (C:\ProgramData\chocolatey\logs\chocolatey.log).

Then I ran your youtube-dl command, @pierpy:

youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq '{"date": .upload_date,"title": .title,"URL": .url,"duration": .duration}'

and got a syntax error, with this output:

    Microsoft Windows [Version 10.0.19042.867]
(c) 2020 Microsoft Corporation. All rights reserved.

C:\Users\###>cd documents

C:\Users\###\Documents>cd youtube-dl

C:\Users\###\Documents\youtube-dl>youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq '{"date": .upload_date,"title": .title,"URL": .url,"duration": .duration}'
jq: error: syntax error, unexpected INVALID_CHARACTER, expecting $end (Windows cmd shell quoting issues?) at <top-level>, line 1:
'{date:
jq: 1 compile error
Traceback (most recent call last):
  File "__main__.py", line 19, in <module>
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\__init__.py", line 475, in main
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\__init__.py", line 465, in _real_main
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 2060, in download
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 799, in extract_info
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 806, in wrapper
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 838, in __extract_info
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 924, in process_ie_result
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 1058, in __process_playlist
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 806, in wrapper
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 1068, in __process_iterable_entry
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 910, in process_ie_result
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 872, in process_ie_result
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 1683, in process_video_result
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 1793, in process_info
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 1765, in __forced_printings
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 520, in to_stdout
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 509, in _write_string
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\utils.py", line 3180, in write_string
OSError: [Errno 22] Invalid argument

C:\Users\###\Documents\youtube-dl>

Then I googled the error

jq: error: syntax error, unexpected INVALID_CHARACTER, expecting $end (Windows cmd shell quoting issues?)

and found some insight in this suggestion:

It's all about the quoting

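Then I adjusted your youtube-dl command accordingly, @pierpy, changing the single quotes to double quotes (the Windows cmd shell does not treat single quotes as quoting characters):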

youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq "{"date": .upload_date,"title": .title,"URL": .url,"duration": .duration}"

Now it outputs the required data (upload dates, titles, URLs, durations).

Final output:

C:\Users\###\Documents\youtube-dl>youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq "{"date": .upload_date,"title": .title,"URL": .url,"duration": .duration}"
{
  "date": "20150717",
  "title": "3.1: Flow (setup and draw) - Processing Tutorial",
  "URL": "https://r1---sn-n0ogpnx-b85s.googlevideo.com/videoplayback?expire=1617730292&ei=lEZsYKDoEZmAp-oP3ayk8AI&ip=188.154.162.181&id=o-AHFxnOR5c5xqmgtu1JG4FbL6lJW0gz1pJQN77cr2-27T&itag=22&source=youtube&requiressl=yes&mh=m6&mm=31%2C29&mn=sn-n0ogpnx-b85s%2Csn-1gieen7e&ms=au%2Crdu&mv=m&mvi=1&pl=23&initcwndbps=1578750&vprv=1&mime=video%2Fmp4&ns=r3pR-nwt6FkDQa33iQQu-qgF&ratebypass=yes&dur=944.007&lmt=1607684088067796&mt=1617708538&fvip=5&fexp=24001373%2C24007246&beids=9466585&c=WEB&txp=5432434&n=3P6HQoLfY8ktFLG5&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cns%2Cratebypass%2Cdur%2Clmt&sig=AOq0QJ8wRgIhAMiNOv8QDjfsn7yxicEOtSjcEYjZlX3CfrI8D-HGBd63AiEA4E6rKv_kYti6rAeieJzPAdTYjoh05Az_11Kcxt-0jBg%3D&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRAIgD43F71OxMExfQyN9FeNWfZX_aiGAD3SKlKOLNR14NT8CICEuD_Ry0oymKZmFfHuP4F6v9MKCrmRI0x27sLG8fvyG",
  "duration": 944
}
{
  "date": "20150717",
  "title": "3.2: Built-in Variables (mouseX, mouseY) - Processing Tutorial",
  "URL": "https://r4---sn-n0ogpnx-b85l.googlevideo.com/videoplayback?expire=1617730293&ei=lEZsYMO2OczSWaPiueAC&ip=188.154.162.181&id=o-ANuT73vsKQLvQqynOeh00stVP-zqbq3x-iUrdDiYwg8E&itag=22&source=youtube&requiressl=yes&mh=kE&mm=31%2C29&mn=sn-n0ogpnx-b85l%2Csn-1gieen7e&ms=au%2Crdu&mv=m&mvi=4&pl=23&initcwndbps=1617500&vprv=1&mime=video%2Fmp4&ns=tPtC_l82gq-yi-rk_oQXatAF&cnr=14&ratebypass=yes&dur=814.207&lmt=1551720899437893&mt=1617708538&fvip=5&fexp=24001373%2C24007246&beids=9466585&c=WEB&txp=5432432&n=LhJHXWU8TGNOrD9u&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cns%2Ccnr%2Cratebypass%2Cdur%2Clmt&sig=AOq0QJ8wRAIgSHTlBPN0j49hoB02SYDeF3-9fe1iSz1KRiv9iFy8nj0CIHEafdAOBefsos8kO5FGhDljsKpOV7ZQ9dY1BEzQQ0n0&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRgIhAJkd-9posqapJekca_35YNG0g3nLgxTfW06EqRM-a3wDAiEApSrsS5wPlMPXjlI_bvOh53cjxlrHfNSKD4XbhyDyZ6w%3D",
  "duration": 815
}
{
  "date": "20150717",
  "title": "3.3: Events (mousePressed, keyPressed) - Processing Tutorial",
  "URL": "https://r4---sn-n0ogpnx-b85l.googlevideo.com/videoplayback?expire=1617730293&ei=lUZsYK6WJ4TeWaeflbgF&ip=188.154.162.181&id=o-AD1WgS46WiFogy00v3aHRp6aZXkd_ACN-_m76lPoQvA8&itag=22&source=youtube&requiressl=yes&mh=it&mm=31%2C29&mn=sn-n0ogpnx-b85l%2Csn-1gieen7e&ms=au%2Crdu&mv=m&mvi=4&pl=23&initcwndbps=1617500&vprv=1&mime=video%2Fmp4&ns=AlyS4uv2BH5ENfp_nP53I-sF&cnr=14&ratebypass=yes&dur=441.225&lmt=1472343659978757&mt=1617708538&fvip=4&fexp=24001373%2C24007246&beids=9466585&c=WEB&n=np6rmmeSKhYEvG1K&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cns%2Ccnr%2Cratebypass%2Cdur%2Clmt&sig=AOq0QJ8wRgIhAIRmvxmY-VidN3LPhnzCNQ2TLsUB_7i1yU0QOMBVUS6AAiEAm9DE-Kk6cCNb8FC0we4c2O8299n2_2jGnQfzYzz0igo%3D&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRQIgZzrGEwMcb0Vrj9FleanW2apPMu_55OdH2SRdw66DQ1QCIQCDsAz7X5RxczKtWzokBhyUNcyXLXeZF-ENufpjA0BP2Q%3D%3D",
  "duration": 442
}

C:\Users\###\Documents\youtube-dl>
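
A note on why the double-quoted form works: most likely the inner double quotes are stripped when the command line is split into arguments, so jq receives {date: .upload_date,...} with bare keys, which it accepts. To keep the inner quotes intact, they can be escaped with backslashes instead. The sketch below is only an illustration of that escaping; it relies on subprocess.list2cmdline, an undocumented helper in CPython's standard library, so treat its use here as an assumption rather than a supported API. The function name windows_command_line is made up for this example.

```python
import subprocess

# The jq filter exactly as it should reach jq, inner double quotes included.
JQ_FILTER = '{"date": .upload_date,"title": .title,"URL": .url,"duration": .duration}'


def windows_command_line(args):
    """Join arguments into one string using Windows C runtime quoting rules.

    Assumption: subprocess.list2cmdline is an undocumented CPython helper.
    """
    return subprocess.list2cmdline(args)


if __name__ == "__main__":
    # Prints the jq part of the pipeline with the inner quotes escaped as \"
    print(windows_command_line(["jq", JQ_FILTER]))
```

The printed string is the jq half of the pipeline in a form that can be pasted after the | in cmd. Passing an argument list to subprocess.run from Python avoids shell quoting altogether, which may be simpler if the whole pipeline is scripted.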

One last issue:


The URLs obtained are not the standard video URLs. Why not?

Youtube-dl's GitHub Doc Used as reference states:

url (string): Video URL

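How can I retrieve the standard YouTube video URL?

Answer to the last issue:

I just looked at the example.json file generated yesterday and found that the standard YouTube video URL is stored under webpage_url rather than url.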


FINAL YOUTUBE-DL OUTPUT:


C:\Users\###\Documents\youtube-dl>youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq "{"date": .upload_date,"title": .title,"URL": .webpage_url,"duration": .duration}"
{
  "date": "20150717",
  "title": "3.1: Flow (setup and draw) - Processing Tutorial",
  "URL": "https://www.youtube.com/watch?v=o8dffrZ86gs",
  "duration": 944
}
{
  "date": "20150717",
  "title": "3.2: Built-in Variables (mouseX, mouseY) - Processing Tutorial",
  "URL": "https://www.youtube.com/watch?v=ibW4oA7-n8I",
  "duration": 815
}
{
  "date": "20150717",
  "title": "3.3: Events (mousePressed, keyPressed) - Processing Tutorial",
  "URL": "https://www.youtube.com/watch?v=UvSjtiW-RH8",
  "duration": 442
}

C:\Users\###\Documents\youtube-dl>
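
In this output the date is a YYYYMMDD string and the duration is a number of seconds. If more readable values are wanted, a minimal Python sketch like the following could post-process each record. The function names are hypothetical and the sample values are copied from the first record above.

```python
from datetime import datetime, timedelta


def format_upload_date(raw):
    """Turn a 'YYYYMMDD' string into ISO 'YYYY-MM-DD'."""
    return datetime.strptime(raw, "%Y%m%d").date().isoformat()


def format_duration(seconds):
    """Turn a duration in seconds into 'H:MM:SS'."""
    return str(timedelta(seconds=int(seconds)))


if __name__ == "__main__":
    record = {"date": "20150717", "duration": 944}  # from the output above
    print(format_upload_date(record["date"]))
    print(format_duration(record["duration"]))
```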

To get the final output in a JSON file:

youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq "{"date": .upload_date,"title": .title,"URL": .webpage_url,"duration": .duration}" > example.json
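
Note that the redirected file holds one JSON object after another rather than a single JSON array, so a strict JSON parser will stop after the first object. A possible way to load such a file in Python is sketched below; it assumes the file is UTF-8 text, and the function name parse_concatenated_json is made up for this example.

```python
import json


def parse_concatenated_json(text):
    """Parse several JSON values written back to back into a list."""
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        obj, pos = decoder.raw_decode(text, pos)
        records.append(obj)
    return records


if __name__ == "__main__":
    # Two objects shaped like the jq output, written one after the other.
    sample = '{\n  "title": "first",\n  "duration": 944\n}\n{\n  "title": "second",\n  "duration": 815\n}\n'
    # Re-serialize the parsed objects as one JSON array.
    print(json.dumps(parse_concatenated_json(sample), indent=2))
```

To use it on the real file, read example.json into a string (for instance with open("example.json", encoding="utf-8").read()) and pass that string to the function.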

You need to filter the output with a handy tool such as jq.
Paste this command line:
youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq '{"date": .upload_date,"title": .title,"URL": .url,"duration": .duration}'
You can get jq from https://stedolan.github.io/jq/download/
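
If installing jq is not an option, the same filtering can be sketched in Python using only the standard library. This is a rough equivalent, not part of youtube-dl: it assumes --print-json emits one JSON object per line, the names FIELDS, pick_fields and filter_lines are invented for this example, and the sample line stands in for real youtube-dl output.

```python
import json

# Output key -> key in the youtube-dl JSON (same mapping as the jq filter).
FIELDS = {
    "date": "upload_date",
    "title": "title",
    "URL": "url",
    "duration": "duration",
}


def pick_fields(info):
    """Keep only the wanted keys; missing keys become None (like jq's null)."""
    return {out_key: info.get(src_key) for out_key, src_key in FIELDS.items()}


def filter_lines(lines):
    """Yield one reduced record per non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if line:
            yield pick_fields(json.loads(line))


if __name__ == "__main__":
    # Stand-in for youtube-dl output; the extra key is dropped by the filter.
    sample_lines = [
        '{"upload_date": "20150717", "title": "demo", "url": "https://example.invalid/v", "duration": 944, "extra": 1}',
    ]
    for record in filter_lines(sample_lines):
        print(json.dumps(record, ensure_ascii=False))
```

To run it on real data, pass sys.stdin to filter_lines instead of sample_lines and pipe the youtube-dl command into the script.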

UPDATE:

The key "webpage_url" contains the standard YouTube URL, if you need it. For the complete list of available keys, run:
youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq keys
This lists all the key names present in the raw JSON.
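
For reference, a similar key listing can be sketched in Python for a single line of the --print-json output. The function name list_keys is made up, and the sample line is a shortened stand-in for the real JSON.

```python
import json


def list_keys(json_line):
    """Return the sorted key names of one JSON object, similar to jq's keys."""
    return sorted(json.loads(json_line).keys())


if __name__ == "__main__":
    # Shortened stand-in for one line of youtube-dl JSON output.
    sample = '{"webpage_url": "https://www.youtube.com/watch?v=o8dffrZ86gs", "upload_date": "20150717", "title": "demo", "duration": 944}'
    for key in list_keys(sample):
        print(key)
```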