Is there a way to get chapters from a YouTube video embedded via JavaScript?

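After including the YouTube iframe API script (https://www.youtube.com/iframe_api), you can put a YouTube video into a container (e.g. "container_id") and call methods such as "seekTo()" and "playVideo()" on it from the page itself, anywhere outside the iframe.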

var player = new YT.Player('container_id', {
    videoId: 'video_id'
});

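The video object can be accessed with YT.get("container_id"). Calling YT.get("container_id").seekTo(60) jumps to the 1-minute mark, but I can't seem to find a "chapters" object in it. "Chapters" are sections of the video separated by timestamps.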

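I was wondering whether there is a way to get them as an array, an object, or something similar, but I can't find anything like that in the object returned by YT.get("container_id").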

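The generated iframe does contain them as HTML tags, but without metadata such as "starts at" or "chapter title", and because of CORS shenanigans it isn't really accessible anyway. Does the YouTube iframe API send chapter data at all when it exists? (Many videos don't have any.)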

I don't think the API has any chapter functionality.

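The way chapters work is that they are derived from the timestamps put in the video description. So, although it is a workaround, if you really want the chapter information for a video, you can parse the description to get the timestamps and their names.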
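As a rough illustration of that workaround, the sketch below pulls timestamped lines out of a description string. It assumes each chapter sits on its own line in the form "timestamp title"; the function name parseChapters and the sample description are made up for the example, and real descriptions may need looser matching.

```javascript
// Hypothetical helper: extracts chapters from a video description string.
// Assumes one chapter per line, with the timestamp (m:ss, mm:ss or h:mm:ss)
// at the start of the line, followed by whitespace and the title.
function parseChapters(description) {
    const chapters = [];
    for (const line of description.split(/\r?\n/)) {
        const match = /^\s*(?:(\d+):)?(\d{1,2}):(\d{2})\s+(.+)$/.exec(line);
        if (match === null) continue; // not a chapter line
        // The hour group is optional, so default it to '0' when absent
        const [, hours = '0', minutes, seconds, title] = match;
        chapters.push({
            title: title.trim(),
            startSeconds: Number(hours) * 3600 + Number(minutes) * 60 + Number(seconds)
        });
    }
    return chapters;
}

// Made-up description text for illustration
const sampleDescription = [
    'My video about things',
    '0:00 Intro',
    '1:30 First topic',
    '1:02:03 Wrap-up'
].join('\n');

console.log(parseChapters(sampleDescription));
```

Storing the start time in seconds keeps the result directly usable with seekTo(), which takes a number of seconds.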

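I did a quick Ctrl+F on the API page to see whether the iframe API gives you access to the video's description, and that doesn't seem to be the case. I could be wrong, though, in which case please ignore the instructions below.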

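To get the description, if you have the video ID, you can use the YouTube Data API (a different API, so you may need another API key for it), specifically the list method from the Videos section. You can get the description by passing "snippet" as the value of the "part" parameter.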
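A minimal sketch of that request, under the assumption that you already have a Data API key: it only builds the videos.list URL and reads the description out of a response body, leaving the actual HTTP call to whichever client you prefer. The helper names buildVideosListUrl and extractDescription, the placeholder ID and key, and the sample response are invented for illustration.

```javascript
// Builds the request URL for the YouTube Data API v3 videos.list method.
// videoId and apiKey are placeholders supplied by the caller.
function buildVideosListUrl(videoId, apiKey) {
    const url = new URL('https://www.googleapis.com/youtube/v3/videos');
    url.searchParams.set('part', 'snippet'); // the "snippet" part includes the description
    url.searchParams.set('id', videoId);
    url.searchParams.set('key', apiKey);
    return url.toString();
}

// Pulls the description out of a parsed videos.list response body.
// Returns null when the response contains no matching video.
function extractDescription(responseBody) {
    const items = responseBody.items || [];
    return items.length > 0 ? items[0].snippet.description : null;
}

// Made-up response, trimmed to the fields used here
const sampleResponse = {
    items: [{ snippet: { description: '0:00 Intro\n1:30 First topic' } }]
};

console.log(buildVideosListUrl('VIDEO_ID', 'API_KEY'));
console.log(extractDescription(sampleResponse));
```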

I wrote some code that gets YouTube chapter data using JavaScript, regular expressions, and Node:

const axios = require('axios').default; // You have to install axios (npm install axios) in order to use this code
const youtubeApiKey = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'; // Your YouTube API Key here

// Parses one description line such as "1:25:39 - Chapter title".
// Returns null when the line has no timestamp followed by a title.
const parseChapterLine = (line) => {
    // A timestamp is m:ss, mm:ss or h:mm:ss; everything after it is the raw title
    const match = /(\d+(?::\d{1,2}){1,2})(.*)$/.exec(line);
    if (match === null) {
        return null;
    }
    const parts = match[1].split(':').map(Number);
    // Pad with a leading 0 when the timestamp has no hour part
    const [hour, min, sec] = parts.length === 3 ? parts : [0, ...parts];
    // The chapter title starts at the first letter or number after the timestamp
    const chapterTitle = match[2].replace(/^[^\p{L}\p{N}]+/u, '').trim();
    if (chapterTitle === '') {
        return null;
    }
    return { chapterTitle, chapterTimestamp: [hour, min, sec] };
};

const main = async (videoId) => {
    const videoDataResponse = await axios.get(
        `https://youtube.googleapis.com/youtube/v3/videos?part=snippet&id=${videoId}&key=${youtubeApiKey}`,
        { headers: { 'Accept': 'application/json' } }
    );
    const items = videoDataResponse.data.items;
    if (!items || items.length === 0) { // no video found for this ID
        console.log([]);
        return [];
    }
    // Video description as a plain string (not JSON.stringify'd, so line breaks stay real newlines)
    const description = items[0].snippet.description;

    // Split into lines and keep only the lines that parse as chapters
    const chaptersData = description
        .split(/\r?\n/)
        .map(parseChapterLine)
        .filter((chapter) => chapter !== null);

    console.log(chaptersData);
    // chaptersData contains the data we are looking for
    // chaptersData is an array of objects, in the format {chapterTitle: 'chapter title example', chapterTimestamp: [1, 25, 39]}
    // chapterTimestamp is in the format [hour, minute, second]
    return chaptersData;
}

// calls main with the ID of the video of interest
main('videoId'); // For example: main('R3WDe7byUXo');
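To tie this back to the embedded player from the question, here is a small sketch of how data in the chaptersData shape could be turned into seconds for seekTo(), and how the chapter for a given playback time could be looked up. The helper names timestampToSeconds and findChapterAt and the example data are made up; the player call is left as a comment because it only works in a browser with the iframe API loaded.

```javascript
// Converts a [hour, minute, second] triple into a number of seconds
function timestampToSeconds([hour, min, sec]) {
    return hour * 3600 + min * 60 + sec;
}

// Returns the chapter playing at `currentSeconds`, or null before the first chapter.
// Assumes `chapters` is sorted by start time, as in a typical description.
function findChapterAt(chapters, currentSeconds) {
    let current = null;
    for (const chapter of chapters) {
        if (timestampToSeconds(chapter.chapterTimestamp) <= currentSeconds) {
            current = chapter;
        }
    }
    return current;
}

// Made-up data in the same shape as chaptersData
const exampleChapters = [
    { chapterTitle: 'Intro', chapterTimestamp: [0, 0, 0] },
    { chapterTitle: 'First topic', chapterTimestamp: [0, 1, 30] },
    { chapterTitle: 'Wrap-up', chapterTimestamp: [1, 2, 3] }
];

console.log(timestampToSeconds(exampleChapters[2].chapterTimestamp)); // 3723
console.log(findChapterAt(exampleChapters, 100).chapterTitle); // First topic

// In the browser, with the player from the question (not run here):
// YT.get('container_id').seekTo(timestampToSeconds(exampleChapters[1].chapterTimestamp), true);
```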

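I wrote this code a year ago, when I was just starting to learn JavaScript, so I may have done some things in a more complicated way than necessary. Read the first and last comments in the code to see how to use it. If you have any questions or suggestions, let me know.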

Link for GitHub repository