How to accept the YouTube cookie consent with PowerShell

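For a few days now I have been unable to fetch any information from YouTube with my PowerShell script, because the cookie consent has to be accepted before the videos can be seen. Any ideas on how to accept the YouTube cookies with PowerShell? There is already a question about this, but it is not related to PowerShell.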

$HTML=Invoke-WebRequest -Uri https://www.youtube.com/user/YouTube/videos

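After some testing, I got it working.

Keep in mind, though, that the current process may fail in the future if Google changes the flow or the form data in the response.

Background

After our first call to YouTube we get redirected to consent.youtube.com and receive the following form content: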

    <form action="https://consent.youtube.com/s" method="POST" style="display:inline;">
        <input type="hidden" name="gl" value="DE">
        <input type="hidden" name="m" value="0">
        <input type="hidden" name="pc" value="yt">
        <input type="hidden" name="continue" value="https://www.youtube.com/user/YouTube/videos">
        <input type="hidden" name="ca" value="r">
        <input type="hidden" name="x" value="8">
        <input type="hidden" name="v" value="cb.20210329-17-p2.de+FX+873">
        <input type="hidden" name="t" value="ADw3F8i44JCpypLjx8SOx3tbsrxxS7ug:1617806186191">
        <input type="hidden" name="hl" value="de">
        <input type="hidden" name="src" value="1">
        <input type="hidden" name="uxe" value="2398321372">
        <input type="submit" value="Ich stimme zu" class="button" aria-label="In die Verwendung von Cookies und anderen Daten zu den beschriebenen Zwecken einwilligen" />
    </form>

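I assume that only the two attributes continue and v are important, but I simply send all attributes in my POST, just like my browser does.
v becomes the right-hand part of our final CONSENT cookie value, which gets a YES+ prefix.
For example, cb.20210329-17-p2.de+FX+873 in v becomes YES+cb.20210329-17-p2.de+FX+873 in the CONSENT cookie.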
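The relationship between v and the cookie value can be sketched as follows. This is only an illustration of the observation above, not a documented contract; the helper name Get-ExpectedConsentValue is hypothetical and exists only for this example.

```powershell
# Hypothetical helper (not part of any API): derive the CONSENT cookie value
# we expect from the "v" field of the consent form, based on the observed pattern.
function Get-ExpectedConsentValue {
    param(
        [Parameter(Mandatory)]
        [string]$FormVersion
    )
    'YES+' + $FormVersion
}

# "v" value taken from the sample form above
$expected = Get-ExpectedConsentValue -FormVersion 'cb.20210329-17-p2.de+FX+873'

# The check used later in the script is a plain wildcard match on the prefix
$hasYesPrefix = $expected -like 'YES+*'
```

If Google changes the prefix at some point, only the wildcard pattern would need to be adapted.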

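Unfortunately, our call to Invoke-WebRequest does not give us any predefined form properties. The property (Invoke-WebRequest abc).Form is simply NULL.

So we have to parse the specific form data from the response content, build a key=value body, and POST that body to the URL given in the action attribute.

The rest of the process is described in the code comments.

Code

This is the clean code without verbose output. Below you will find the same code with verbose output.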
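To make the parsing step easier to follow, here is a minimal offline sketch that works on a shortened copy of the form shown above instead of a live response. The sample markup and the variable names are assumptions for illustration; the regular expressions are simplified compared to the full script and assume double-quoted attributes.

```powershell
# Shortened copy of the consent form from above (only three hidden fields kept)
$html = @'
<form action="https://consent.youtube.com/s" method="POST" style="display:inline;">
    <input type="hidden" name="gl" value="DE">
    <input type="hidden" name="continue" value="https://www.youtube.com/user/YouTube/videos">
    <input type="hidden" name="v" value="cb.20210329-17-p2.de+FX+873">
    <input type="submit" value="Ich stimme zu" class="button" />
</form>
'@

# URL from the form's action attribute
$postUrl = [regex]::Match($html, '(?<=action=")[^"]+').Value

# Collect name/value pairs of all <input> elements that carry a name attribute.
# The submit button has no name, so it is skipped.
$postBody = [ordered]@{}
foreach ($tag in [regex]::Matches($html, '<input[^>]+>')) {
    $name  = [regex]::Match($tag.Value, '(?<=\bname=")[^"]+').Value
    $value = [regex]::Match($tag.Value, '(?<=\bvalue=")[^"]*').Value
    if ($name) {
        $postBody[$name] = $value
    }
}

# The URL-encoded key=value string that corresponds to this body
$encoded = ($postBody.GetEnumerator() | ForEach-Object {
    '{0}={1}' -f [uri]::EscapeDataString($_.Key), [uri]::EscapeDataString($_.Value)
}) -join '&'
```

In the real script the hashtable is passed directly to -Body, and Invoke-WebRequest takes care of the encoding for a POST, so building $encoded by hand is only needed to see what goes over the wire.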

$youtubeUrl    = 'https://www.youtube.com/user/YouTube/videos'
$consentDomain = 'consent.youtube.com'
$webUserAgent  = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.4976.0 Safari/537.36 Edg/102.0.1227.0'


# at first we disable the annoying (at least for this process) and in PS5.1- performance affecting progress bar for web requests
$currentProgressPreference = $ProgressPreference
$ProgressPreference        = [System.Management.Automation.ActionPreference]::SilentlyContinue

try {
    # in our first GET call we should get a response from consent.youtube.com.
    # we save the session including all cookies in variable $youtubeSession.
    $response = Invoke-WebRequest -Uri $youtubeUrl -UseBasicParsing -SessionVariable 'youtubeSession' -UserAgent $webUserAgent -ErrorAction Stop

    # using BaseResponse to figure out which host has responded
    if ($PSVersionTable.PSVersion.Major -gt 5) {
        # PS 6+ has other properties than PS5.1 and below
        $responseRequestUri = $response.BaseResponse.RequestMessage.RequestUri
    } else {
        $responseRequestUri = $response.BaseResponse.ResponseUri
    }

    if ($responseRequestUri.Host -eq $consentDomain) {
        # check if got redirected to "consent.youtube.com"

        # unfortunately the response object from "Invoke-WebRequest" does not provide any "Form" data as property,
        # so we have to parse it from the content. There are two <form..> nodes, but we only need the one for method "POST".
        $formContent = [regex]::Match(
            $response.Content,
            # we use lazy match, even if it's expensive when it comes to performance.
            ('{0}.+?(?:{1}.+?{2}|{2}.+?{1}).+?{3}' -f
                [regex]::Escape('<form'),
                [regex]::Escape("action=`"https://$consentDomain"),
                [regex]::Escape('method="POST"'),
                [regex]::Escape('</form>')
            )
        )

        # getting the POST URL using our parsed form data. As of now it should parse: "https://consent.youtube.com/s"
        $postUrl = [regex]::Match($formContent, '(?<=action\=\")[^\"]+(?=\")').Value

        # build POST body as hashtable using our parsed form data.
        # only elements with a "name" attribute are relevant and we only need the plain names and values
        $postBody = @{}
        [regex]::Matches($formContent -replace '\r?\n', '<input[^>]+>').Value | ForEach-Object {
            $name  = [regex]::Match($_, '(?<=name\=\")[^\"]+(?=\")').Value
            $value = [regex]::Match($_, '(?<=value\=\")[^\"]+(?=\")').Value

            if (![string]::IsNullOrWhiteSpace($name)) {
                $postBody[[string]$name] = [string]$value
            }
        }

        # now let's try to get an accepted CONSENT cookie by POSTing our hashtable to the parsed URL and override the sessionVariable again.
        # Using the previous session variable here would return a HTTP error 400 ("method not allowed")
        $response = Invoke-WebRequest -Uri $postUrl -Method Post -UseBasicParsing -SessionVariable 'youtubeSession' -UserAgent $webUserAgent -Body $postBody -ErrorAction Stop

        # get all the cookies for domain '.youtube.com'
        $youtubeCookies = [object[]]$youtubeSession.Cookies.GetCookies('https://youtube.com')

        # check if we got the relevant cookie "CONSENT" with a "yes+" prefix in its value
        # if the value changes in future, we have to adapt the condition here accordingly
        $consentCookie  = [object[]]($youtubeCookies | Where-Object { $_.Name -eq 'CONSENT' })
        if (!$consentCookie.Count) {
            Write-Error -Message 'The cookie "CONSENT" is missing in our session after our POST! Please check.' -ErrorAction Stop

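        # wrap the value(s) in @() so that -like filters an array; on a single string it would return a boolean, whose Count is always 1
        } elseif (!(@($consentCookie.Value) -like 'YES+*').Count) {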
            Write-Error -Message ("The value of cookie ""CONSENT"" (""$($consentCookie.Value -join '" OR "')"") does not start with ""YES+"", but maybe it's intended and the condition has to be adapted!") -ErrorAction Stop
        }
    }

} finally {
    # set the progress preference back to the previous value
    $ProgressPreference = $currentProgressPreference
}

# From here on use the argument '-WebSession $youtubeSession' with each 'Invoke-WebRequest'
# e.g.:     Invoke-WebRequest $youtubeUrl -WebSession $youtubeSession -UseBasicParsing
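
The cookie check at the end of the script can also be tried without any network access. The following is a minimal sketch, assuming a hand-built session that holds an illustrative CONSENT cookie (the value is the example from the background section, not a real token). It shows why the values are wrapped in @() before applying -like: with a single cookie, member enumeration yields a plain string, and -like on a plain string returns a boolean instead of the matching elements.

```powershell
# Build a session by hand and add an illustrative CONSENT cookie
$session = [Microsoft.PowerShell.Commands.WebRequestSession]::new()
$cookie  = [System.Net.Cookie]::new('CONSENT', 'YES+cb.20210329-17-p2.de+FX+873', '/', '.youtube.com')
$session.Cookies.Add($cookie)

# Same lookup as in the script: all cookies that apply to youtube.com
$consent = @($session.Cookies.GetCookies('https://youtube.com') |
    Where-Object { $_.Name -eq 'CONSENT' })

# @(...) keeps -like in filter mode, so the result holds only the matching values
$accepted = @(@($consent.Value) -like 'YES+*')

if (-not $consent.Count) {
    Write-Warning 'No CONSENT cookie in the session.'
} elseif (-not $accepted.Count) {
    Write-Warning 'CONSENT cookie found, but its value does not start with "YES+".'
}
```
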

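Same code as above, but with verbose output statements

The process is the same as above. The verbose output statements are only there so that you or anyone else can debug it more easily if something changes.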

$youtubeUrl    = 'https://www.youtube.com/user/YouTube/videos'
$consentDomain = 'consent.youtube.com'
$webUserAgent  = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.4976.0 Safari/537.36 Edg/102.0.1227.0'


# remove this verbose preference definition or set it to "SilentlyContinue" to suppress verbose output
$currentVerbosePreference = $VerbosePreference
$VerbosePreference        = [System.Management.Automation.ActionPreference]::Continue


# at first we disable the annoying (at least for this process) and in PS5.1- performance affecting progress bar for web requests
$currentProgressPreference = $ProgressPreference
$ProgressPreference        = [System.Management.Automation.ActionPreference]::SilentlyContinue


#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#
#region THIS REGION CAN BE REMOVED.
#------------------------------------------------------------------------------------------------------------------------
Write-Verbose "`r`n>> Let's start with a GET to:`r`n`t$youtubeUrl"
#------------------------------------------------------------------------------------------------------------------------
#endregion
#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#


try {
    # in our first GET call we should get a response from consent.youtube.com.
    # we save the session including all cookies in variable $youtubeSession.
    $response = Invoke-WebRequest -Uri $youtubeUrl -UseBasicParsing -SessionVariable 'youtubeSession' -UserAgent $webUserAgent -ErrorAction Stop

    # using BaseResponse to figure out which host has responded
    if ($PSVersionTable.PSVersion.Major -gt 5) {
        # PS 6+ has other properties than PS5.1 and below
        $responseRequestUri = $response.BaseResponse.RequestMessage.RequestUri
    } else {
        $responseRequestUri = $response.BaseResponse.ResponseUri
    }


    #↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#
    #region THIS REGION CAN BE REMOVED.
    #------------------------------------------------------------------------------------------------------------------------
    Write-Verbose "`r`n>> We got a response from:`r`n`t$responseRequestUri"
    Write-Verbose "`r`n>> Let's check our cookies. We should see a cookie 'CONSENT' which is pending:"
    Write-Verbose ($youtubeSession.Cookies.GetCookies('https://youtube.com') | Format-Table Domain, Name, Value | Out-String)
    #------------------------------------------------------------------------------------------------------------------------
    #endregion
    #↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#


    # check if got redirected to consent domain
    if ($responseRequestUri.Host -eq $consentDomain) {
        #↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#
        #region THIS REGION CAN BE REMOVED.
        #------------------------------------------------------------------------------------------------------------------------
        Write-Verbose "`r`n>> Let's parse the required form data and post it to the correct URL"
        #------------------------------------------------------------------------------------------------------------------------
        #endregion
        #↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#


        # unfortunately the response object from "Invoke-WebRequest" does not provide any "Form" data as property,
        # so we have to parse it from the content. There are two <form..> nodes, but we only need the one for method "POST".
        $formContent = [regex]::Match(
            $response.Content,
            # we use lazy match, even if it's expensive when it comes to performance.
            ('{0}.+?(?:{1}.+?{2}|{2}.+?{1}).+?{3}' -f
                [regex]::Escape('<form'),
                [regex]::Escape("action=`"https://$consentDomain"),
                [regex]::Escape('method="POST"'),
                [regex]::Escape('</form>')
            )
        )

        # getting the POST URL using our parsed form data. As of now it should parse: "https://consent.youtube.com/s"
        $postUrl = [regex]::Match($formContent, '(?<=action\=\")[^\"]+(?=\")').Value

        # build POST body as hashtable using our parsed form data.
        # only elements with a "name" attribute are relevant and we only need the plain names and values
        $postBody = @{}
        [regex]::Matches($formContent -replace '\r?\n', '<input[^>]+>').Value | ForEach-Object {
            $name  = [regex]::Match($_, '(?<=name\=\")[^\"]+(?=\")').Value
            $value = [regex]::Match($_, '(?<=value\=\")[^\"]+(?=\")').Value

            if (![string]::IsNullOrWhiteSpace($name)) {
                $postBody[[string]$name] = [string]$value
            }
        }


        #↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#
        #region THIS REGION CAN BE REMOVED.
        #------------------------------------------------------------------------------------------------------------------------
        Write-Verbose "`r`n>> Now let's post the following body to:`r`n`t$postUrl"
        Write-Verbose ($postBody | Out-String)
        #------------------------------------------------------------------------------------------------------------------------
        #endregion
        #↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#


        # now let's try to get an accepted CONSENT cookie by POSTing our hashtable to the parsed URL and override the sessionVariable again.
        # Using the previous session variable here would return a HTTP error 400 ("method not allowed")
        $response = Invoke-WebRequest -Uri $postUrl -Method Post -UseBasicParsing -SessionVariable 'youtubeSession' -UserAgent $webUserAgent -Body $postBody -ErrorAction Stop

        # get all the cookies for domain '.youtube.com'
        $youtubeCookies = [object[]]$youtubeSession.Cookies.GetCookies('https://youtube.com')

        # check if we got the relevant cookie "CONSENT" with a "yes+" prefix in its value
        # if the value changes in future, we have to adapt the condition here accordingly
        $consentCookie  = [object[]]($youtubeCookies | Where-Object { $_.Name -eq 'CONSENT' })
        if (!$consentCookie.Count) {
            Write-Error -Message 'The cookie "CONSENT" is missing in our session after our POST! Please check.' -ErrorAction Stop

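        # wrap the value(s) in @() so that -like filters an array; on a single string it would return a boolean, whose Count is always 1
        } elseif (!(@($consentCookie.Value) -like 'YES+*').Count) {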
            Write-Error -Message ("The value of cookie ""CONSENT"" (""$($consentCookie.Value -join '" OR "')"") does not start with ""YES+"", but maybe it's intended and the condition has to be adapted!") -ErrorAction Stop
        }


        #↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#
        #region THIS REGION CAN BE REMOVED. Even the $responseRequestUri part. Just for Verbose output
        #------------------------------------------------------------------------------------------------------------------------
        # using BaseResponse to figure out which host has responded
        if ($PSVersionTable.PSVersion.Major -gt 5) {
            # PS 6+ has other properties than PS5.1 and below
            $responseRequestUri = $response.BaseResponse.RequestMessage.RequestUri
        } else {
            $responseRequestUri = $response.BaseResponse.ResponseUri
        }
        Write-Verbose "`r`n>> This time we got a response from:`r`n`t$responseRequestUri"
        #------------------------------------------------------------------------------------------------------------------------
        #endregion
        #↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#
    }



    #↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#↓#
    #region THIS REGION CAN BE REMOVED. JUST A TEST. Always use:   -WebSession $youtubeSession
    #------------------------------------------------------------------------------------------------------------------------
    Write-Verbose "`r`n>> Let's check our cookies again:"
    Write-Verbose ($youtubeSession.Cookies.GetCookies('https://youtube.com') | Format-Table Domain, Name, Value | Out-String)

    Write-Verbose "`r`n>> Let's check a video from github using our new session variable.`r`n`tVideo: https://www.youtube.com/watch?v=w3jLJU7DT5E"
    $test = Invoke-WebRequest 'https://www.youtube.com/watch?v=w3jLJU7DT5E' -UseBasicParsing -WebSession $youtubeSession

    Write-Verbose "`r`n>> And again, let's check our cookies:"
    Write-Verbose ($youtubeSession.Cookies.GetCookies('https://youtube.com') | Format-Table Domain, Name, Value | Out-String)

    Write-Verbose "`r`n>> And our content. But please press [Enter] first."
    if ($VerbosePreference -eq [System.Management.Automation.ActionPreference]::Continue) {
        Pause
        $test.Content
    }
    #------------------------------------------------------------------------------------------------------------------------
    #endregion
    #↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#↑#


} finally {
    # set the progress preference back to the previous value
    $ProgressPreference = $currentProgressPreference

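    # set the verbose preference back to the previous value, in case it was used in this script
    # can be removed if not used
    # compare against $null: "SilentlyContinue" has the enum value 0 and would evaluate to $false in a plain if ()
    if ($null -ne $currentVerbosePreference) {
        $VerbosePreference = $currentVerbosePreference
    }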
}

# From here on use the argument '-WebSession $youtubeSession' with each 'Invoke-WebRequest'
# e.g.:     Invoke-WebRequest $youtubeUrl -WebSession $youtubeSession -UseBasicParsing

Finally

Once you have the correct CONSENT cookie, you can fetch any page with Invoke-WebRequest by passing the argument -WebSession $youtubeSession, e.g.:

Invoke-WebRequest 'https://www.youtube.com/watch?v=w3jLJU7DT5E' -WebSession $youtubeSession -UseBasicParsing