Cannot sign on to usgs.gov using curl to store session cookies

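I am trying to sign on to ers.cr.usgs.gov using curl. I have tried many different methods over the past few days, but I can never seem to get the second part of the session cookie that the response sends. I have also tried saving the POST as a curl request directly from the Firefox developer tools and replacing csrf_token and __ncforminfo with the correct values, but that did not work either. Below I include the curl command, the log output, and some screenshots of a successful login. Any help would be greatly appreciated. Thanks.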

    curl -o temp.html "$LOGIN"
    TOKEN=$(sed -n 's/.*name="csrf_token"\s\+value="\([^"]\+\).*/\1/p' temp.html)
    ENCODED_TOKEN=$(urlencode "$TOKEN")
    FORMINFO=$(sed -n 's/.*name="__ncforminfo"\s\+value="\([^"]\+\).*/\1/p' temp.html)
    ENCODED_FORMINFO=$(urlencode "$FORMINFO")
    curl --verbose \
          --cookie-jar cookies.txt \
          -H "Host: ers.cr.usgs.gov" \
          -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0" \
          -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" \
          -H "Accept-Language: en-US,en;q=0.5" \
          -H "Accept-Encoding: gzip, deflate, br" \
          -H "Content-Type: application/x-www-form-urlencoded" \
          -H "Origin: https://ers.cr.usgs.gov" \
          -H "Connection: keep-alive" \
          -H "Referer: https://ers.cr.usgs.gov/login/" \
          -H "Upgrade-Insecure-Requests: 1" \
          --data "username=$USERNAME&password=$PASSWORD&csrf_token=$ENCODED_TOKEN&__ncforminfo=$ENCODED_FORMINFO" https://ers.cr.usgs.gov/login/
*   Trying 2001:49c8:4000:122c::7...                                                                                                                                        
* TCP_NODELAY set                                                                                                                                                           
*   Trying 152.61.136.7...                                                                                                                                                  
* TCP_NODELAY set                                                                                                                                                           
* Connected to ers.cr.usgs.gov (152.61.136.7) port 443 (#0)                                                                                                                 
* ALPN, offering h2                                                                                                                                                         
* ALPN, offering http/1.1                                                                                                                                                   
* successfully set certificate verify locations:                                                                                                                            
*   CAfile: /etc/ssl/certs/ca-certificates.crt                                                                                                                              
  CApath: /etc/ssl/certs                                                                                                                                                    
* TLSv1.3 (OUT), TLS handshake, Client hello (1):                                                                                                                           
* TLSv1.3 (IN), TLS handshake, Server hello (2):                                                                                                                            
* TLSv1.2 (IN), TLS handshake, Certificate (11):                                                                                                                            
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):                                                                                                                    
* TLSv1.2 (IN), TLS handshake, Server finished (14):                                                                                                                        
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):                                                                                                                   
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):                                                                                                                       
* TLSv1.2 (OUT), TLS handshake, Finished (20):                                                                                                                              
* TLSv1.2 (IN), TLS handshake, Finished (20):                                                                                                                               
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384                                                                                                                
* ALPN, server did not agree to a protocol                                                                                                                                  
* Server certificate:                                                                                                                                                       
*  subject: C=US; ST=Virginia; L=Reston; O=U.S. Geological Survey; OU=USGS; CN=*.cr.usgs.gov                                                                                
*  start date: Apr  5 00:00:00 2019 GMT
*  expire date: Jun 10 12:00:00 2020 GMT
*  subjectAltName: host "ers.cr.usgs.gov" matched cert's "*.cr.usgs.gov"
*  issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=DigiCert SHA2 High Assurance Server CA
*  SSL certificate verify ok.
> POST /login/ HTTP/1.1
> Host: ers.cr.usgs.gov
> User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0
> Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> Accept-Language: en-US,en;q=0.5
> Accept-Encoding: gzip, deflate, br
> Content-Type: application/x-www-form-urlencoded
> Origin: https://ers.cr.usgs.gov
> Connection: keep-alive
> Referer: https://ers.cr.usgs.gov/login/
> Upgrade-Insecure-Requests: 1
> Content-Length: 196
>
* upload completely sent off: 196 out of 196 bytes
< HTTP/1.1 403 Forbidden
< Date: Tue, 05 Nov 2019 15:41:14 GMT
< X-Frame-Options: SAMEORIGIN
< Strict-Transport-Security: max-age=31536000; includeSubDomains
* cookie size: name/val 9 + 26 bytes
* cookie size: name/val 4 + 1 bytes
* cookie size: name/val 6 + 0 bytes
* cookie size: name/val 8 + 0 bytes
* Added cookie PHPSESSID="8rd75nf2ugv0ono7djbh29v30l" for domain ers.cr.usgs.gov, path /, expire 0
< Set-Cookie: PHPSESSID=8rd75nf2ugv0ono7djbh29v30l; path=/; secure; HttpOnly
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate
< Pragma: no-cache
* cookie size: name/val 9 + 26 bytes
* cookie size: name/val 4 + 1 bytes
* cookie size: name/val 6 + 0 bytes
* cookie size: name/val 8 + 0 bytes
* Added cookie PHPSESSID="8rd75nf2ugv0ono7djbh29v30l" for domain ers.cr.usgs.gov, path /, expire 0
< Set-Cookie: PHPSESSID=8rd75nf2ugv0ono7djbh29v30l; path=/; secure; HttpOnly
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate
< Pragma: no-cache
* cookie size: name/val 9 + 26 bytes
* cookie size: name/val 4 + 1 bytes
* cookie size: name/val 6 + 0 bytes
* cookie size: name/val 8 + 0 bytes
* Replaced cookie PHPSESSID="54vnism9of8r7n9da2pqn8rjla" for domain ers.cr.usgs.gov, path /, expire 0
< Set-Cookie: PHPSESSID=54vnism9of8r7n9da2pqn8rjla; path=/; secure; HttpOnly
< Content-Length: 0
< Keep-Alive: timeout=15, max=50
< Connection: Keep-Alive
< Content-Type: text/html; charset=UTF-8
< Strict-Transport-Security:  max-age=31536000
<
* Connection #0 to host ers.cr.usgs.gov left intact
# Netscape HTTP Cookie File
# https://curl.haxx.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.

#HttpOnly_ers.cr.usgs.gov       FALSE   /       TRUE    0       PHPSESSID       blahblahblahhereissessionstuff

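I found my own answer to this one. When I first fetched the page to get csrf_token and __ncforminfo, the page set a PHPSESSID cookie, and that cookie also has to be present for the SSO to send back the correct information. Changing my two curl requests as follows allowed the POST to succeed.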

    curl --cookie-jar cookies.txt -o temp.html "$LOGIN"

    curl \
    --cookie cookies.txt \
    --cookie-jar cookies.txt \
    ...
    --data "username=$USERNAME&password=$PASSWORD&csrf_token=$ENCODED_TOKEN&__ncforminfo=$ENCODED_FORMINFO" https://ers.cr.usgs.gov/login/
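For anyone who wants the two requests in one place, here is a rough sketch of the same flow as shell functions. The function names extract_field and usgs_login are made up for illustration and are not part of the original commands. The sketch assumes the login form still lists name before value in its hidden inputs, and it uses curl's --data-urlencode instead of a separate urlencode helper. The browser-like headers from the question are left out here, and I have not verified whether the server requires them, so treat this as a starting point rather than a confirmed recipe.

```shell
#!/bin/sh
# Sketch only: extract_field and usgs_login are hypothetical helper names.

# Print the value="..." of the first hidden input whose name matches $1.
# Assumes the attribute order name="..." value="..." seen in the question.
extract_field() {
    # $1 = field name, $2 = path to the saved HTML page
    sed -n 's/.*name="'"$1"'"[[:space:]]\{1,\}value="\([^"]\{1,\}\)".*/\1/p' "$2" | head -n 1
}

# Fetch the login form, then POST the credentials with the same cookie jar.
usgs_login() {
    # $1 = login URL, $2 = username, $3 = password
    local url="$1" user="$2" pass="$3" jar="cookies.txt" page token forminfo
    page="$(mktemp)" || return 1

    # Step 1: GET the form and keep the PHPSESSID cookie it sets.
    curl --silent --cookie-jar "$jar" -o "$page" "$url" || { rm -f "$page"; return 1; }
    token="$(extract_field csrf_token "$page")"
    forminfo="$(extract_field __ncforminfo "$page")"
    rm -f "$page"

    # Give up early if either hidden field could not be found.
    [ -n "$token" ] && [ -n "$forminfo" ] || return 1

    # Step 2: POST, sending the saved cookie back and storing any new ones.
    curl --silent --cookie "$jar" --cookie-jar "$jar" \
         --data-urlencode "username=$user" \
         --data-urlencode "password=$pass" \
         --data-urlencode "csrf_token=$token" \
         --data-urlencode "__ncforminfo=$forminfo" \
         -o /dev/null "$url"
}
```

With the functions sourced, the call would look like usgs_login "$LOGIN" "$USERNAME" "$PASSWORD". The key point is the same as in the answer above: the first request writes cookies.txt, and the second request both reads it (--cookie) and updates it (--cookie-jar).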