install.packages 使用随 conda 安装的 R 时无法使用代理

install.packages doesn't work with proxy when using R installed with conda

我正在 Linux 服务器 RHEL6 上工作,我安装了 anaconda。 我有以下设置

 conda-env version : 4.3.13
 conda-build version : 2.1.4
 python version : 2.7.13.final.0
 rpy2 : 2.8.5

我安装了 rpy2 以在 python

中使用 R
> R.home()
[1] "/anaconda2/envs/py27CCA/lib/R"
> R.version 
version.string R version 3.3.2 (2016-10-31) 

我按以下方式设置我的代理:

> Sys.getenv("https_proxy")
[1] "https://login:pwd@xxx.net:8080/"

但是下载R包不行

> options(internet.info = 0)
> install.packages("httr")

* error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol
....
Warning: unable to access index for repository https://stat.ethz.ch/CRAN/src/contrib:
  cannot download all files
Warning message:
package 'httr' is not available (for R version 3.3.2)

但是如果我使用完全相同的代理设置安装相同的独立 R 版本,它可以正常工作

> R.version 
version.string R version 3.3.2 (2016-10-31) 
> install.packages("httr")
...
** testing if installed package can be loaded
* DONE (httr)
Making 'packages.html' ... done
...

是什么造成了这个问题?我检查了openssl版本,我在2个环境中都有相同的版本! link 解释了此类代理问题的可能原因 link Whosebug discussion

如果我在 python

中执行此操作,我会遇到相同的问题和错误消息
>>> from rpy2.robjects.packages import importr
>>> utils = importr('utils')
>>> utils.install_packages('httr')

TL;DR:

而不是将 https_proxy 设置为...:[=​​32=]

https://login:pwd@xxx.net:8080/

...尝试将其设置为:

http://login:pwd@xxx.net:8080/

此外,这样做,如果有人嗅探您与代理服务器建立的初始连接的数据包,您将泄露您的凭据。进一步阅读以了解更多信息。


IMO,这个问题与康达无关。这是一个非常常见的错误,我发现它在互联网上非常普遍。

之所以会发生这种情况,是因为术语 "HTTPS Proxy" 存在混淆。

IIUC,这里是两个环境变量的意思:

http_proxy|HTTP_PROXY: The proxy server that you wish to use, for all your HTTP requests to the outside world.

https_proxy|HTTPS_PROXY: The proxy server that you wish to use, for all your HTTPS requests to the outside world.

http(s?)://proxy.mydomain.com:3128
 ^^^^^          ^^^^^         ^^^^   
   |              |             |    
scheme    proxy domain/IP   proxy port

现在,理想情况下,这些环境变量的值中指定的方案决定了客户端连接到代理服务器所应使用的协议。


让我们看一下 HTTPS 代理的定义。从 curl >= v7.53:

的手册页中窃取
An HTTPS proxy receives all transactions over an SSL/TLS connection.
Once a secure connection with the proxy is established, the user agent
uses the proxy as usual, including sending CONNECT requests to instruct
the proxy to establish a [usually secure] TCP tunnel with an origin
server. HTTPS proxies protect nearly all aspects of user-proxy
communications as opposed to HTTP proxies that receive all requests
(including CONNECT requests) in vulnerable clear text.

With HTTPS proxies, it is possible to have two concurrent _nested_
SSL/TLS sessions: the "outer" one between the user agent and the proxy
and the "inner" one between the user agent and the origin server
(through the proxy). This change adds supports for such nested sessions
as well.

让我们用例子来看看(curl >= v7.53):

在这里,我将使用不支持通过 SSL/TLS 进行客户端-代理连接的代理。

确保没有事先设置代理环境变量:

((curl-7_53_1))$ env | grep -i proxy
((curl-7_53_1))$ 

环境:http_proxy,outer_scheme:http,inner_scheme:http

((curl-7_53_1))$ http_proxy="http://proxy.mydomain.com:3128" ./src/curl -s -vvv http://whosebug.com -o /dev/null
* Rebuilt URL to: http://whosebug.com/
*   Trying 10.1.1.7...
* TCP_NODELAY set
* Connected to proxy.mydomain.com (10.1.1.7) port 3128 (#0)
> GET http://whosebug.com/ HTTP/1.1
> Host: whosebug.com
> User-Agent: curl/7.53.1-DEV
> Accept: */*
> Proxy-Connection: Keep-Alive
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Cache-Control: private
< Content-Type: text/html; charset=utf-8
< X-Frame-Options: SAMEORIGIN
< X-Request-Guid: 539728ee-a91d-4964-bc7e-1d21d91a6f1d
< Content-Length: 228257
< Accept-Ranges: bytes
< Date: Thu, 16 Mar 2017 05:19:31 GMT
< X-Served-By: cache-jfk8137-JFK
< X-Cache: MISS
< X-Cache-Hits: 0
< X-Timer: S1489641571.098286,VS0,VE7
< Vary: Fastly-SSL
< X-DNS-Prefetch-Control: off
< Set-Cookie: prov=b2e2dcb8-c5ff-21d9-5712-a0e012573aa6; domain=.whosebug.com; expires=Fri, 01-Jan-2055 00:00:00 GMT; path=/; HttpOnly
< X-Cache: MISS from proxy.mydomain.com
< X-Cache-Lookup: MISS from proxy.mydomain.com:3128
< Via: 1.1 varnish, 1.0 proxy.mydomain.com (squid)
* HTTP/1.0 connection set to keep alive!
< Connection: keep-alive
<
{ [2816 bytes data]
* Connection #0 to host proxy.mydomain.com left intact

环境:http_proxy,outer_scheme:https,inner_scheme:http

((curl-7_53_1))$ http_proxy="https://proxy.mydomain.com:3128" ./src/curl -s -vvv http://whosebug.com -o /dev/null
* Rebuilt URL to: http://whosebug.com/
*   Trying 10.1.1.7...
* TCP_NODELAY set
* Connected to proxy.mydomain.com (10.1.1.7) port 3128 (#0)
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
} [5 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol
* Closing connection 0

环境:https_proxy,outer_scheme:http,inner_scheme:https

((curl-7_53_1))$ https_proxy="http://proxy.mydomain.com:3128" ./src/curl -s -vvv https://whosebug.com -o /dev/null
* Rebuilt URL to: https://whosebug.com/
*   Trying 10.1.1.7...
* TCP_NODELAY set
* Connected to proxy.mydomain.com (10.1.1.7) port 3128 (#0)
* Establish HTTP proxy tunnel to whosebug.com:443
> CONNECT whosebug.com:443 HTTP/1.1
> Host: whosebug.com:443
> User-Agent: curl/7.53.1-DEV
> Proxy-Connection: Keep-Alive
>
< HTTP/1.0 200 Connection established
<
* Proxy replied OK to CONNECT request
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
} [5 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [108 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [3044 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [333 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [70 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
{ [1 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: C=US; ST=NY; L=New York; O=Stack Exchange, Inc.; CN=*.stackexchange.com
*  start date: May 21 00:00:00 2016 GMT
*  expire date: Aug 14 12:00:00 2019 GMT
*  subjectAltName: host "whosebug.com" matched cert's "whosebug.com"
*  issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=DigiCert SHA2 High Assurance Server CA
*  SSL certificate verify ok.
} [5 bytes data]
> GET / HTTP/1.1
> Host: whosebug.com
> User-Agent: curl/7.53.1-DEV
> Accept: */*
>
{ [5 bytes data]
< HTTP/1.1 200 OK
< Cache-Control: private
< Content-Type: text/html; charset=utf-8
< X-Frame-Options: SAMEORIGIN
< X-Request-Guid: 96f8fe3c-058b-479e-8ef2-db6d09f485d3
< Content-Length: 226580
< Accept-Ranges: bytes
< Date: Thu, 16 Mar 2017 05:20:39 GMT
< Via: 1.1 varnish
< Connection: keep-alive
< X-Served-By: cache-jfk8135-JFK
< X-Cache: MISS
< X-Cache-Hits: 0
< X-Timer: S1489641639.425108,VS0,VE9
< Vary: Fastly-SSL
< X-DNS-Prefetch-Control: off
< Set-Cookie: prov=f1a401f1-f1a0-5f09-66ca-9a792543ee82; domain=.whosebug.com; expires=Fri, 01-Jan-2055 00:00:00 GMT; path=/; HttpOnly
<
{ [2181 bytes data]
* Connection #0 to host proxy.mydomain.com left intact

环境:https_proxy,outer_scheme:https,inner_scheme:https

((curl-7_53_1))$ https_proxy="https://proxy.mydomain.com:3128" ./src/curl -s -vvv https://whosebug.com -o /dev/null
* Rebuilt URL to: https://whosebug.com/
*   Trying 10.1.1.7...
* TCP_NODELAY set
* Connected to proxy.mydomain.com (10.1.1.7) port 3128 (#0)
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
} [5 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol
* Closing connection 0

现在,我将展示支持通过 SSL/TLS 连接的代理的相同输出。对于 运行 本地 https 代理,我已经安装了 squid 版本 4.0.17。我通过在 /etc/hosts 中覆盖它,将 proxy.mydomain.com 指向本地主机。相关的鱿鱼配置行是:

https_port 3127 cert=/etc/squid/ssl_cert/myCA.pem

请注意,我现在没有使用任何明确指定的(复杂的?)模式 (sslbump/intercept/accel/tproxy)

我也已将证书添加到信任库:

sudo cp /etc/squid/ssl_cert/myCA.pem /etc/pki/ca-trust/source/anchors/mySquidCA.pem
sudo update-ca-trust

现在,进行真正的测试:

环境:http_proxy,outer_scheme:https,inner_scheme:http

/t/curl-curl-7_53_1 ❯❯❯ http_proxy=https://proxy.mydomain.com:3127 ./src/curl -s -vvv http://google.com -o /dev/null
* Rebuilt URL to: http://google.com/
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to proxy.mydomain.com (127.0.0.1) port 3127 (#0)
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
} [5 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [86 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [1027 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [262 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
{ [1 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / AES256-GCM-SHA384
* Proxy certificate:
*  subject: C=IN; ST=SomeState; L=SomeLocation; O=Default Company Ltd; CN=proxy.mydomain.com; emailAddress=no-reply@gmail.com
*  start date: Mar 16 06:43:35 2017 GMT
*  expire date: Mar 16 06:43:35 2018 GMT
*  common name: proxy.mydomain.com (matched)
*  issuer: C=IN; ST=SomeState; L=SomeLocation; O=Default Company Ltd; CN=proxy.mydomain.com; emailAddress=no-reply@gmail.com
*  SSL certificate verify ok.
} [5 bytes data]
> GET http://google.com/ HTTP/1.1
> Host: google.com
> User-Agent: curl/7.53.1-DEV
> Accept: */*
> Proxy-Connection: Keep-Alive
> 
{ [5 bytes data]
< HTTP/1.1 302 Found
< Cache-Control: private
< Content-Type: text/html; charset=UTF-8
< Location: http://www.google.co.in/?gfe_rd=cr&ei=ejTKWLGzM-Ts8AepwJyQCg
< Content-Length: 261
< Date: Thu, 16 Mar 2017 06:45:14 GMT
< X-Cache: MISS from lenovo
< X-Cache-Lookup: MISS from lenovo:3128
< Via: 1.1 lenovo (squid/4.0.17)
< Connection: keep-alive
< 
{ [5 bytes data]
* Connection #0 to host proxy.mydomain.com left intact

环境:https_proxy,outer_scheme:https,inner_scheme:https

/t/curl-curl-7_53_1 ❯❯❯ https_proxy=https://proxy.mydomain.com:3127 ./src/curl -s -vvv https://google.com -o /dev/null
* Rebuilt URL to: https://google.com/
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to proxy.mydomain.com (127.0.0.1) port 3127 (#0)
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
} [5 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [86 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [1027 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [262 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
{ [1 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / AES256-GCM-SHA384
* ALPN, server did not agree to a protocol
* Proxy certificate:
*  subject: C=IN; ST=SomeState; L=SomeLocation; O=Default Company Ltd; CN=proxy.mydomain.com; emailAddress=no-reply@gmail.com
*  start date: Mar 16 06:43:35 2017 GMT
*  expire date: Mar 16 06:43:35 2018 GMT
*  common name: proxy.mydomain.com (matched)
*  issuer: C=IN; ST=SomeState; L=SomeLocation; O=Default Company Ltd; CN=proxy.mydomain.com; emailAddress=no-reply@gmail.com
*  SSL certificate verify ok.
* Establish HTTP proxy tunnel to google.com:443
} [5 bytes data]
> CONNECT google.com:443 HTTP/1.1
> Host: google.com:443
> User-Agent: curl/7.53.1-DEV
> Proxy-Connection: Keep-Alive
> 
{ [5 bytes data]
< HTTP/1.1 200 Connection established
< 
* Proxy replied OK to CONNECT request
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
} [5 bytes data]
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
} [5 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [512 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [102 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [3757 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [148 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [70 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
{ [1 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-ECDSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: C=US; ST=California; L=Mountain View; O=Google Inc; CN=*.google.com
*  start date: Mar  9 02:43:31 2017 GMT
*  expire date: Jun  1 02:20:00 2017 GMT
*  subjectAltName: host "google.com" matched cert's "google.com"
*  issuer: C=US; O=Google Inc; CN=Google Internet Authority G2
*  SSL certificate verify ok.
} [5 bytes data]
> GET / HTTP/1.1
> Host: google.com
> User-Agent: curl/7.53.1-DEV
> Accept: */*
> 
{ [5 bytes data]
< HTTP/1.1 302 Found
< Cache-Control: private
< Content-Type: text/html; charset=UTF-8
< Location: https://www.google.co.in/?gfe_rd=cr&ei=hDTKWJXlMubs8Aek-6WQAg
< Content-Length: 262
< Date: Thu, 16 Mar 2017 06:45:24 GMT
< Alt-Svc: quic=":443"; ma=2592000; v="36,35,34"
< 
{ [262 bytes data]
* Connection #0 to host proxy.mydomain.com left intact

从输出中可以明显看出,在这两种情况下都首先与代理服务器进行了 SSL 握手。


下面我来吐槽一下

许多客户端(例如:curl = 7.51.0)不支持 SSL/TLS 与代理本身的连接并抛出此类错误:

$ https_proxy=https://proxy.mydomain.com:3128 curl -vvvv https://google.com
* Rebuilt URL to: https://google.com/
* Unsupported proxy scheme for 'https://proxy.mydomain.com:3128'
* Closing connection -1
curl: (7) Unsupported proxy scheme for 'https://proxy.mydomain.com:3128'

然后,有些客户端(例如 curl=7.47.0)会忽略代理中不受支持的方案 URL,这会误导人们相信他们所取得的成就。通常,他们永远不会通过 SSL/TLS 连接到代理服务器,即使变量明确指定方案为 'https' 并回退到使用与代理服务器的未加密连接。

然后还有其他客户端(例如 wget v1.18),这会让我们更加困惑:

  • 在下面的情况下,错误消息具有误导性,因为 即使是对 HTTP 请求,scheme 也可以保存值 https:// 外部世界(如上例所示,使用squid),因为我们希望与代理服务器的连接结束SSL/TLS。

    http_proxy=https://proxy.mydomain.com:3128 wget http://google.com
    Error in proxy URL https://proxy.mydomain.com:3128: Must be HTTP.
    
  • 不仅如此,混乱度增加,回落时,使 我们认为它可能正在连接到代理服务器 SSL/TLS,但实际上并非如此,而且还让我们认为 方案中的 https:// 应该仅在内部协议为 还有 https://

    https_proxy=https://proxy.mydomain-research.com:3128 wget https://google.com
    --2017-03-16 11:21:06--  https://google.com/
    Resolving proxy.mydomain-research.com (proxy.mydomain-research.com)... 10.1.1.7
    Connecting to proxy.mydomain-research.com (proxy.mydomain-research.com)|10.1.1.7|:3128... connected.
    Proxy request sent, awaiting response... 301 Moved Permanently
    Location: https://www.google.com/ [following]
    --2017-03-16 11:21:07--  https://www.google.com/
    Connecting to proxy.mydomain-research.com (proxy.mydomain-research.com)|10.1.1.7|:3128... connected.
    Proxy request sent, awaiting response... 200 OK
    Length: unspecified [text/html]
    Saving to: ‘index.html’
    

如需阅读有关通过 TLS/SSL 连接(和不连接)代理服务器的安全方面的更多信息,请访问:https://security.stackexchange.com/a/61336/114965