python urllib2: socket.error: [Errno 10054] An existing connection was forcibly closed by the remote host

The URL https://www.zacks.com/ works in my browser, and I can also access it from Go. Why does the server close the connection when the request comes from Python?

I am using Python 2.7.15.

import urllib2

page = urllib2.urlopen('https://www.zacks.com/')

This gives the following error:

Traceback (most recent call last):
  File "test3.py", line 3, in <module>
    page = urllib2.urlopen('https://www.zacks.com/')
  File "C:\ProgramData\Anaconda2\lib\urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "C:\ProgramData\Anaconda2\lib\urllib2.py", line 429, in open
    response = self._open(req, data)
  File "C:\ProgramData\Anaconda2\lib\urllib2.py", line 447, in _open
    '_open', req)
  File "C:\ProgramData\Anaconda2\lib\urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "C:\ProgramData\Anaconda2\lib\urllib2.py", line 1241, in https_open
    context=self._context)
  File "C:\ProgramData\Anaconda2\lib\urllib2.py", line 1201, in do_open
    r = h.getresponse(buffering=True)
  File "C:\ProgramData\Anaconda2\lib\httplib.py", line 1121, in getresponse
    response.begin()
  File "C:\ProgramData\Anaconda2\lib\httplib.py", line 438, in begin
    version, status, reason = self._read_status()
  File "C:\ProgramData\Anaconda2\lib\httplib.py", line 394, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "C:\ProgramData\Anaconda2\lib\socket.py", line 480, in readline
    data = self._sock.recv(self._rbufsize)
  File "C:\ProgramData\Anaconda2\lib\ssl.py", line 772, in recv
    return self.read(buflen)
  File "C:\ProgramData\Anaconda2\lib\ssl.py", line 659, in read
    v = self._sslobj.read(len)
socket.error: [Errno 10054] An existing connection was forcibly closed by the remote host

The Go program below works fine:

package main

import (
    "fmt"
    "net/http"
)

func main() {
    _, err := http.Get("https://www.zacks.com/")
    if err != nil {
        fmt.Printf("%s", err)
        return
    }
    fmt.Printf("success")
}

Output:

success

I think the server is checking the User-Agent header to validate the request. You can add that header to your request and try again:

import urllib2

# Send a browser-like User-Agent instead of urllib2's default "Python-urllib/2.7".
req = urllib2.Request('https://www.zacks.com/')
req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0')
resp = urllib2.urlopen(req)
content = resp.read()
print content
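
If you are on Python 3, where urllib2 was merged into urllib.request, the same idea might look like the sketch below. The helper names build_request, fetch and default_user_agent are made up for illustration, and whether the server really filters on User-Agent is still only my guess. The sketch also shows the User-Agent that Python sends when you do not set one, which is the value the server may be rejecting.

```python
import urllib.request

URL = "https://www.zacks.com/"
# Browser-like User-Agent string, copied from the snippet above.
USER_AGENT = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) "
              "Gecko/20100101 Firefox/61.0")


def default_user_agent():
    """Return the User-Agent that urllib sends when none is set explicitly."""
    opener = urllib.request.build_opener()
    # A fresh opener carries one default header: ('User-agent', 'Python-urllib/<version>').
    return dict(opener.addheaders)["User-agent"]


def build_request(url, user_agent=USER_AGENT):
    """Build a Request carrying a browser-like User-Agent (hypothetical helper)."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10):
    """Send the request and return the raw response body as bytes (hypothetical helper)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read()
```

Calling fetch(URL) performs the actual request. If the connection is still reset with the header set, then User-Agent is not what the server is checking and the cause lies elsewhere.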