Python urllib2 returning an empty string
I am trying to retrieve the following URL: http://www.winkworth.co.uk/sale/property/flat-for-sale-in-masefield-court-london-n5/HIH140004
import urllib2
response = urllib2.urlopen('http://www.winkworth.co.uk/rent/property/terraced-house-to-rent-in-mill-road--/WOT140129')
response.read()
But all I get back is an empty string. When I try it in a browser or with cURL, it works fine. Any ideas?
I got a response when using the requests library but not when using urllib2, so I experimented with the HTTP request headers.
It turns out the server expects an Accept header; urllib2 does not send one, while requests and cURL send */*.
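One way to check this without any network traffic is to look at the headers the default opener attaches to every request. This is only a rough sketch: `default_header_names` is a hypothetical helper name, the fallback import assumes Python 3 (where urllib2 became urllib.request), and the lower-level HTTP layer adds a few more headers of its own (such as Host and Connection) that do not show up in this list.

```python
# Sketch: list the headers the default urllib2 opener adds to every request.
try:
    import urllib2 as urlrequest          # Python 2
except ImportError:
    import urllib.request as urlrequest   # Python 3 equivalent of urllib2


def default_header_names():
    """Return the lower-cased names of the opener-level default headers."""
    opener = urlrequest.build_opener()
    return [name.lower() for name, _value in opener.addheaders]


names = default_header_names()
print(names)               # header names set by the opener itself
print('accept' in names)   # whether an Accept header is among them
```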
Send an Accept header with urllib2 as well:
url = 'http://www.winkworth.co.uk/sale/property/flat-for-sale-in-masefield-court-london-n5/HIH140004'
req = urllib2.Request(url, headers={'accept': '*/*'})
response = urllib2.urlopen(req)
Demo:
>>> import urllib2
>>> url = 'http://www.winkworth.co.uk/sale/property/flat-for-sale-in-masefield-court-london-n5/HIH140004'
>>> len(urllib2.urlopen(url).read())
0
>>> request = urllib2.Request(url, headers={'accept': '*/*'})
>>> len(urllib2.urlopen(request).read())
37197
The server is at fault here; RFC 2616 states:
If no Accept header field is present, then it is assumed that the
client accepts all media types.
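Until the server is fixed, the workaround can be wrapped in a small helper. The following is a minimal sketch, not a definitive implementation: `build_request` and `fetch` are hypothetical names, the fallback import assumes Python 3 (where urllib2 became urllib.request), and http://example.com/ is only a placeholder URL.

```python
# Sketch: always send an explicit Accept header with urllib2.
try:
    import urllib2 as urlrequest          # Python 2
except ImportError:
    import urllib.request as urlrequest   # Python 3 equivalent of urllib2


def build_request(url, accept='*/*'):
    """Build a Request carrying an explicit Accept header."""
    return urlrequest.Request(url, headers={'Accept': accept})


def fetch(url, accept='*/*'):
    """Fetch url and return the raw response body (performs network I/O)."""
    response = urlrequest.urlopen(build_request(url, accept))
    try:
        return response.read()
    finally:
        response.close()


# Building the request needs no network access, so the header can be inspected.
req = build_request('http://example.com/')  # placeholder URL
print(req.get_header('Accept'))
```

Calling `fetch(url)` with the property URL from the question should then behave like the demo above, as long as the server still responds the same way.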