Requests encoding settings fail
I have this Python code:
from requests import get

for x in xrange(0, 200, 50):
    url = "http://pornolab.net/forum/viewforum.php?f=1765&start={}".format(x)
    print "Get '{}'".format(url)
    r = get(url)
    print "Encoding: {}".format(r.encoding)
The problem is that for the same URL I sometimes get encoding windows-1251 and sometimes ISO-8859-1. The results:
First run:
Get http://pornolab.net/forum/viewforum.php?f=1765&start=50
Encoding: windows-1251
Second run:
Get http://pornolab.net/forum/viewforum.php?f=1765&start=50
Encoding: ISO-8859-1
Why does this happen? How can I set the encoding in Requests?
When you make a request, Requests makes educated guesses about the encoding of the response based on the HTTP headers. The text encoding guessed by Requests is used when you access r.text. You can find out what encoding Requests is using, and change it, using the r.encoding property:
>>> r.encoding
'utf-8'
>>> r.encoding = 'ISO-8859-1'
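As for why the guess flips between runs: as far as I know, Requests takes the encoding from the charset parameter of the Content-Type header and falls back to ISO-8859-1 for text responses that do not declare one. A plausible explanation (an assumption, I have not inspected this server's headers) is that the server sometimes sends the charset and sometimes omits it. The sketch below approximates that rule with the standard library only, in Python 3 syntax; guess_encoding_from_content_type is a hypothetical helper, not part of Requests.

```python
from email.message import Message


def guess_encoding_from_content_type(content_type):
    """Approximate a header-based encoding guess (hypothetical helper).

    Returns the declared charset if there is one, 'ISO-8859-1' for text/*
    content without a charset, and None otherwise.
    """
    if not content_type:
        return None

    # Let the stdlib parse the header value and its parameters.
    msg = Message()
    msg["Content-Type"] = content_type

    charset = msg.get_param("charset")
    if isinstance(charset, str) and charset:
        return charset.strip("'\"")

    # No charset declared: text content falls back to ISO-8859-1.
    if msg.get_content_maintype() == "text":
        return "ISO-8859-1"
    return None


# The two header variants that would produce the two results in the question.
with_charset = guess_encoding_from_content_type("text/html; charset=windows-1251")
without_charset = guess_encoding_from_content_type("text/html")
print(with_charset, without_charset)
```

Printing r.headers.get('content-type') next to r.encoding in the loop from the question should show whether this is what happens on that site.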
If you change the encoding, Requests will use the new value of r.encoding whenever you call r.text. You might want to do this in any situation where you can apply special logic to work out what the encoding of the content will be. For example, HTML and XML have the ability to specify their encoding in their body. In situations like this, you should use r.content to find the encoding, and then set r.encoding. This will let you use r.text with the correct encoding.
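A minimal sketch of that last step, assuming the page declares its charset in a meta tag near the top of the HTML. find_declared_charset is a hypothetical helper (not a Requests function), the 4096-byte window is an arbitrary choice, and the windows-1251 default in the usage comment is only a guess based on the output in the question. Python 3 syntax.

```python
import re

# Matches declarations such as <meta charset="utf-8"> or
# <meta http-equiv="Content-Type" content="text/html; charset=windows-1251">.
_META_CHARSET = re.compile(
    rb"<meta[^>]+charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE
)


def find_declared_charset(body, default=None):
    """Return the charset declared in the first 4096 bytes of body, else default."""
    match = _META_CHARSET.search(body[:4096])
    if match:
        return match.group(1).decode("ascii")
    return default


# Usage with Requests (needs network access, so it is left as a comment):
#   r = get(url)
#   r.encoding = find_declared_charset(r.content, default="windows-1251")
#   text = r.text  # decoded with the encoding set above

sample = b'<head><meta http-equiv="Content-Type" content="text/html; charset=windows-1251"></head>'
print(find_declared_charset(sample))
```

If the encoding is known in advance, simply setting r.encoding = 'windows-1251' right after the request is enough. Requests also exposes r.apparent_encoding, which guesses the encoding from the bytes of the body instead of the headers.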