How to send a raw HTTP header

For certain reasons, I want to send a raw HTTP header to a server. Can Python requests do this? For example, an HTTP header like this:

GET http://baidu.com/ HTTP/1.1
Host: baidu.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Firefox/38.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive

I found that twisted can do this, but it is a bit complicated.

You can do it like this:

import requests

# Each key is a header name and each value is the header's content.
headers = {'Host': 'baidu.com',
           'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Firefox/38.0',
           'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
           'Accept-Language': 'en-US,en;q=0.5',
           'Accept-Encoding': 'gzip, deflate',
           'Connection': 'keep-alive'}

requests.get('http://baidu.com/', headers=headers)
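To check which headers requests has attached before anything goes over the network, the request can be built and prepared without being sent. This is a minimal sketch rather than a full dump of the wire format: it uses `requests.Request(...).prepare()`, which does not merge in a Session's default headers, and the lower-level HTTP library may still add or reorder headers when the request is actually sent. The shortened header set is only for illustration.

```python
import requests

headers = {
    'Host': 'baidu.com',
    'Accept-Language': 'en-US,en;q=0.5',
    'Accept-Encoding': 'gzip, deflate',
    'Connection': 'keep-alive',
}

# Build the request object without sending it.
prepared = requests.Request('GET', 'http://baidu.com/', headers=headers).prepare()

# Print an approximation of the request head; the bytes actually sent may differ.
print('%s %s HTTP/1.1' % (prepared.method, prepared.path_url))
for name, value in prepared.headers.items():
    print('%s: %s' % (name, value))
```

The prepared request can then be sent with `requests.Session().send(prepared)` if the headers look right.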

The requests.request method (and all of its derivatives, such as requests.get and requests.head) accepts a headers argument. See the documentation for request and for custom headers.

You can use it like this:

requests.get('http://baidu.com', headers={'Host':'baidu.com',
                                          'Accept-Encoding': 'gzip, deflate',
                                          ...})
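If the headers already exist as a block of raw text, like the one in the question, they can be converted into the dict that the `headers` argument expects. The sketch below is one possible way to do that using only the standard library; `parse_raw_headers` is a hypothetical helper name, and it assumes one header per line with no repeated names or folded continuation lines.

```python
def parse_raw_headers(raw):
    """Split a raw HTTP request head into (request_line, headers dict)."""
    lines = raw.strip().splitlines()
    request_line = lines[0].strip()
    headers = {}
    for line in lines[1:]:
        line = line.strip()
        if not line:
            break  # a blank line ends the header section
        # Split on the first colon only, so values containing ':' survive.
        name, _, value = line.partition(':')
        headers[name.strip()] = value.strip()
    return request_line, headers


RAW = """\
GET http://baidu.com/ HTTP/1.1
Host: baidu.com
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
"""

request_line, headers = parse_raw_headers(RAW)
method, url, _version = request_line.split()
print(method, url)
print(headers)
```

The resulting `headers` dict can be passed straight to `requests.get(url, headers=headers)`.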

Using twisted:

from twisted.internet import reactor
from twisted.web.client import Agent
from twisted.web.http_headers import Headers

agent = Agent(reactor)

# On Python 3 the method and URI must be bytes.
d = agent.request(
    b'GET',
    b'http://baidu.com/',
    Headers({
            'User-Agent': ['Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Firefox/38.0'],
            'Accept': ['text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'],
            'Accept-Language': ['en-US,en;q=0.5'],
            'Accept-Encoding': ['gzip, deflate'],
            'Connection': ['keep-alive']
        }),
    None)                       # no request body

def Response(response):
    print('Response received')

def Shutdown(result):
    print('Shutting down the reactor now')
    reactor.stop()

d.addCallback(Response)     # run Response() once the response has been received
d.addBoth(Shutdown)         # shut down whether the request succeeded or failed
reactor.run()

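It's more complicated (especially if you want to "do stuff" with the response), but twisted is something you should know if you plan to do network or concurrent programming in Python. I hope this helps you; if not, I hope it helps others who are struggling with HTTP headers and twisted.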

Edit - March 7, 2016

Using treq:

from __future__ import print_function
from treq import get
from twisted.internet.task import react


def handleResponse(response):
    """ Callback Function

    Once the response is received, display the information.
    This is the part where I suspect people will have the most
    trouble wrapping their heads around since it's heavily 
    dependent on deferreds (ie. futures or promises).
    """
    print('Code: %s\n' % response.code)

    print('Simple print:')
    response.content().addCallback(print)       # simple way to print on py2 & py3

    text = response.text()                      # returns a deferred
    text.addCallback(displayText)               # the way you should be handling responses, ie. via callbacks
    return text                                 # chain the deferred so react() waits for the body

def displayText(text):
    """ Callback Function

    Simply display the text. You would usually do more useful
    things in this callback, such as manipulating the response
    text or setting the text to some global or otherwise accessible
    variable(s).
    """
    print('Deferred print:')
    print(text)

def main(reactor):
    """
    This is the main function which will execute a request using the 
    GET method. After getting the response, the response code and content
    will be displayed. Finally, the twisted reactor will stop (since 
    the react function is being used).
    """
    url = 'http://baidu.com/'
    header={
        'User-Agent': ['Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Firefox/38.0'],
        'Accept': ['text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'],
        'Accept-Language': ['en-US,en;q=0.5'],
        'Accept-Encoding': ['gzip, deflate'],
        'Connection': ['keep-alive']}

    d = get(url, headers=header)
    d.addCallback(handleResponse)
    return d


react(main)         # run the main function and display results

The treq package is easier to use than twisted directly, and it shares many of the features and much of the syntax of requests.
