Why is Python's requests 10x faster than C's libcurl?

Python's requests library seems to be about 10x faster than C's libcurl (C API, CLI application, and Python API) for a 1.6 MB request: requests takes 800 ms, while curl/libcurl sometimes takes 7 seconds.

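libcurl seems to receive the reply in 16 KB chunks, while requests seems to receive the whole thing at once, but I'm not sure that's the cause... I tried curl_easy_setopt(curl_get, CURLOPT_BUFFERSIZE, 1<<19), but that option only seems to be good for making the buffer size smaller.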

I've tried looking at the source code of requests, and I think it uses urllib3 as its HTTP "backend"... but using urllib3 directly gives the same (disappointing) results as using curl.

Here are some examples.

/*
gcc-8 test.c -o test -lcurl  &&  t ./test
*/
#include <stdio.h>  /* printf */
#include <curl/curl.h>

int main(){
  CURLcode curl_st;
  curl_global_init(CURL_GLOBAL_ALL);

  CURL* curl_get = curl_easy_init();
  curl_easy_setopt(curl_get, CURLOPT_URL,        "https://api.binance.com/api/v3/exchangeInfo");
  curl_easy_setopt(curl_get, CURLOPT_BUFFERSIZE, 1<<19);
  curl_st = curl_easy_perform(curl_get);  /* the body goes to stdout by default */
  if(curl_st != CURLE_OK)
    printf("\x1b[91mFAIL  \x1b[37m%s\x1b[0m\n", curl_easy_strerror(curl_st));

  curl_easy_cleanup(curl_get);
  curl_global_cleanup();
  return 0;
}
'''FAST'''
import requests
reply = requests.get('https://api.binance.com/api/v3/exchangeInfo')
print(reply.text)
'''SLOW'''
import urllib3
pool = urllib3.PoolManager()  # conn = pool.connection_from_url('https://api.binance.com/api/v3/exchangeInfo')
reply = pool.request('GET', 'https://api.binance.com/api/v3/exchangeInfo')
print(reply.data)
print(len(reply.data))
'''SLOW!'''
import urllib.request
with urllib.request.urlopen('https://api.binance.com/api/v3/exchangeInfo') as response:
  html = response.read()
'''SLOW!'''
import pycurl
from io import BytesIO
buf  = BytesIO()
curl = pycurl.Curl()
curl.setopt(curl.URL, 'https://api.binance.com/api/v3/exchangeInfo')
curl.setopt(curl.WRITEDATA, buf)
curl.perform()
curl.close()
body = buf.getvalue()  # Body is a byte string. We have to know the encoding in order to print it to a text file such as standard output.
print(body.decode('iso-8859-1'))
curl https://api.binance.com/api/v3/exchangeInfo

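One way to speed up the transfer of web content is to use HTTP compression. The data is compressed on the fly before it is sent from the server to the client, so there are fewer bytes on the wire and the transfer takes less time.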
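To get a feel for how much compression can save on a reply like this, here is a minimal sketch using only the Python standard library. The payload is invented for illustration (the field names and values are assumptions, not the real exchangeInfo schema), so the exact ratio will differ from what the real server achieves.

```python
import gzip
import json

# Hypothetical payload: a repetitive JSON document, loosely shaped like an
# exchange-info reply. The field names and values are invented.
symbols = [
    {"symbol": f"SYM{i}USDT", "status": "TRADING", "baseAssetPrecision": 8,
     "quoteAssetPrecision": 8, "orderTypes": ["LIMIT", "MARKET", "STOP_LOSS"]}
    for i in range(2000)
]
raw = json.dumps({"symbols": symbols}).encode("utf-8")

# gzip is one of the content codings a server may apply to a response body.
compressed = gzip.compress(raw)

print(f"uncompressed: {len(raw)} bytes")
print(f"gzip:         {len(compressed)} bytes")
print(f"ratio:        {len(raw) / len(compressed):.1f}x")
```

Repetitive JSON tends to compress very well, which would be consistent with a large gap between a client that asks for compression and one that does not.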

Although HTTP compression is supported by libcurl, it is disabled by default. From the CURLOPT_ACCEPT_ENCODING documentation:

Set CURLOPT_ACCEPT_ENCODING to NULL to explicitly disable it, which makes libcurl not send an Accept-Encoding: header and not decompress received contents automatically.

The default value of this option is NULL, so unless you specifically enable HTTP compression, you won't get it.
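With libcurl itself, the documented way to opt in is to set CURLOPT_ACCEPT_ENCODING to an empty string, which asks for every encoding the library was built with. The same idea applies to the urllib.request example from the question, which has to opt in by hand. The following is a rough sketch under that assumption: the helper names build_request, decode_body and fetch are made up for illustration, and only gzip is handled.

```python
import gzip
import urllib.request

URL = "https://api.binance.com/api/v3/exchangeInfo"  # URL from the question


def build_request(url):
    """Return a Request that advertises gzip support to the server."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})


def decode_body(body, content_encoding):
    """Undo gzip if the server applied it; otherwise return the body as-is."""
    if (content_encoding or "").strip().lower() == "gzip":
        return gzip.decompress(body)
    return body


def fetch(url):
    """Fetch url, asking for gzip and decompressing the reply by hand."""
    with urllib.request.urlopen(build_request(url)) as response:
        return decode_body(response.read(),
                           response.headers.get("Content-Encoding"))


# Usage (needs network access):
#   body = fetch(URL)
#   print(len(body))
```

requests sends an Accept-Encoding header and decompresses the reply for you by default, which would explain why it was the only fast variant in the question.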