libcurl C++: retrieving HTTP response code when multiplexing

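Goal:

Print the contents of a multiplexed response only if a particular HTTP status code, for example 200, is detected. This requires reading the headers and extracting the HTTP code as the callback function receives the response.

Because multiplexed responses can be delivered to the program through the callback in any order, the status code has to be looked up asynchronously, in a non-blocking way.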
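To make the goal concrete, here is a minimal sketch of pulling the status code out of a single header line, such as the lines libcurl hands to a header or debug callback. The function name `parse_status_line` is hypothetical and not part of libcurl. The sketch assumes the status line looks like `HTTP/1.1 200 OK` or `HTTP/2 200`, and it is only an illustration of the parsing step, not a replacement for CURLINFO_RESPONSE_CODE.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not a libcurl function).
// Returns the three-digit status code from a line such as
// "HTTP/1.1 200 OK\r\n" or "HTTP/2 200 \r\n".
// Returns 0 when the line is not a status line.
long parse_status_line(const std::string &line)
{
    // A status line always starts with the protocol name.
    if (line.compare(0, 5, "HTTP/") != 0)
        return 0;

    // The code is the three characters after the first space.
    std::string::size_type sp = line.find(' ');
    if (sp == std::string::npos || sp + 3 >= line.size())
        return 0;

    long code = 0;
    for (std::string::size_type i = sp + 1; i <= sp + 3; ++i) {
        if (!std::isdigit(static_cast<unsigned char>(line[i])))
            return 0;
        code = code * 10 + (line[i] - '0');
    }
    return code;
}
```

A transfer that follows redirects produces several status lines, so a caller would normally keep the last non-zero value seen for each transfer.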

The program shown further below is based on a libcurl tutorial and on a question containing information about HTTP multiplexing.

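Problem:

The program successfully sends requests, receives the responses, and prints them asynchronously, but I am not sure where in the program to call curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &response_code), or whether this is even the correct way for the program to look for the HTTP status code.

Current output:

The program prints the full headers and payload to stdout. However, when the CURLINFO_RESPONSE_CODE call above is used, the program behaves synchronously: it prints the response code every time any header or payload data arrives, whereas it should print the code only when the callback function dump(...) detects it.

Code: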

#include <stdlib.h>
#include <errno.h>  
#include <iostream>
#include <string>

/* somewhat unix-specific */
#include <sys/time.h>
#include <unistd.h>
 
/* curl stuff */
#include <curl/curl.h>
#include <curl/mprintf.h>
 
#ifndef CURLPIPE_MULTIPLEX
#define CURLPIPE_MULTIPLEX 0
#endif

struct transfer {
  CURL *easy;
  unsigned int num;
  FILE *out;
};

#define NUM_HANDLES 1000

static void dump(const char *text, int num, unsigned char *ptr, size_t size,
          char nohex)
{
    // Print the response byte by byte (size_t index avoids a signed/unsigned comparison)
    for (size_t i = 0; i < size; i++) {
        std::cout << ptr[i];
    }
}
 
static int my_trace(CURL *handle, curl_infotype type,
             char *data, size_t size,
             void *userp)
{
  const char *text;
  struct transfer *t = (struct transfer *)userp;
  unsigned int num = t->num;
  (void)handle; /* prevent compiler warning */
 
  switch(type) {
  default: /* in case a new one is introduced to shock us */
    return 0;
 
  case CURLINFO_SSL_DATA_OUT:
    text = "=> Send SSL data";
    break;
  case CURLINFO_HEADER_IN:
    text = "<= Recv header";
    break;
  case CURLINFO_DATA_IN:
    text = "<= Recv data";
    break;
  case CURLINFO_SSL_DATA_IN:
    text = "<= Recv SSL data";
    break;
  }

  dump(text, num, (unsigned char *)data, size, 1);
  return 0;
}
 
static void setup(struct transfer *t, int num)
{
  char filename[128];
  CURL *hnd;
 
  hnd = t->easy = curl_easy_init();
 
  curl_msnprintf(filename, 128, "dl-%d", num);
 
  t->out = fopen(filename, "wb");
  if(!t->out) {
    std::cout << "ERROR: could not open file for writing" << std::endl;
    exit(1);
  }
 
  /* write to this file */
  curl_easy_setopt(hnd, CURLOPT_WRITEDATA, t->out);
 
  /* set the same URL */
  curl_easy_setopt(hnd, CURLOPT_URL, "https://sometesturl.xyz");
 
  /* please be verbose */
  curl_easy_setopt(hnd, CURLOPT_VERBOSE, 1L);
  curl_easy_setopt(hnd, CURLOPT_DEBUGFUNCTION, my_trace);
  curl_easy_setopt(hnd, CURLOPT_DEBUGDATA, t);
 
  /* HTTP/2 please */
  curl_easy_setopt(hnd, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_2_0);

  /* we use a self-signed test server, skip verification during debugging */
  curl_easy_setopt(hnd, CURLOPT_SSL_VERIFYPEER, 0L);
  curl_easy_setopt(hnd, CURLOPT_SSL_VERIFYHOST, 0L);
 
#if (CURLPIPE_MULTIPLEX > 0)
  /* wait for pipe connection to confirm */
  curl_easy_setopt(hnd, CURLOPT_PIPEWAIT, 1L);
#endif

}
 
/*
 * Download many transfers over HTTP/2, using the same connection!
 */
int main(int argc, char **argv)
{

  struct transfer trans[NUM_HANDLES];
  CURLM *multi_handle;
  int i;
  int still_running = 0; /* keep number of running handles */
  int num_transfers;
  if(argc > 1) {
    /* if given a number, do that many transfers */
    num_transfers = atoi(argv[1]);
    if((num_transfers < 1) || (num_transfers > NUM_HANDLES))
      num_transfers = 3; /* a suitable low default */
  }
  else
    num_transfers = 3; /* suitable default */
 
  /* init a multi stack */
  multi_handle = curl_multi_init();
 
  for(i = 0; i < num_transfers; i++) {
    setup(&trans[i], i);
 
    /* add the individual transfer */
    curl_multi_add_handle(multi_handle, trans[i].easy);
  }
 
  curl_multi_setopt(multi_handle, CURLMOPT_PIPELINING, CURLPIPE_MULTIPLEX);
 
  do {
    CURLMcode mc = curl_multi_perform(multi_handle, &still_running);
 
    if(still_running)
      /* wait for activity, timeout or "nothing" */
      mc = curl_multi_poll(multi_handle, NULL, 0, 1000, NULL);
 
    if(mc)
      break;

  } while(still_running);

  // This behaves in a blocking / synchronous manner...
  // Not sure if this is the correct place to extract the status code?
  for (int i = 0; i < num_transfers; i++) {
      long response_code; // test variable
      curl_easy_getinfo(trans[i].easy, CURLINFO_RESPONSE_CODE, &response_code);
  }

  for(i = 0; i < num_transfers; i++) {
    curl_multi_remove_handle(multi_handle, trans[i].easy);
    curl_easy_cleanup(trans[i].easy);
  }
 
  curl_multi_cleanup(multi_handle);  
  return 0;
}

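Summary question:

1. With multiplexing, responses can be received in any order. How can libcurl be used to extract the HTTP status code asynchronously, as each response is received?

Move the call to curl_easy_getinfo(CURLINFO_RESPONSE_CODE) inside the main loop. Use curl_multi_info_read() inside the loop to detect when each request has finished, and only then retrieve its response code. For example: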

do {
    CURLMcode mc = curl_multi_perform(multi_handle, &still_running);
 
    if (still_running) {
        /* wait for activity, timeout or "nothing" */
        mc = curl_multi_poll(multi_handle, NULL, 0, 1000, NULL);
    }
 
    if (mc)
        break;

    /* drain the queue of messages about finished transfers */
    CURLMsg *msg;
    int queued;
    while ((msg = curl_multi_info_read(multi_handle, &queued)) != NULL) {
        if ((msg->msg == CURLMSG_DONE) && (msg->data.result == CURLE_OK)) {
            long response_code = 0;
            curl_easy_getinfo(msg->easy_handle, CURLINFO_RESPONSE_CODE, &response_code);
            ...
        }
    }
}
while (still_running);

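Use the CURLOPT_HEADERFUNCTION/CURLOPT_WRITEFUNCTION callbacks to save the response headers/payload into your transfer struct as needed, rather than just dumping them to stdout from the CURLOPT_DEBUGFUNCTION callback. That way you can print/save the data only when the final response code is the one you are looking for.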