Proper use of `nalu_process` callback in x264

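I would like to use libx264's low-latency encoding mechanism, in which a user-provided callback is called as soon as a single NAL unit is available, instead of having to wait for a whole frame to be encoded before starting to process it.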

The x264 documentation states the following about this facility:

/* Optional low-level callback for low-latency encoding.  Called for each output NAL unit
 * immediately after the NAL unit is finished encoding.  This allows the calling application
 * to begin processing video data (e.g. by sending packets over a network) before the frame
 * is done encoding.
 *
 * This callback MUST do the following in order to work correctly:
 * 1) Have available an output buffer of at least size nal->i_payload*3/2 + 5 + 64.
 * 2) Call x264_nal_encode( h, dst, nal ), where dst is the output buffer.
 * After these steps, the content of nal is valid and can be used in the same way as if
 * the NAL unit were output by x264_encoder_encode.
 *
 * This does not need to be synchronous with the encoding process: the data pointed to
 * by nal (both before and after x264_nal_encode) will remain valid until the next
 * x264_encoder_encode call.  The callback must be re-entrant.
 *
 * This callback does not work with frame-based threads; threads must be disabled
 * or sliced-threads enabled.  This callback also does not work as one would expect
 * with HRD -- since the buffering period SEI cannot be calculated until the frame
 * is finished encoding, it will not be sent via this callback.
 *
 * Note also that the NALs are not necessarily returned in order when sliced threads is
 * enabled.  Accordingly, the variable i_first_mb and i_last_mb are available in
 * x264_nal_t to help the calling application reorder the slices if necessary.
 *
 * When this callback is enabled, x264_encoder_encode does not return valid NALs;
 * the calling application is expected to acquire all output NALs through the callback.
 *
 * It is generally sensible to combine this callback with a use of slice-max-mbs or
 * slice-max-size.
 *
 * The opaque pointer is the opaque pointer from the input frame associated with this
 * NAL unit. This helps distinguish between nalu_process calls from different sources,
 * e.g. if doing multiple encodes in one process.
 */
void (*nalu_process)( x264_t *h, x264_nal_t *nal, void *opaque );
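
Requirement 1 in the comment above is just arithmetic, so it can be captured in a small helper. This is a minimal sketch: `nal_encode_buffer_size` is a hypothetical name and not part of the x264 API, and it assumes `i_payload` is non-negative.

```c
#include <stddef.h>

/* Hypothetical helper (not part of x264): minimum size of the dst buffer
 * passed to x264_nal_encode, per the header comment:
 *   nal->i_payload*3/2 + 5 + 64
 * Assumes i_payload >= 0. The division truncates, as it does in the
 * expression quoted from x264.h. */
static inline size_t nal_encode_buffer_size(int i_payload)
{
    return (size_t)i_payload * 3 / 2 + 5 + 64;
}
```

Inside a callback this could be checked as `nal_encode_buffer_size(nal->i_payload) <= buffer_capacity` before calling `x264_nal_encode`, where `buffer_capacity` is whatever size the application allocated.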

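This seems simple enough. However, when I run the pseudocode below, I get a segfault on the marked line. I tried adding some debugging to `x264_nal_encode` itself to understand where it goes wrong, but it appears to be the function call itself that causes the segfault. Am I missing something here? (Let's ignore the fact that the use of `assert` probably makes `cb` non-reentrant; it is only there to show the reader that my workspace buffer is more than large enough.)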

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <x264.h>

#define WS_SIZE 10000000
uint8_t * workspace;

void cb(x264_t * h, x264_nal_t * nal, void * opaque)
{
  assert((nal->i_payload*3)/2 + 5 + 64 < WS_SIZE);
  x264_nal_encode(h, workspace, nal); // Segfault here.
  // Removed: Process nal.
}

int main(int argc, char ** argv)
{
  uint8_t * fake_frame = malloc(1280*720*3);
  memset(fake_frame, 0, 1280*720*3);

  workspace = malloc(WS_SIZE);

  x264_param_t param;
  int status = x264_param_default_preset(&param, "ultrafast", "zerolatency");
  assert(status == 0);

  param.i_csp = X264_CSP_RGB;
  param.i_width = 1280;
  param.i_height = 720;
  param.i_threads = 1;
  param.i_lookahead_threads = 1;
  param.i_frame_total = 0;
  param.i_fps_num = 30;
  param.i_fps_den = 1;
  param.i_slice_max_size = 1024;
  param.b_annexb = 1;
  param.nalu_process = cb;

  status = x264_param_apply_profile(&param, "high444");
  assert(status == 0);

  x264_t * h = x264_encoder_open(&param);
  assert(h);

  x264_picture_t pic;
  status = x264_picture_alloc(&pic, param.i_csp, param.i_width, param.i_height);
  assert(status == 0);
  assert(pic.img.i_plane == 1);

  x264_picture_t pic_out;
  x264_nal_t * nal; // Not used. We process NALs in cb.
  int i_nal;

  for (int i = 0; i < 100; ++i)
  {
    pic.i_pts = i;
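    // Copy into the buffer from x264_picture_alloc instead of replacing the
    // pointer, so that x264_picture_clean does not end up freeing fake_frame.
    memcpy(pic.img.plane[0], fake_frame, 1280*720*3);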
    status = x264_encoder_encode(h, &nal, &i_nal, &pic, &pic_out);
  }

  x264_encoder_close(h);
  x264_picture_clean(&pic);
  free(workspace);
  free(fake_frame);
  return 0;
}

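Edit: The segfault happens on the first call to `x264_nal_encode` from `cb`. If I switch to a different preset, where more frames are encoded before the first callback happens, then several calls to `x264_encoder_encode` succeed before the first callback, and hence the segfault, occurs.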

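After discussing this with the x264 developers on IRC, it appears that the behavior I was seeing is in fact a bug in x264. The `x264_t * h` passed to the callback is incorrect. If one overrides that handle with the correct one (the one obtained from `x264_encoder_open`), things work fine.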

I identified x264 git commit 71ed44c7312438fac7c5c5301e45522e57127db4 as the first bad one. The bug is documented as this x264 issue.

Update for future readers: I believe this issue has been resolved in commit 544c61f082194728d0391fb280a6e138ba320a96.