Increasing lag when transferring audio between streams with PulseAudio

I've been working on a personal project in C++ using the PulseAudio library, and I've noticed some odd behavior whose cause I can't pin down.

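My setup so far is fairly simple: a record stream whose read callback appends incoming audio to a buffer, and a playback stream whose write callback drains that buffer.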

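This setup does work (I can hear the audio just fine), but over time the buffer seems to grow slightly, which adds latency and eventually produces a clearly audible audio "lag".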

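The problem can be reproduced with some fairly simple code (ignore the fact that the memory allocated for the buffer keeps growing; I'm only concerned about buffer_length increasing):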

#include <iostream>
#include <pulse/pulseaudio.h>
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <unistd.h>
#include <cstring>
#include <cassert>

void contextStateChanged(pa_context *ctx, void *userdata);
void sinkCreated(pa_context *context, uint32_t idx, void *userdata);
void writeToStream(pa_stream *stream, size_t nbytes, void *userdata);
void readFromStream(pa_stream *stream, size_t nbytes, void *userdata);
void streamStateChanged(pa_stream *p, void *userdata);

pa_context *context;

void *buffer;
size_t buffer_index, buffer_length;

int32_t bytesRead = 0;
int32_t bytesWritten = 0;

pa_mainloop *mainloop;

pa_sample_spec spec = {
    .format = PA_SAMPLE_S16BE,
    .rate = 48000,
    .channels = 2
};

int main(int argc, char **argv) {
    mainloop = pa_mainloop_new();
    assert(mainloop);

    pa_mainloop_api *mainloopAPI = pa_mainloop_get_api(mainloop);
    assert(mainloopAPI);

    pa_proplist *props = pa_proplist_new();
    pa_proplist_sets(props, PA_PROP_APPLICATION_NAME, "PulseTest");
    pa_proplist_sets(props, PA_PROP_APPLICATION_ID, "me.mrletsplay.pulsetest");
    pa_proplist_sets(props, PA_PROP_APPLICATION_VERSION, "1.0");
    pa_proplist_sets(props, PA_PROP_APPLICATION_ICON_NAME, "audio-card");

    context = pa_context_new_with_proplist(mainloopAPI, "PulseTest", props);
    assert(context);

    pa_context_set_state_callback(context, contextStateChanged, NULL);

    pa_context_connect(context, NULL, (pa_context_flags_t) 0, NULL);

    std::cout << "Waiting for Pulseaudio" << std::endl;

    return pa_mainloop_run(mainloop, 0);
}

void initStreams() {
    pa_stream *stream = pa_stream_new(context, "playback", &spec, NULL);

    pa_buffer_attr bufferAttr;
    bufferAttr.maxlength = (uint32_t) 4096;
    bufferAttr.tlength = (uint32_t) 256;
    bufferAttr.prebuf = (uint32_t) -1;
    bufferAttr.minreq = (uint32_t) 64;
    assert(pa_stream_connect_playback(stream, NULL, &bufferAttr, PA_STREAM_ADJUST_LATENCY, NULL, NULL) == 0);
    pa_stream_set_state_callback(stream, streamStateChanged, NULL);
    pa_stream_set_write_callback(stream, writeToStream, NULL);

    pa_stream *in = pa_stream_new(context, "record", &spec, NULL);

    pa_buffer_attr inBuffer;
    inBuffer.maxlength = (uint32_t) 1024;
    inBuffer.fragsize = (uint32_t) 512;
    assert(pa_stream_connect_record(in, NULL, &inBuffer, PA_STREAM_ADJUST_LATENCY) == 0);
    pa_stream_set_state_callback(in, streamStateChanged, NULL);
    pa_stream_set_read_callback(in, readFromStream, NULL);
}

void contextStateChanged(pa_context *ctx, void *userdata) {
    if(pa_context_get_state(ctx) == PA_CONTEXT_READY) {
        std::cout << "Connected to Pulseaudio" << std::endl;
        initStreams();
    }
}

void writeToStream(pa_stream *stream, size_t nbytes, void *userdata) {
    bytesWritten += nbytes;

    // Output the difference between how many bytes we've read and how many bytes we've written
    std::cout << (bytesRead - bytesWritten) << std::endl;

    size_t write = nbytes;
    if(write > buffer_length) {
        write = buffer_length;
    }

    void *data;
    if(pa_stream_begin_write(stream, &data, &nbytes) < 0) {
        std::cout << "ERROR writing data: " << pa_strerror(pa_context_errno(context)) << std::endl;
        exit(1);
        return;
    }

    memcpy(data, (uint8_t *) buffer + buffer_index, write);
    buffer_length -= write;
    buffer_index += write;

    if(pa_stream_write(stream, data, nbytes, NULL, 0, PA_SEEK_RELATIVE) < 0) {
        std::cout << "ERROR writing data: " << pa_strerror(pa_context_errno(context)) << std::endl;
        exit(1);
        return;
    }
}

void readFromStream(pa_stream *stream, size_t nbytes, void *userdata) {
    bytesRead += nbytes;

    const void *data;
    if(pa_stream_peek(stream, &data, &nbytes) < 0) {
        std::cout << "ERROR reading data: " << pa_strerror(pa_context_errno(context)) << std::endl;
        exit(1);
        return;
    }

    if(buffer) {
        buffer = pa_xrealloc(buffer, buffer_index + buffer_length + nbytes);
        memcpy((uint8_t *) buffer + buffer_index + buffer_length, data, nbytes);
        buffer_length += nbytes;
    }else {
        buffer = pa_xmalloc(nbytes);
        memcpy(buffer, data, nbytes);
        buffer_length = nbytes;
        buffer_index = 0;
    }

    pa_stream_drop(stream);
}

void streamStateChanged(pa_stream *p, void *userdata) {
    std::cout << "State changed for stream: " << pa_stream_get_state(p) << std::endl;

    if(pa_stream_get_state(p) == PA_STREAM_READY) {
        std::cout << "Stream is ready" << std::endl;
    }
}

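In the code, I keep track of how many bytes have been read and how many have been written. bytesRead seems to grow faster than bytesWritten, so the buffer grows over time.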
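To judge how much audible delay a given backlog represents, the byte difference can be converted to time. This is a minimal sketch, assuming the sample spec used above (16-bit samples, 48 kHz, stereo); `SampleFormat`, `bytesPerSecond`, and `backlogMillis` are hypothetical helper names for illustration, not part of the PulseAudio API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the sample spec above (not a PulseAudio type).
struct SampleFormat {
    uint32_t rate;            // frames per second
    uint32_t channels;        // samples per frame
    uint32_t bytesPerSample;  // 2 for 16-bit samples
};

// Bytes of audio consumed per second of playback.
constexpr uint64_t bytesPerSecond(const SampleFormat &f) {
    return static_cast<uint64_t>(f.rate) * f.channels * f.bytesPerSample;
}

// Convert a backlog in bytes to whole milliseconds of audio (rounded down).
inline uint64_t backlogMillis(uint64_t backlogBytes, const SampleFormat &f) {
    const uint64_t bps = bytesPerSecond(f);
    assert(bps > 0);  // a zero rate or channel count would divide by zero
    return backlogBytes * 1000 / bps;
}
```

With this format one second of audio is 192000 bytes, so a backlog of 19200 bytes corresponds to 100 ms of delay.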

I tried writing more bytes than PulseAudio requested, but that only seems to make PulseAudio hang and play no audio at all.

The problem is easy to see in this chart, generated from about 10 minutes of program output: [chart of program output]

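Passing input data straight through to the output is a common problem in sound applications (PulseAudio, DirectSound, etc.). The lag can come from any of the following: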

  • different input and output devices;
  • USB sound devices;
  • occasional stalls in your processing thread; ...and so on.
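As a rough illustration of the first cause, assume the two devices run on clocks that differ by a small fixed fraction; that mismatch is an assumption for the sake of the arithmetic, not something measured here. `driftBytes` is a hypothetical helper that estimates how many bytes accumulate (or go missing) over a given time.

```cpp
#include <cstdint>

// Bytes of surplus (or shortfall) that build up when one side runs
// slightly faster than the other.
//   nominalBytesPerSecond: data rate of the stream (e.g. 192000 for S16, 48 kHz, stereo)
//   ppm:                   assumed rate mismatch in parts per million
//   seconds:               elapsed time
inline uint64_t driftBytes(uint64_t nominalBytesPerSecond, uint64_t ppm, uint64_t seconds) {
    return nominalBytesPerSecond * seconds * ppm / 1000000;
}
```

For example, a hypothetical 100 ppm mismatch at 192000 bytes per second adds up to 11520 bytes after 10 minutes, which is 60 ms of audio. The same arithmetic applies whether the input or the output side is the faster one.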

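The underlying problem is that output data occasionally runs short, and the sound device then has to fill the gap with a crackle, silence, or some other sound.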

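The common workaround is:

  1. Read how much data has already been played (available from the output device's play position).
  2. Calculate how much data you have (what you have sent plus what is still available in your buffer).
  3. Track the "tail": how much sound data you have sent that has not been played yet.
  4. Feed input data to the output according to the tail: if the tail is short, add more data; if the tail is long, remove some data.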

Of course, this means you will sometimes have to generate fake sound data and sometimes drop extra data.

A reasonable tail length is about 100 - 300 ms.
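The steps above can be sketched as a small decision function. This is only an illustration under stated assumptions: `TailPolicy`, `tailBytes`, and `correctionBytes` are hypothetical names, and how the play position is obtained is left to the caller (with PulseAudio, presumably from the stream's timing information, e.g. `pa_stream_get_latency()`). The thresholds assume 16-bit stereo audio at 48 kHz, where 100 ms is 19200 bytes and 300 ms is 57600 bytes.

```cpp
#include <cstdint>

// Acceptable range for the tail, in bytes (hypothetical policy type).
struct TailPolicy {
    uint64_t minBytes;  // below this, pad with fake (e.g. silent) data
    uint64_t maxBytes;  // above this, drop some data
};

// Tail = data handed to the output that the device has not played yet.
inline uint64_t tailBytes(uint64_t bytesSent, uint64_t bytesPlayed) {
    return bytesSent > bytesPlayed ? bytesSent - bytesPlayed : 0;
}

// Returns how many bytes to add (positive) or drop (negative) so that the
// tail moves back to the middle of the allowed range; 0 means leave it alone.
// The result is truncated to whole frames so channels stay aligned.
inline int64_t correctionBytes(uint64_t tail, const TailPolicy &p, uint64_t frameBytes) {
    const uint64_t target = (p.minBytes + p.maxBytes) / 2;
    int64_t delta = 0;
    if (tail < p.minBytes) {
        delta = static_cast<int64_t>(target - tail);   // tail too short: pad
    } else if (tail > p.maxBytes) {
        delta = -static_cast<int64_t>(tail - target);  // tail too long: drop
    }
    const int64_t frame = static_cast<int64_t>(frameBytes);
    return delta / frame * frame;
}
```

With a policy of `{19200, 57600}` and 4-byte frames, a tail of 4096 bytes yields a correction of +34304 bytes of padding, a tail of 60000 bytes yields -21600 bytes, and anything inside the range yields 0. Steering toward the midpoint rather than the nearest edge is a design choice that avoids correcting again immediately.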