FFmpeg: Is it possible to run block-by-block piped I/O (filtering) of PCM audio data?

Question

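I am developing a Python FFmpeg wrapper called ffmpegio, and one feature I want to implement is block-wise filtering of raw video and audio data. A block of data is piped to FFmpeg, Python waits for FFmpeg to process it and pipe back whatever output is available, then rinse and repeat. I have this working for video feeds, but I am running into trouble with PCM audio I/O. The PCM encoder or decoder appears to block until stdin is closed. Is there any way around this behavior?

This question is related to another question, "FFmpeg blocking pipe until done?", but none of its answers apply (I think).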

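Edit #1: (removed a large amount of the original text for clarity)

Here is a minimal Python example.

First, here is the common script, in which load_setup() supplies the video or audio data: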

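import shlex
import subprocess as sp
from threading import Thread
from time import sleep

import numpy as np  # used by load_setup(), which must be defined before this point


def reader(stdout, nbytes):
    # nbytes (the expected output size) matches the Thread call below but is unused here
    print("reading stdout...")
    y = stdout.read(1)  # blocks until FFmpeg emits its first output byte
    print("  stdout: read the first byte")
    try:
        stdout.read()  # drain the rest until EOF
    except Exception:
        pass  # stdout was closed by the main thread


def logger(stderr):
    print("log stderr...")
    l = stderr.readline()  # blocks until FFmpeg logs its first line
    print(f"  stderr: {l.decode('utf8')}")
    while l:  # keep draining; an empty read means EOF
        try:
            l = stderr.readline()
        except Exception:
            break  # stderr was closed by the main thread


cmd, x = load_setup()  # <- 2 cases: video & audio
nbytes = x.size * x.itemsize

# split the command string so that Popen also works without a shell on POSIX
p = sp.Popen(shlex.split(cmd), stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE)

rd = Thread(target=reader, args=(p.stdout, nbytes))
rd.start()
lg = Thread(target=logger, args=(p.stderr,))
lg.start()

try:
    print("written input data to buffer")
    p.stdin.write(x)
    print("written input data")

    sleep(1)
    print("slept 1 second, closing stdin")
finally:
    p.stdin.close()
    print("stdin closed")
    p.stdout.close()
    p.stderr.close()
    rd.join()
    lg.join()
    p.wait()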

First, the rawvideo I/O case, with the setup function:

def load_setup():
    return (
        "ffmpeg -hide_banner -f rawvideo -s 100x100 -pix_fmt rgb24 -i - -vf 'transpose' -f rawvideo -s 100x100 -",
        np.ones((100, 100, 3), "u1"),
    )

This yields the following output:

reading stdout...
written input data to buffer
log stderr...
written input data
  stderr: Input #0, rawvideo, from 'pipe:':

  stdout: read the first byte
slept 1 second, closing stdin
stdin closed

Note that the stderr: ... and stdout: ... lines appear before slept 1 second, closing stdin.

Now, the audio counterpart:

def load_setup():
    return (
        "ffmpeg -hide_banner -f f64le -ar 8000 -ac 1 -i - -af lowpass -f f64le -ac 1 -",
        np.ones((16000, 1)),
    )

which returns

reading stdout...
written input data to bufferlog stderr...

written input data
slept 1 second, closing stdin
stdin closed
  stderr: Guessed Channel Layout for Input Stream #0.0 : mono

  stdout: read the first byte

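Here, both the stderr and stdout lines appear after stdin closed, indicating that FFmpeg outputs the filtered audio samples only after the stdin pipe is closed. This behavior persists with different numbers of samples and with additional stdin.write() calls.

So the question is whether there is any workaround that makes audio I/O behave like video I/O, that is, output something right after the initial write.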

I skimmed pcm.c in the FFmpeg repository, and the PCM encoder does not look right to me. So I am looking for a workaround that is simpler than using an AVI container.

Edit #2: Modified the examples to read only the first byte, to use a different audio filter, and to use more audio samples.

Answer 1

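In case anyone else is curious, I can answer my own question after running a longer experiment.

(Speculation) The PCM encoder/decoder (both using pcm_f32le) initially over-buffers its input, and the maximum buffer size appears to depend on the sampling rate. It maxes out between 51200 and 52224 samples.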

Input sampling rate (S/s)	Samples written when the stderr log appears
44100	51200-52224
32000	51200-52224
16000	32768-33792
8000	16384-17408
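
To get a feel for the latency these thresholds imply, the sample counts can be converted to seconds of audio. This is only a back-of-the-envelope sketch: it assumes the table's second column counts input samples (as its header suggests), and buffered_seconds is an illustrative helper, not part of FFmpeg or ffmpegio.

```python
# Thresholds copied from the table above:
# (input sampling rate in S/s, lower bound, upper bound) in samples.
observations = [
    (44100, 51200, 52224),
    (32000, 51200, 52224),
    (16000, 32768, 33792),
    (8000, 16384, 17408),
]


def buffered_seconds(rate, n_samples):
    """Duration, in seconds, of n_samples of audio at the given sampling rate."""
    return n_samples / rate


for rate, lo, hi in observations:
    # Assumption: the bounds are input sample counts, so dividing by the rate gives time.
    print(
        f"{rate:>6} S/s: output starts after "
        f"{buffered_seconds(rate, lo):.3f} to {buffered_seconds(rate, hi):.3f} s of input"
    )
```

Under that assumption, every rate in the table needs more than one second of input before anything comes back, and the two lowest rates need roughly two seconds.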

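Once the configuration log has been posted on stderr, the output floodgate opens, and the output eventually settles to the expected number of output samples per input sample.

Here is a log from repeatedly writing 1024 samples at a time. The filter is afade, so we expect the same number of output samples. The stdout read operation reads 1024 bytes at a time and puts them on a queue on the reader thread; the main thread retrieves the blocks with a timeout set to 10 ms.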

# (block#) - output (#read) samples (unaccounted output samples)
49 - output 3072 samples (49152)
50 - output 9216 samples (40960)
51 - output 10240 samples (31744)
52 - output 5120 samples (27648)
53 - output 5120 samples (23552)
54 - output 6144 samples (18432)
55 - output 5120 samples (14336)
56 - output 10240 samples (5120)
57 - output 4096 samples (2048)
58 - output 1024 samples (2048)
59 - output 1024 samples (2048)
60 - output 1024 samples (2048)
...
934 - output 1024 samples (2048)
935 - output 1024 samples (2048)
936 - output 1024 samples (1536)
937 - output 1024 samples (512)
end - output 512 samples (0)
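
The reader-thread-plus-queue arrangement used to produce this log can be sketched roughly as follows. This is a minimal sketch and not the actual experiment code: start_reader, drain, run_blockwise and ECHO_CMD are hypothetical names, and ECHO_CMD is a stand-in child process (a Python one-liner that copies stdin to stdout) so the sketch can be tried without FFmpeg.

```python
import queue
import subprocess as sp
import sys
from threading import Thread

# Stand-in child (assumption, not FFmpeg): copies stdin to stdout once stdin hits EOF.
ECHO_CMD = [sys.executable, "-c",
            "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"]


def start_reader(stdout, blocksize=1024):
    """Read fixed-size chunks from stdout on a background thread and queue them."""
    q = queue.Queue()

    def run():
        while True:
            chunk = stdout.read(blocksize)  # blocks until blocksize bytes or EOF
            if not chunk:
                q.put(None)  # sentinel: the child closed its stdout
                return
            q.put(chunk)

    Thread(target=run, daemon=True).start()
    return q


def drain(q, timeout=0.01):
    """Collect queued chunks until the queue stays empty for `timeout` seconds.

    Returns (chunks, done), where done is True once the EOF sentinel was seen.
    """
    chunks = []
    while True:
        try:
            chunk = q.get(timeout=timeout)
        except queue.Empty:
            return chunks, False
        if chunk is None:
            return chunks, True
        chunks.append(chunk)


def run_blockwise(cmd, blocks, timeout=0.01):
    """Write blocks one at a time; return bytes read back per write and after closing stdin."""
    p = sp.Popen(cmd, stdin=sp.PIPE, stdout=sp.PIPE)
    q = start_reader(p.stdout)
    per_block, done = [], False
    for block in blocks:
        if done:  # the child already exited; stop writing
            break
        p.stdin.write(block)
        p.stdin.flush()
        chunks, done = drain(q, timeout)
        per_block.append(sum(len(c) for c in chunks))
    p.stdin.close()
    tail = 0
    while not done:  # after stdin is closed, wait for the remaining output
        chunks, done = drain(q, timeout)
        tail += sum(len(c) for c in chunks)
    p.wait()
    return per_block, tail
```

Calling run_blockwise(ECHO_CMD, blocks) with a list of bytes objects returns the number of bytes that came back after each write, plus whatever arrived only after stdin was closed. Passing an FFmpeg argument list instead of ECHO_CMD gives per-block counts comparable to the "output ... samples" column above, in bytes rather than samples.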

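So it takes 9 main-thread cycles to drain the built-up buffer, after which the filtering operation settles with a 2048-byte buffer. Interestingly, the buffer size shrinks at the very end, before stdin is closed, which happens between #937 and end.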
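
The bookkeeping behind the last column of the log can be checked with a few lines of Python. This is a sketch under one assumption that the post does not state outright: the "unaccounted" figure is a running balance, equal to the previous balance plus the 1024 samples written in the cycle minus the samples read back. The names log, BLOCK, is_consistent and catch_up are illustrative.

```python
# (block number, samples read from stdout, unaccounted samples), copied from the log above.
log = [
    (49, 3072, 49152),
    (50, 9216, 40960),
    (51, 10240, 31744),
    (52, 5120, 27648),
    (53, 5120, 23552),
    (54, 6144, 18432),
    (55, 5120, 14336),
    (56, 10240, 5120),
    (57, 4096, 2048),
    (58, 1024, 2048),
    (59, 1024, 2048),
    (60, 1024, 2048),
]

BLOCK = 1024  # samples written per cycle, per the experiment description


def is_consistent(rows, written_per_block=BLOCK):
    """Check unaccounted[n] == unaccounted[n-1] + written - read[n] for consecutive rows."""
    return all(
        cur[2] == prev[2] + written_per_block - cur[1]
        for prev, cur in zip(rows, rows[1:])
    )


# Cycles in which more came out than went in, i.e. the backlog was being drained.
catch_up = [block for block, read, _ in log if read > BLOCK]

print("running balance consistent:", is_consistent(log))
print("catch-up cycles:", len(catch_up))
```

With the rows shown, the balance rule holds for every consecutive pair, and the number of cycles that read back more than one block matches the 9 cycles mentioned above.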

@Rotem - Thank you for the exchange; it helped me work out the right experiment.
