Why is the mp3/wav duration different when I convert a numpy array into an audio file with ffmpeg (Python)?

Question

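I want to convert a numpy array that should contain 60 seconds of raw audio into a .wav and an .mp3 file. I tried to convert the array to the desired formats with ffmpeg (version 3.4.6). For comparison, I also used the soundfile module. Only the .wav file created by soundfile has the expected length of exactly 60 seconds. The .wav file created by ffmpeg is slightly shorter, and the .mp3 file is only about 32 seconds long.

I expect all exports to have the same length. What am I doing wrong?

Here is some sample code: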

import subprocess as sp
import numpy as np
import soundfile as sf

def data2audiofile(filename,data):
    out_cmds = ['ffmpeg',
                '-f', 'f64le',  # input: 64-bit float, little endian
                '-ar', '44100',  # input sample rate: 44100 Hz
                '-ac', '1',  # input: 1 channel (mono)
                '-i', '-',  # read the input from stdin (pipe)
                '-y',  # overwrite the output file if it already exists
                filename]
    pipe = sp.Popen(out_cmds, stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE) 
    pipe.stdin.write(data)


data = (np.random.randint(low=-32000, high=32000, size=44100*60)/32678).astype('<f8')

data2audiofile('ffmpeg_mp3.mp3',data)
data2audiofile('ffmpeg_wav.wav',data)
sf.write('sf_wav.wav',data,44100)

Here are the resulting files as displayed in Audacity:

Answer 1

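You need to close pipe.stdin and wait for the subprocess to finish.

Closing pipe.stdin flushes the stdin pipe and signals end-of-file to ffmpeg.
The topic is explained here: Writing to a python subprocess pipe: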

The key is to close stdin (flush and send EOF) before calling wait

Add the following lines after pipe.stdin.write(data):

pipe.stdin.close()
pipe.wait()
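
The same idea can also be expressed with Popen.communicate(), which writes the input, closes stdin and waits for the process in one call. The following is only a minimal sketch, not the asker's code: the helper names build_ffmpeg_cmd and pipe_bytes are hypothetical, stdout is discarded, and stderr is read only so that a failure can be reported. It assumes the samples are passed as bytes, for example data.tobytes() for the array from the question.

```python
import subprocess as sp


def build_ffmpeg_cmd(filename, samplerate=44100):
    """Same ffmpeg arguments as in the question (hypothetical helper)."""
    return ['ffmpeg',
            '-f', 'f64le',           # input: 64-bit float, little endian
            '-ar', str(samplerate),  # input sample rate in Hz
            '-ac', '1',              # input: 1 channel (mono)
            '-i', '-',               # read the input from stdin
            '-y',                    # overwrite the output file if it exists
            filename]


def pipe_bytes(cmd, raw_bytes):
    """Send raw_bytes to cmd's stdin, close it, and wait for the process."""
    proc = sp.Popen(cmd, stdin=sp.PIPE, stdout=sp.DEVNULL, stderr=sp.PIPE)
    # communicate() writes the input, closes stdin (EOF), reads stderr,
    # and waits for the process to exit.
    _, err = proc.communicate(input=raw_bytes)
    if proc.returncode != 0:
        raise RuntimeError(err.decode(errors='replace'))
    return proc.returncode


# Usage with the array from the question (requires ffmpeg on PATH):
# pipe_bytes(build_ffmpeg_cmd('ffmpeg_wav.wav'), data.tobytes())
```

Reading stderr through communicate() also avoids the case where the child blocks because nobody drains a pipe it is writing its log output to.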

You can also try setting a large buffer size in sp.Popen:

pipe = sp.Popen(out_cmds, stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE, bufsize=10**8)
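
To check whether the exported .wav file now has the expected length, one option is to read its header with the standard wave module and divide the frame count by the sample rate. This is a rough sketch under the assumption that the file is plain PCM WAV, which the wave module can read; wav_duration_seconds is a hypothetical helper name, and it does not cover the .mp3 file.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, 'rb') as wav_file:
        # duration = number of frames / frames per second
        return wav_file.getnframes() / wav_file.getframerate()


# Usage (file name taken from the question):
# print(wav_duration_seconds('ffmpeg_wav.wav'))
```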
