Why does my ADPCM decoder seem to oscillate?

I'm writing code for an embedded processor (ARM Cortex-M4).

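The code is meant to decode 4-bit ADPCM in the Intel/DVI format (also known as the IMA format). I encoded a square-wave sample as ADPCM using Python's audioop module. I then decoded that sample successfully with the same audioop module, and the result matched the input closely.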

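However, I can't get the input data to decode correctly on my embedded processor. The valpred value, which represents the output, seems to run away and oscillate between large positive and large negative values. This appears to be driven by the behaviour of the sign value. What puzzles me is that this code is essentially a transcription of audioop's C implementation with the Python parts removed. As far as I can tell, the algorithm is identical. Yet for almost every input value it stays in oscillation. This is clearly driven by sign flipping the direction of vpdiff, but I can't see how to avoid that, because the quantizer step size is so high (often at the maximum step index of 88) and the data really does appear to have alternating signs.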

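Here is the implementation I'm currently using. The adpcm_step_size array holds the quantizer step sizes (e.g. 7, 8, 9 ... 29794, 32767), and adpcm_step_size_adapt holds the step index increments (-1, -1, -1, -1, 2, 4, 6, 8, repeated).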

void audio_adpcm_play(uint8_t *sample_data, uint16_t sample_size)
{
    int sign, delta, step, vpdiff, valpred, index, half;
    uint32_t debug_data;
    uint32_t result;
    uint8_t data = 0x00;

    // Initial state
    half = 0;
    valpred = 0;
    index = 0;
    step = adpcm_step_size[index];

    while(sample_size > 0) {
        // Extract the next 4-bit code (high nibble first, then low nibble)
        if(half) {
            delta = data & 0x0f;
        } else {
            data = *sample_data++;
            delta = (data >> 4) & 0x0f;
            sample_size--;
        }

        half = !half;
        debug_data = delta;

        // Find new index value
        index += adpcm_step_size_adapt[delta];
        if(index < 0)
            index = 0;
        if(index > 88)
            index = 88;

        // Separate sign and magnitude
        sign = delta & 8;
        delta = delta & 7;

        // Compute difference and the new predicted value
        vpdiff = step >> 3;

        if(delta & 4)
            vpdiff += step;
        if(delta & 2)
            vpdiff += step >> 1;
        if(delta & 1)
            vpdiff += step >> 2;

        if(sign)
            valpred -= vpdiff;
        else
            valpred += vpdiff;

        // Clamp values that exceed the valid range
        if(valpred > 32767)
            valpred = 32767;
        else if(valpred < -32768)
            valpred = -32768;

        step = adpcm_step_size[index];

        result = (valpred + 32767) >> AUDIO_CODE_SHIFT;
        uart_printf(DBG_LVL_INFO, \
                "data=%02x,  source_byte=%02x,  samples_rem=%5d,  valpred=%7d,  vpdiff=%5d,  sign=%02x,  delta=%02x,  index=%3d,  step=%3d,  adapt=%3d,  res=%5d/%5d\r\n", \
                debug_data, data, sample_size, valpred, vpdiff, sign, delta, index, step, \
                adpcm_step_size_adapt[delta], result, AUDIO_CODE_DUTY_MAX);
    }
}
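
The two lookup tables used by the function are not shown in the question. As a sketch, assuming they are the standard IMA/DVI tables (which matches the values quoted in the text and the index/step pairs in the log), they could be declared as follows. The element types are an assumption, not taken from the original code.

```c
#include <stdint.h>

/* Assumed: standard IMA ADPCM step size table, 89 entries (index 0..88). */
const int16_t adpcm_step_size[89] = {
        7,     8,     9,    10,    11,    12,    13,    14,
       16,    17,    19,    21,    23,    25,    28,    31,
       34,    37,    41,    45,    50,    55,    60,    66,
       73,    80,    88,    97,   107,   118,   130,   143,
      157,   173,   190,   209,   230,   253,   279,   307,
      337,   371,   408,   449,   494,   544,   598,   658,
      724,   796,   876,   963,  1060,  1166,  1282,  1411,
     1552,  1707,  1878,  2066,  2272,  2499,  2749,  3024,
     3327,  3660,  4026,  4428,  4871,  5358,  5894,  6484,
     7132,  7845,  8630,  9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794,
    32767
};

/* Assumed: index adjustment per 4-bit code. The second half repeats the
 * first because bit 3 is only the sign and does not affect adaptation. */
const int8_t adpcm_step_size_adapt[16] = {
    -1, -1, -1, -1, 2, 4, 6, 8,
    -1, -1, -1, -1, 2, 4, 6, 8
};
```

Because the adaptation table repeats, indexing it with either the full 4-bit code or the 3-bit magnitude gives the same increment.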

Here is the output for the square-wave input; as you can see, valpred oscillates rapidly between two values when it should settle at a given value.

data=07,  source_byte=f7,  samples_rem= 7999,  valpred=     19,  vpdiff=   30,  sign=00,  delta=07,  index= 16,  step= 34,  adapt=  8,  res=  128/  256
data=0f,  source_byte=f7,  samples_rem= 7998,  valpred=    -44,  vpdiff=   63,  sign=08,  delta=07,  index= 24,  step= 73,  adapt=  8,  res=  127/  256
data=07,  source_byte=f7,  samples_rem= 7998,  valpred=     92,  vpdiff=  136,  sign=00,  delta=07,  index= 32,  step=157,  adapt=  8,  res=  128/  256
data=0f,  source_byte=f7,  samples_rem= 7997,  valpred=   -201,  vpdiff=  293,  sign=08,  delta=07,  index= 40,  step=337,  adapt=  8,  res=  127/  256
data=07,  source_byte=f7,  samples_rem= 7997,  valpred=    430,  vpdiff=  631,  sign=00,  delta=07,  index= 48,  step=724,  adapt=  8,  res=  129/  256
data=0f,  source_byte=f7,  samples_rem= 7996,  valpred=   -927,  vpdiff= 1357,  sign=08,  delta=07,  index= 56,  step=1552,  adapt=  8,  res=  124/  256
data=07,  source_byte=f7,  samples_rem= 7996,  valpred=   1983,  vpdiff= 2910,  sign=00,  delta=07,  index= 64,  step=3327,  adapt=  8,  res=  135/  256
data=0f,  source_byte=f7,  samples_rem= 7995,  valpred=  -4253,  vpdiff= 6236,  sign=08,  delta=07,  index= 72,  step=7132,  adapt=  8,  res=  111/  256
data=07,  source_byte=f7,  samples_rem= 7995,  valpred=   9119,  vpdiff=13372,  sign=00,  delta=07,  index= 80,  step=15289,  adapt=  8,  res=  163/  256
data=0d,  source_byte=d5,  samples_rem= 7994,  valpred= -11903,  vpdiff=21022,  sign=08,  delta=05,  index= 84,  step=22385,  adapt=  4,  res=   81/  256
data=05,  source_byte=d5,  samples_rem= 7994,  valpred=  18876,  vpdiff=30779,  sign=00,  delta=05,  index= 88,  step=32767,  adapt=  4,  res=  201/  256
data=0b,  source_byte=b3,  samples_rem= 7993,  valpred=  -9793,  vpdiff=28669,  sign=08,  delta=03,  index= 87,  step=29794,  adapt= -1,  res=   89/  256
data=03,  source_byte=b3,  samples_rem= 7993,  valpred=  16276,  vpdiff=26069,  sign=00,  delta=03,  index= 86,  step=27086,  adapt= -1,  res=  191/  256
data=0c,  source_byte=c4,  samples_rem= 7992,  valpred= -14195,  vpdiff=30471,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   72/  256
data=04,  source_byte=c4,  samples_rem= 7992,  valpred=  22667,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=09,  source_byte=9c,  samples_rem= 7991,  valpred=  10381,  vpdiff=12286,  sign=08,  delta=01,  index= 87,  step=29794,  adapt= -1,  res=  168/  256
data=0c,  source_byte=9c,  samples_rem= 7991,  valpred= -23137,  vpdiff=33518,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=04,  source_byte=4c,  samples_rem= 7990,  valpred=  13725,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  181/  256
data=0c,  source_byte=4c,  samples_rem= 7990,  valpred= -23137,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=04,  source_byte=4c,  samples_rem= 7989,  valpred=  13725,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  181/  256
data=0c,  source_byte=4c,  samples_rem= 7989,  valpred= -23137,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=04,  source_byte=4c,  samples_rem= 7988,  valpred=  13725,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  181/  256
data=0c,  source_byte=4c,  samples_rem= 7988,  valpred= -23137,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=04,  source_byte=4c,  samples_rem= 7987,  valpred=  13725,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  181/  256
data=0c,  source_byte=4c,  samples_rem= 7987,  valpred= -23137,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=04,  source_byte=4c,  samples_rem= 7986,  valpred=  13725,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  181/  256
data=0c,  source_byte=4c,  samples_rem= 7986,  valpred= -23137,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=04,  source_byte=4c,  samples_rem= 7985,  valpred=  13725,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  181/  256
data=0c,  source_byte=4c,  samples_rem= 7985,  valpred= -23137,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=04,  source_byte=4c,  samples_rem= 7984,  valpred=  13725,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  181/  256
data=0c,  source_byte=4c,  samples_rem= 7984,  valpred= -23137,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=01,  source_byte=14,  samples_rem= 7983,  valpred= -10851,  vpdiff=12286,  sign=00,  delta=01,  index= 87,  step=29794,  adapt= -1,  res=   85/  256
data=04,  source_byte=14,  samples_rem= 7983,  valpred=  22667,  vpdiff=33518,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=0c,  source_byte=c4,  samples_rem= 7982,  valpred= -14195,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   72/  256
data=04,  source_byte=c4,  samples_rem= 7982,  valpred=  22667,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=0c,  source_byte=c4,  samples_rem= 7981,  valpred= -14195,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   72/  256
data=04,  source_byte=c4,  samples_rem= 7981,  valpred=  22667,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=0c,  source_byte=c4,  samples_rem= 7980,  valpred= -14195,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   72/  256
data=04,  source_byte=c4,  samples_rem= 7980,  valpred=  22667,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=0c,  source_byte=c4,  samples_rem= 7979,  valpred= -14195,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   72/  256
data=04,  source_byte=c4,  samples_rem= 7979,  valpred=  22667,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=0c,  source_byte=c4,  samples_rem= 7978,  valpred= -14195,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   72/  256
data=04,  source_byte=c4,  samples_rem= 7978,  valpred=  22667,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=0c,  source_byte=c4,  samples_rem= 7977,  valpred= -14195,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   72/  256
data=04,  source_byte=c4,  samples_rem= 7977,  valpred=  22667,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=0c,  source_byte=c4,  samples_rem= 7976,  valpred= -14195,  vpdiff=36862,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   72/  256
data=04,  source_byte=c4,  samples_rem= 7976,  valpred=  22667,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  216/  256
data=09,  source_byte=9c,  samples_rem= 7975,  valpred=  10381,  vpdiff=12286,  sign=08,  delta=01,  index= 87,  step=29794,  adapt= -1,  res=  168/  256
data=0c,  source_byte=9c,  samples_rem= 7975,  valpred= -23137,  vpdiff=33518,  sign=08,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=   37/  256
data=04,  source_byte=4c,  samples_rem= 7974,  valpred=  13725,  vpdiff=36862,  sign=00,  delta=04,  index= 88,  step=32767,  adapt=  2,  res=  181/  256
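
The vpdiff values in this log can be reproduced by hand, which suggests the decoder arithmetic itself follows the algorithm. The sketch below isolates that calculation; the function name ima_vpdiff is hypothetical and not part of the original code. Note that the step column in each log line is the step already updated for the next code, so each vpdiff is computed from the step printed on the previous line.

```c
#include <stdint.h>

/* Magnitude of the predicted difference for one 4-bit code, given the
 * step size that was current BEFORE this code was processed.
 * Bit 3 of the code (the sign) is ignored here. */
int32_t ima_vpdiff(int32_t step, uint8_t code)
{
    int32_t vpdiff = step >> 3;      /* always add step/8 */

    if (code & 4)
        vpdiff += step;              /* + step   */
    if (code & 2)
        vpdiff += step >> 1;         /* + step/2 */
    if (code & 1)
        vpdiff += step >> 2;         /* + step/4 */

    return vpdiff;
}
```

For example, with a step of 32767 and a magnitude of 4, this gives 4095 + 32767 = 36862, the value that repeats throughout the log. With the step pinned at the top of the table and the sign alternating on every code, valpred necessarily swings by that amount each sample.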

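If I keep only every second sample, the result is almost acceptable for a square wave, but it breaks down for other waveforms. That's still not an acceptable solution, but perhaps it's a clue to the cause.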

If anyone has any ideas, I'd be grateful. I've been tearing my hair out over this for days.

Edit: the source code for the audioop module can be found at https://github.com/python/cpython/blob/master/Modules/audioop.c; the ADPCM decoder is audioop_adpcm2lin_impl.

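I managed to solve this. It was caused by a silly mistake: I was reading the 16-bit input data one byte at a time, and decompressing the data in Python with the same mistake produced the correct result. That obviously didn't work for the C implementation of the decoder.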

In hindsight, I'm not sure why I didn't notice that the audio file was twice as large as it should have been.
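
The faulty code isn't shown, so the following is only an assumed illustration of this kind of mistake, with hypothetical function names. It contrasts unpacking little-endian 16-bit PCM correctly with treating every byte as its own 8-bit sample (here scaled to the 16-bit range, which is also an assumption).

```c
#include <stddef.h>
#include <stdint.h>

/* Intended behaviour: combine each little-endian byte pair into one
 * signed 16-bit sample. Returns the number of samples written. */
size_t pcm16le_unpack(const uint8_t *bytes, size_t nbytes, int16_t *out)
{
    size_t n = nbytes / 2;

    for (size_t i = 0; i < n; i++) {
        uint16_t u = (uint16_t)(bytes[2 * i] |
                                ((uint16_t)bytes[2 * i + 1] << 8));
        out[i] = (int16_t)u;
    }
    return n;
}

/* Assumed mistake: treat every byte as a separate signed 8-bit sample.
 * This yields twice as many samples, alternating between the low byte
 * and the high byte of each original 16-bit value. */
size_t pcm8_unpack(const uint8_t *bytes, size_t nbytes, int16_t *out)
{
    for (size_t i = 0; i < nbytes; i++)
        out[i] = (int16_t)((int8_t)bytes[i] * 256);
    return nbytes;
}
```

Under these assumptions, a square wave at +16384 and -16384 (bytes 00 40 and 00 C0) becomes the sequence 0, 16384, 0, -16384 when read bytewise. That would be consistent with the symptoms described: the sign alternates on nearly every code, keeping every second sample gives a roughly correct square wave, and the data is twice the expected size.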