Mix second audio clip at specific SNR to original audio file in Python

I have two audio files that I want to mix in Python.

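I'll call the original audio "audio A" and the audio to be mixed in "audio B". I was able to add white noise at a specific SNR to the audio A signal, following the description here: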
import numpy as np
import librosa

audio, sr = librosa.load(file_name, sr=None, res_type='kaiser_fast')
power = audio ** 2  # instantaneous power of each sample
snr_dB = 0  # target SNR in dB
signal_average_power = np.mean(power)  # average signal power
signal_averagepower_dB = 10 * np.log10(signal_average_power)  # convert signal power to dB
noise_dB = signal_averagepower_dB - snr_dB  # noise power in dB
noise_watts = 10 ** (noise_dB / 10)  # convert noise power from dB to linear units
# Generate a sample of white noise
mean_noise = 0
noise = np.random.normal(mean_noise, np.sqrt(noise_watts), len(audio))

noise_signal = (audio + noise) / 1.3  # To prevent clipping of signal

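In the code given here, I used an SNR of 0 dB.

How do I now use "audio B" as the noise source instead of white noise and still get an SNR of 0 dB? That is, how do I replace the np.random.normal noise with audio B as the noise injected into the original signal "audio A" at SNR = 0?

Any help and guidance is sincerely appreciated!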

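You have to make sure both audio clips have the same duration; then you can compute the gain that gives the desired SNR. If you simply add the two, you change the energy of the input signal, so I scale both the signal and the noise so that the noisy signal has the same energy as the clean signal (assuming the noise is uncorrelated with the signal).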
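The gain and the energy constraint described above can be written out as follows. This is a sketch under the stated assumption that signal and noise are uncorrelated; the symbols $P_s$, $P_n$ (mean power of the signal and of the length-matched noise) and $r$ (noise-to-signal power ratio) are introduced here for illustration.

```latex
% P_s, P_n: mean power of signal and noise; r: noise-to-signal power ratio
r = 10^{-\mathrm{SNR}_{\mathrm{dB}}/10}, \qquad
g = \sqrt{\frac{r\,P_s}{P_n}}
\quad\Longrightarrow\quad
10\log_{10}\frac{P_s}{g^2 P_n} = \mathrm{SNR}_{\mathrm{dB}}

% Energy-preserving mix y = a s + b n with uncorrelated s and n
\frac{b}{a} = g, \qquad a^2 P_s + b^2 P_n = P_s
\quad\Longrightarrow\quad
a = \frac{1}{\sqrt{1 + r}}, \qquad b = g\,a
```

Note that $1 + r$ coincides with $1 + g^2$ only when the two clips have the same mean power.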

import numpy as np

def mix_audio(signal, noise, snr):
    # Match the noise length to the audio: if the audio is longer
    # than the noise, the noise is played on repeat for the duration
    # of the audio; if it is shorter, the noise is truncated.
    noise = noise[np.arange(len(signal)) % len(noise)]
    
    # Convert to float. This is important if loading resulted in
    # uint8 or uint16 types, because it would cause overflow
    # when squaring and calculating the mean.
    noise = noise.astype(np.float32)
    signal = signal.astype(np.float32)
    
    # get the initial energy for reference
    signal_energy = np.mean(signal**2)
    noise_energy = np.mean(noise**2)
    # calculates the gain to be applied to the noise 
    # to achieve the given SNR
    g = np.sqrt(10.0 ** (-snr/10) * signal_energy / noise_energy)
    
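    # Assumes signal and noise to be decorrelated
    # and calculates (a, b) such that the energy of
    # a*signal + b*noise matches the energy of the input signal.
    # With r = 10**(-snr/10) as the noise-to-signal power ratio,
    # this requires a**2 * (1 + r) == 1 and b == g * a.
    r = 10.0 ** (-snr / 10)
    a = np.sqrt(1 / (1 + r))
    b = g * a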
    print(g, a, b)
    # mix the signals
    return a * signal + b * noise

Here is an example of how to use the function:

import matplotlib.pyplot as plt

# random signal taking the values -0.5 and +0.5
signal = np.random.randint(0, 2, 10**7) - 0.5
# use some non-standard noise distribution
noise = np.sin(np.random.randn(6*10**7))
noisy = mix_audio(signal, noise, 10)
plt.hist(noisy, bins=300)
plt.show()
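For the 0 dB case from the question, a variant that leaves audio A untouched and scales only audio B may be closer to the original white-noise code. The sketch below is an illustration, not part of the answer above: the helper names `scale_noise_to_snr` and `measured_snr_db` are hypothetical, and the two arrays are synthetic stand-ins for audio A and audio B rather than real recordings. It also measures the SNR from the two components as a sanity check.

```python
import numpy as np

def scale_noise_to_snr(signal, noise, snr_db):
    """Return the noise, looped or truncated to len(signal), scaled for snr_db."""
    signal = np.asarray(signal, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Loop the noise if it is shorter than the signal, truncate it if longer
    noise = noise[np.arange(len(signal)) % len(noise)]
    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    # Gain that brings the noise power to signal_power / 10**(snr_db / 10)
    gain = np.sqrt(10.0 ** (-snr_db / 10) * signal_power / noise_power)
    return gain * noise

def measured_snr_db(signal, scaled_noise):
    """SNR in dB computed from the mean powers of the two components."""
    signal = np.asarray(signal, dtype=np.float64)
    return 10 * np.log10(np.mean(signal ** 2) / np.mean(scaled_noise ** 2))

# Synthetic stand-ins for audio A and audio B (assumptions, not real files)
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
audio_a = 0.5 * np.sin(2 * np.pi * 440 * t)   # 1 s tone
audio_b = rng.uniform(-1.0, 1.0, sr // 2)     # 0.5 s of noise, shorter than A

noise_b = scale_noise_to_snr(audio_a, audio_b, snr_db=0)
mixed = audio_a + noise_b

# Rescale only if the sum would leave the [-1, 1] range
peak = np.max(np.abs(mixed))
if peak > 1.0:
    mixed = mixed / peak

print(measured_snr_db(audio_a, noise_b))
```

Dividing the whole mix by its peak scales both components equally, so the SNR is unchanged; it plays the same role as the fixed `/ 1.3` in the question but adapts to the actual peak.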