Remove/control clicking sound using PyAudio as an oscillator

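When this runs, there is a clicking sound between tones. I don't mind the clicking too much; it has a pleasant rhythm. That said...

I've seen this thread, but I couldn't figure out how to apply it to my problem: How to remove pops from concatented sound data in PyAudio

Any ideas? Thanks for your time!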

import numpy
import pyaudio
import math
import random


def sine(frequency, length, rate):
    length = int(length * rate)
    factor = float(frequency) * (math.pi * 2) / rate
    waveform = numpy.sin(numpy.arange(length) * factor)
    return waveform


def play_tone(stream, frequency, length, rate=44100):
    chunks = []
    chunks.append(sine(frequency, length, rate))

    chunk = numpy.concatenate(chunks) * .25

    # tobytes() replaces the deprecated tostring()
    stream.write(chunk.astype(numpy.float32).tobytes())


def bassline():
    frequency = 300
    for i in range(1000000):
        play_tone(stream, frequency, .15)
        change = random.choice([-75, -75, -10, 10, 2, 3, 100, -125])
        print(frequency)
        if frequency < 0:
            frequency = random.choice([100, 200, 250, 300])
        else:
            frequency = frequency + change

if __name__ == '__main__':
    p = pyaudio.PyAudio()
    stream = p.open(format=pyaudio.paFloat32,
                    channels=1, rate=44100, output=True)

    bassline()

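/Edit

I've plotted the tones, and it looks like the discontinuity comes from the mismatch between the phase at the end of one tone and the phase at the start of the next.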

First tone

Second tone

Is there a way to fix this?
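The same discontinuity can be checked numerically instead of by eye. This is a minimal sketch that reuses the `sine()` function from the code above; the helper name `boundary_jump` and the 225 Hz / 300 Hz pair are illustrative choices, not part of the original script.

```python
import math
import numpy

RATE = 44100


def sine(frequency, length, rate=RATE):
    """Same generator as in the question: every tone restarts at phase 0."""
    n = int(length * rate)
    factor = float(frequency) * (math.pi * 2) / rate
    return numpy.sin(numpy.arange(n) * factor)


def boundary_jump(freq_a, freq_b, length=0.15, rate=RATE):
    """Return (jump at the seam, largest sample-to-sample step inside a tone)."""
    a = sine(freq_a, length, rate)
    b = sine(freq_b, length, rate)
    seam = abs(b[0] - a[-1])
    inside = max(numpy.abs(numpy.diff(a)).max(),
                 numpy.abs(numpy.diff(b)).max())
    return seam, inside


seam, inside = boundary_jump(225, 300)
print("jump at seam: %.3f, largest normal step: %.3f" % (seam, inside))
```

A seam jump much larger than the normal step between samples is what is heard as a click.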

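As your two waveform images show, you hear a click when you switch frequencies because the waveform's amplitude changes abruptly. To fix this, you need to preserve the waveform's phase when you change frequency. I think the simplest way is to add a variable that records the last position in the waveform cycle after each call to sine. That end position can then be used as the start position for the next call to sine.

Something like: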

phase_start = phase_position
phase_end = phase_start + length
waveform = numpy.sin(numpy.arange(phase_start, phase_end) * factor)
phase_position = phase_end

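Note: I think this is the simplest answer, but I would recommend using the information from the question you referenced. A sample index only preserves the phase while the frequency stays the same; once factor changes, the same index maps to a different phase. You should therefore keep track of the phase of the sine wave being played in radians. How to remove pops from concatented sound data in PyAudio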
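Tracking the phase in radians could look roughly like the sketch below. It is one possible reading of the note above, not the code from the linked question; the `Oscillator` class and its attribute names are assumptions made for illustration.

```python
import math
import numpy

RATE = 44100


class Oscillator:
    """Sine generator that remembers its phase (in radians) between calls."""

    def __init__(self, rate=RATE):
        self.rate = rate
        self.phase = 0.0  # phase of the next sample to be generated

    def sine(self, frequency, length):
        n = int(length * self.rate)
        step = 2 * math.pi * float(frequency) / self.rate  # radians per sample
        wave = numpy.sin(self.phase + numpy.arange(n) * step)
        # Advance to where the next tone has to start; wrap to keep it small.
        self.phase = (self.phase + n * step) % (2 * math.pi)
        return wave


osc = Oscillator()
tones = [osc.sine(f, 0.15) for f in (300, 225, 310, 150)]
signal = numpy.concatenate(tones)
```

Because each tone starts exactly one sample step after the previous one ended, the seams are no steeper than the waveform itself. The result can be written to the stream the same way as before, for example with `stream.write((signal * 0.25).astype(numpy.float32).tobytes())`.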

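Thanks, Ehz and Matthias.

In the end, I solved the problem by fading each tone in and out over a couple of hundred samples (200 samples is about 4.5 ms at 44100 Hz). This is also a good way to control the clicking sound: the closer fade is to 0, the louder the click.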

import math
import numpy
import pyaudio


def sine(frequency, length, rate):
    length = int(length * rate)
    factor = (float(frequency) * (math.pi * 2) / rate)
    return numpy.sin(numpy.arange(length) * factor)


def play_tone(stream, frequency, length, rate=44100):
    chunks = [sine(frequency, length, rate)]

    chunk = numpy.concatenate(chunks) * 0.25

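    fade = 200  # fade length in samples; must be an int to be used as a slice index

    fade_in = numpy.linspace(0., 1., fade, endpoint=False)
    fade_out = numpy.linspace(1., 0., fade, endpoint=False)

    chunk[:fade] = numpy.multiply(chunk[:fade], fade_in)
    chunk[-fade:] = numpy.multiply(chunk[-fade:], fade_out)

    # tobytes() replaces the deprecated tostring()
    stream.write(chunk.astype(numpy.float32).tobytes())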


def test():
    test_freqs = [50, 100, 200, 400, 800, 1200, 2000, 3200]

    for i in range(2):
        for freq in test_freqs:
            play_tone(stream, freq, 1)


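if __name__ == '__main__':
    p = pyaudio.PyAudio()
    stream = p.open(format=pyaudio.paFloat32,
                    channels=1, rate=44100, output=True)

    test()

    # Release the audio device once playback has finished
    stream.stop_stream()
    stream.close()
    p.terminate()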
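If it is more convenient to think about the fade as a duration, the same linear fade can be wrapped in a small helper. This is only a sketch: the name `apply_fade` and the `fade_ms` parameter are made up for illustration, and the 5 ms default is an arbitrary starting point close to the 200 samples used above.

```python
import numpy


def apply_fade(chunk, fade_ms=5.0, rate=44100):
    """Return a copy of chunk with a linear fade-in and fade-out."""
    fade = int(rate * fade_ms / 1000.0)
    fade = min(fade, len(chunk) // 2)  # never let the two ramps overlap
    out = numpy.array(chunk, dtype=numpy.float64)
    if fade > 0:
        ramp = numpy.linspace(0.0, 1.0, fade, endpoint=False)
        out[:fade] *= ramp          # 0 -> just under 1
        out[-fade:] *= ramp[::-1]   # just under 1 -> 0
    return out


tone = numpy.ones(1000)
faded = apply_fade(tone, fade_ms=5.0)
```

Lowering `fade_ms` toward 0 brings the click back, which matches the behaviour described above.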

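The click happens because the phase at the end of one frequency's wave differs from the phase at the start of the next. Refer to the two images below: the plot of the first wave shows an ending value of about -0.96. The second image shows the next frequency, with an amplitude of about 0.85. If you don't shift each new wave accordingly, you will hear a noticeable click between frequencies. It turns out there is a very simple solution. Use numpy.arcsin() to compute and store the required phase shift, so that the waves keep running in step: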

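import math
import numpy

rate = 44100  # sample rate in Hz; the original snippet assumed this global

wave_delta_arcsin = 0.0  # phase offset carried over from the previous tone

def sine(frequency, length):
    global wave_delta_arcsin
    length = int(length * rate)
    factor = (math.pi * 2) * float(frequency) / rate
    wave = numpy.sin(numpy.arange(length) * factor + wave_delta_arcsin)
    # arcsin only returns angles in [-pi/2, pi/2], so this matches the last
    # sample's value but not necessarily whether the wave was rising or falling.
    wave_delta_arcsin = numpy.arcsin(wave[-1])
    return wave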
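One caveat with the arcsin approach: arcsin always returns an angle on the rising half of the sine, so a tone that ended while falling would restart at the right value but heading the wrong way. The sketch below is a possible refinement, not part of the answer above. The helper `next_phase` is a hypothetical name, and it assumes the direction can be judged from the last two samples, which may misjudge a tone that ends right at a peak or trough.

```python
import math
import numpy

RATE = 44100


def next_phase(wave):
    """Phase offset that restarts a sine at wave[-1], moving the same way."""
    phase = numpy.arcsin(numpy.clip(wave[-1], -1.0, 1.0))
    if len(wave) > 1 and wave[-1] < wave[-2]:
        # The tone was falling, so mirror the angle onto the falling half.
        phase = math.pi - phase
    return float(phase)


def sine(frequency, length, phase=0.0, rate=RATE):
    n = int(length * rate)
    factor = (math.pi * 2) * float(frequency) / rate
    return numpy.sin(numpy.arange(n) * factor + phase)


first = sine(310, 0.15)                            # ends while falling
second = sine(225, 0.15, phase=next_phase(first))  # continues falling
```

Passing the phase as an argument instead of using a global is a design choice made here so the two tones can be generated and compared independently.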