Make chunks in audio files with overlap in Python
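I want to split my audio file into chunks that overlap. For example, if each chunk is 4 seconds long, the first chunk runs from 0 to 4, and the overlap is 1 second, then the second chunk should run from 3 to 7. Following the question "How to splice an audio file (wav format) into 1 sec splices in python?", I used the pydub module and its make_chunks(your_audio_file_object, chunk_length_ms) method, but it produces no overlap between chunks; it just cuts the audio file into fixed-length pieces. Does anyone have an idea for this? Thanks.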
Here is one way to do it:
import numpy as np
from scipy.io import wavfile

# `path` is the location of your WAV file; define it before this line.
frequency, signal = wavfile.read(path)

slice_length = 4  # in seconds
overlap = 1  # in seconds
# Start time (in seconds) of each slice. np.int was removed in NumPy 1.24, so use int.
slices = np.arange(0, len(signal) / frequency, slice_length - overlap, dtype=int)

for start, end in zip(slices[:-1], slices[1:]):
    start_audio = start * frequency
    end_audio = (end + overlap) * frequency
    audio_slice = signal[int(start_audio):int(end_audio)]  # one overlapping chunk
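Essentially, we do the following:
- Load the file and its sampling frequency. For the sake of the example I assume the audio is single channel; multichannel works the same way, it just takes more code.
- Define the desired slice length and overlap. The np.arange call gives us the start time of each audio slice. Zipping consecutive start times and adding the overlap to each end gives the desired chunks.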
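Note that pairing consecutive start times leaves out whatever audio follows the last full pair. As a minimal sketch of one way to keep that tail, the helper below computes the chunk boundaries directly in samples. The name `chunk_bounds` and its parameters are my own for illustration; they are not part of NumPy or SciPy.

```python
def chunk_bounds(n_samples, frequency, slice_length, overlap):
    """Return (start, end) sample indices of overlapping chunks.

    slice_length and overlap are in seconds; frequency is in samples per second.
    Hypothetical helper written for this example.
    """
    chunk = int(slice_length * frequency)             # chunk size in samples
    step = int((slice_length - overlap) * frequency)  # hop between chunk starts
    if chunk <= 0 or step <= 0:
        raise ValueError("slice_length must be positive and larger than overlap")

    bounds = []
    start = 0
    while start < n_samples:
        end = min(start + chunk, n_samples)  # the last chunk may be shorter
        bounds.append((start, end))
        if end == n_samples:  # the end of the audio is covered, so stop
            break
        start += step
    return bounds


# Toy check with frequency = 1, so samples and seconds coincide.
bounds = chunk_bounds(n_samples=26, frequency=1, slice_length=4, overlap=1)
print(bounds)
```

With a signal loaded by `wavfile.read`, the chunks would then be `[signal[s:e] for s, e in chunk_bounds(len(signal), frequency, 4, 1)]`. Because the slicing happens along the first axis, this should also work for multichannel data, which `wavfile.read` returns with shape (samples, channels).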
To convince yourself that the slicing works, check this snippet:
import numpy as np

slice_length = 4  # in seconds
overlap = 1  # in seconds
slices = np.arange(0, 26, slice_length - overlap, dtype=int)  # 26 is arbitrary
frequency = 1  # one sample per second, so the printed indices read as seconds

for start, end in zip(slices[:-1], slices[1:]):
    start_audio = start * frequency
    end_audio = (end + overlap) * frequency
    print(start_audio, end_audio)
Output:
0 4
3 7
6 10
9 13
12 16
15 19
18 22
21 25
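Since the question uses pydub, the same idea can be sketched there as well. This assumes, as I understand pydub's behavior, that `len()` of an `AudioSegment` is its duration in milliseconds and that slicing it takes millisecond offsets. The function name `make_overlapping_chunks` is hypothetical (it is not a pydub function), and the demo uses a plain list as a stand-in so it runs without pydub or an audio file.

```python
def make_overlapping_chunks(audio, chunk_length_ms, overlap_ms):
    """Split a sliceable object into fixed-length chunks that overlap.

    Hypothetical helper: works on anything supporting len() and slicing,
    which is assumed to include pydub's AudioSegment (units: milliseconds).
    """
    if not 0 <= overlap_ms < chunk_length_ms:
        raise ValueError("overlap_ms must be >= 0 and smaller than chunk_length_ms")

    step = chunk_length_ms - overlap_ms  # distance between chunk starts
    total = len(audio)
    chunks = []
    start = 0
    while start < total:
        chunks.append(audio[start:start + chunk_length_ms])
        if start + chunk_length_ms >= total:  # reached the end of the audio
            break
        start += step
    return chunks


# Stand-in demo: a list of 10 items plays the role of 10 ms of audio.
demo = make_overlapping_chunks(list(range(10)), chunk_length_ms=4, overlap_ms=1)
print(demo)

# With pydub it might look like this ("audio.wav" is a placeholder file name):
# from pydub import AudioSegment
# audio = AudioSegment.from_wav("audio.wav")
# chunks = make_overlapping_chunks(audio, chunk_length_ms=4000, overlap_ms=1000)
```

The final chunk can be shorter than `chunk_length_ms` when the audio length is not an exact fit; drop it or pad it depending on what the downstream code expects.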