How to add seconds of silence at the end of a .wav file?
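I have 1,440 audio files to feed into a neural network. The problem is that they are not all the same length. I used an answer posted elsewhere, but it doesn't seem to work. I want to add a few seconds of silence at the end of my files and then trim them all to be 5 seconds long. Can someone please help me with this?
(I also tried using pysox, but that gives me the error "This install of SoX cannot process .wav files.")
I am using Google Colab for this. The code is: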
import wave, os, glob
from pydub import AudioSegment
from pydub.playback import play

path = 'drive/MyDrive/Ravdess/Sad' #This is the folder from my Google Drive which has the audio files
count = 0
for filename in glob.glob(os.path.join(path, '*.wav')):
    w = wave.open(filename, 'r')
    d = w.readframes(w.getnframes())
    frames = w.getnframes()
    rate = w.getframerate()
    duration = frames/float(rate)
    count+=1
    print(filename, "count =", count, "duration = ", duration)
    audio_in_file = filename
    audio_out_file = "out.wav"
    new_duration = duration
    #Only append silence until time = 5 seconds.
    one_sec = AudioSegment.silent(duration=2000) #duration in milliseconds
    song = AudioSegment.from_wav(audio_in_file)
    final_song = one_sec + song
    new_frames = w.getnframes()
    new_rate = w.getframerate()
    new_duration = new_frames/float(rate)
    final_song.export(audio_out_file, format="wav")
    print(final_song, "count =", count, "new duration = ", new_duration)
    w.close()
This gives the output:
drive/MyDrive/Ravdess/Sad/03-01-04-01-02-01-01.wav count = 1 duration = 3.5035
<pydub.audio_segment.AudioSegment object at 0x7fd5b7ca06a0> count = 1 new duration = 3.5035
drive/MyDrive/Ravdess/Sad/03-01-04-01-02-02-01.wav count = 2 duration = 3.370041666666667
<pydub.audio_segment.AudioSegment object at 0x7fd5b7cbc860> count = 2 new duration = 3.370041666666667
... (and so on for all the files)
Since you are already using pydub, I'd do something like this:
from pydub import AudioSegment

input_wav_file = "/path/to/input.wav"
output_wav_file = "/path/to/output.wav"
target_wav_time = 5 * 1000  # 5 seconds (or 5000 milliseconds)

original_segment = AudioSegment.from_wav(input_wav_file)
# len() of an AudioSegment is in milliseconds; never ask for negative silence
silence_duration = max(0, target_wav_time - len(original_segment))
silenced_segment = AudioSegment.silent(duration=silence_duration)
# Slicing to the target also trims clips that are longer than 5 seconds
combined_segment = (original_segment + silenced_segment)[:target_wav_time]
combined_segment.export(output_wav_file, format="wav")
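The snippet above handles a single file. For a whole folder, as in the question, the same pad-then-trim idea could be sketched with only the standard-library wave module, which the question already imports. This is a minimal sketch, not a definitive implementation: the names pad_or_trim_wav and pad_folder are hypothetical, and it assumes the files are uncompressed PCM WAV that wave can open. Each output keeps its input's file name, so results are not overwritten in a single out.wav.

```python
import glob
import os
import wave


def pad_or_trim_wav(in_path, out_path, target_seconds=5.0):
    """Write a copy of in_path that is exactly target_seconds long.

    Hypothetical helper: assumes uncompressed PCM WAV input.
    """
    with wave.open(in_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())

    bytes_per_frame = params.sampwidth * params.nchannels
    target_frames = int(round(target_seconds * params.framerate))
    target_bytes = target_frames * bytes_per_frame

    if len(frames) >= target_bytes:
        # Too long (or exact): keep only the first target_seconds.
        frames = frames[:target_bytes]
    else:
        # Too short: append silence. 8-bit WAV is unsigned, so its
        # silence is 0x80; wider sample widths are signed, so 0x00.
        pad_byte = b"\x80" if params.sampwidth == 1 else b"\x00"
        frames += pad_byte * (target_bytes - len(frames))

    with wave.open(out_path, "wb") as dst:
        dst.setparams(params)    # same channels, sample width, rate
        dst.writeframes(frames)  # header frame count is fixed up here


def pad_folder(in_dir, out_dir, target_seconds=5.0):
    """Pad or trim every .wav in in_dir, writing results into out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for in_path in sorted(glob.glob(os.path.join(in_dir, "*.wav"))):
        out_path = os.path.join(out_dir, os.path.basename(in_path))
        pad_or_trim_wav(in_path, out_path, target_seconds)
        written.append(out_path)
    return written
```

A call such as pad_folder('drive/MyDrive/Ravdess/Sad', 'drive/MyDrive/Ravdess/Sad_5s') would then process the folder from the question; the output folder name here is only an example. To check the result, the duration has to be read from the newly written file, not from the handle of the original one.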