WAV file modifier using Python
I wrote a simple Python program that reads a wave file and, after modifying it, stores it as a new file.
import codecs, wave

# Convert a number to its two's complement representation (a non-negative value is unchanged).
def convert_to_twos(value, wid_len=16):
    if value < 0:
        value = value + (1 << wid_len)
    return value

# Recover the signed value of a two's complement number.
def twos_back_value(value, wid_len=16):
    if value & (1 << wid_len - 1):
        value = value - (1 << wid_len)
    return value

# Opening files
input_file = wave.open(r"<address of input wave file>", 'r')
output_file = wave.open(r"<an address for output wave file>", 'w')

# Get the input file parameters and set them on the output file after modifying the channel count.
out_params = [None, None, None, None, None, None]
in_params = input_file.getparams()
out_params[0] = 1             # I want a mono wave file as output, so I set channels = 1
out_params[1] = in_params[1]  # Sample width (bytes)
out_params[2] = in_params[2]  # Sample rate
out_params[3] = in_params[3]  # Number of frames
out_params[4] = in_params[4]  # Compression type
out_params[5] = in_params[5]  # Compression name
output_file.setparams(out_params)

# Reading frames from the first file and storing them in the second file
for frame in range(out_params[2]):
    # Convert the first two bytes of each frame (assuming each channel is two bytes wide) from a byte string to an int.
    value = int(codecs.getencoder('hex')(input_file.readframes(1))[0][:4], 16)
    t_back_value = twos_back_value(value, out_params[1]*8)
    new_value = int(t_back_value * 1)
    new_twos = convert_to_twos(new_value, out_params[1]*8)
    to_write = new_twos.to_bytes((new_twos.bit_length() + 7) // 8, 'big')
    output_file.writeframes(to_write)

# Closing files
input_file.close()
output_file.close()
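The problem is that when I run the program above and play the output file, all I hear is noise and nothing else! (I expected the same audio, just in single-channel mode.)
Update:
I found something strange. According to the documentation, the function readframes(n) reads and returns at most n frames of audio, as a bytes object. So I expected this function to return only hex values. In practice, though, I see some strange non-hex values: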
read_frame = input_file.readframes(1)
print (read_frame)
print (codecs.getencoder('hex')(read_frame)[0])
print ("")
Inside the for loop, the code above prints:
b'\xe3\x00\xc7\xf5'
b'e300c7f5'
b'D\xe8\xa1\xfd'
b'44e8a1fd'
b'\xde\x08\xb2\x1c'
b'de08b21c'
b'\x17\xea\x10\xe9'
b'17ea10e9'
b'{\xf7\xbc\xf5'
b'7bf7bcf5'
b'*\xf6K\x08'
b'2af64b08'
As you can see, there are some non-hex values in read_frame! (For example *, {, D, ...) What are these?
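The values you are seeing are the four bytes of each frame: two bytes for the first channel and two bytes for the second. For a mono WAV you would see only two bytes.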
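The characters that look "non-hex" are not extra data. When Python displays a bytes object, it shows a byte as its ASCII character if it is printable and as a \xNN escape otherwise, so D and \x44 are the same byte. A small check, using the second frame from the output in the question:

```python
frame = b'D\xe8\xa1\xfd'

# Every element of a bytes object is an integer in range(256).
print(list(frame))    # [68, 232, 161, 253]
print(frame.hex())    # 44e8a1fd

# 'D' is simply how the byte 0x44 is displayed.
same_frame = bytes.fromhex('44e8a1fd')
print(frame == same_frame)    # True
```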
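The following approach should get you on the right track. You need to use Python's struct library to convert the binary frame values into signed integers. You can then manipulate them as needed. In my example, I simply multiply by 2/3: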
import wave
import codecs
import struct
#opening files
input_file = wave.open(r"sample.wav", 'rb')
output_file = wave.open(r"sample_out.wav", 'wb')
#Get input file parameters and set them to the output file after modifying the channel number.
in_params = list(input_file.getparams())
out_params = in_params[:]
out_params[0] = 1
output_file.setparams(out_params)
nchannels, sampwidth, framerate, nframes, comptype, compname = in_params
format = '<{}h'.format(nchannels)
#reading frames from first file and storing in the second file
for index in range(nframes):
    frame = input_file.readframes(1)
    data = struct.unpack(format, frame)
    value = data[0] # first (left) channel only
    value = (value * 2) // 3 # apply a simple function to each value
    output_file.writeframes(struct.pack('<h', value))
#closing files
input_file.close()
output_file.close()
Note that processing a wave file one frame at a time like this will be very slow. It can be sped up by reducing the number of calls to writeframes.
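One way to cut down the number of readframes and writeframes calls is to work on blocks of frames. The sketch below is only an illustration under stated assumptions: the function name stereo_to_scaled_mono and the chunk_frames parameter are made up for this example, and it handles 16-bit PCM input only.

```python
import struct
import wave


def stereo_to_scaled_mono(in_path, out_path, chunk_frames=4096):
    """Keep the first channel, scale it by 2/3, and write in large chunks.

    Hypothetical helper; assumes 16-bit PCM input (sampwidth == 2).
    """
    with wave.open(in_path, 'rb') as src, wave.open(out_path, 'wb') as dst:
        nchannels = src.getnchannels()
        if src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")

        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(src.getframerate())

        while True:
            raw = src.readframes(chunk_frames)
            if not raw:
                break
            count = len(raw) // 2  # number of 16-bit samples in this chunk
            samples = struct.unpack('<{}h'.format(count), raw)
            # Samples are interleaved per frame, so every nchannels-th
            # sample belongs to the first channel.
            first_channel = samples[::nchannels]
            scaled = [(s * 2) // 3 for s in first_channel]
            # One writeframes call per chunk instead of one per frame.
            dst.writeframes(struct.pack('<{}h'.format(len(scaled)), *scaled))
```

Calling stereo_to_scaled_mono("sample.wav", "sample_out.wav") would then do the same job as the loop above with far fewer calls. The frame count in the output header is filled in by the wave module when the file is closed.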
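format holds the format string needed to unpack the binary values. For a 2-channel WAV file with 16-bit samples, each frame contains 4 bytes, so format becomes <hh. Calling struct.unpack with it yields two fields, one signed integer per channel. The four bytes therefore become a tuple of two integers, one for each channel.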
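As a concrete illustration, here is the first frame printed in the question unpacked with <hh. The second half shows what happens when the same bytes are read through a hex string, as the question's code does: that treats them as an unsigned big-endian number, whereas WAV samples are stored little-endian.

```python
import struct

frame = b'\xe3\x00\xc7\xf5'  # first frame printed in the question

# '<' = little-endian, 'h' = signed 16-bit integer, one per channel.
left, right = struct.unpack('<hh', frame)
print(left, right)    # 227 -2617

# Interpreting the first two bytes via their hex string reads them as
# big-endian, which gives a completely different sample value.
wrong_left = int(frame[:2].hex(), 16)
print(wrong_left)     # 58112

# Packing with the same format restores the original two bytes.
print(struct.pack('<h', left) == frame[:2])    # True
```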