How to get a list of frequencies in a wav file
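I'm trying to decode some audio that is basically two frequencies (200 Hz for 0, 800 Hz for 1), which translate directly to binary.
A sample of the audio
This sample translates to "1001011".
A third frequency, 1600 Hz, serves as a separator between bits.
I couldn't find anything useful. I did find a few things, but they were either outdated or simply didn't work. I'm really desperate.
I wrote some sample code that generates audio in this encoding (to test the decoder):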
import math
import wave
import struct

audio = []
sample_rate = 44100.0

def split(word):
    return [char for char in word]

def append_sinewave(
        freq=440.0,
        duration_milliseconds=10,
        volume=1.0):
    global audio
    num_samples = duration_milliseconds * (sample_rate / 1000.0)
    for x in range(int(num_samples)):
        audio.append(volume * math.sin(2 * math.pi * freq * (x / sample_rate)))
    return

def save_wav(file_name):
    wav_file = wave.open(file_name, "w")
    nchannels = 1
    sampwidth = 2
    nframes = len(audio)
    comptype = "NONE"
    compname = "not compressed"
    # the frame rate is stored as an integer in the WAV header
    wav_file.setparams((nchannels, sampwidth, int(sample_rate), nframes, comptype, compname))
    for sample in audio:
        # '<h' = little-endian signed 16-bit, the byte order WAV uses
        wav_file.writeframes(struct.pack('<h', int(sample * 32767.0)))
    wav_file.close()
    return

print("Input data!\n(binary)")
data = input(">> ")
dataL = split(data)
for x in dataL:
    if x == "0":
        append_sinewave(freq=200)
    elif x == "1":
        append_sinewave(freq=800)
    # every bit is followed by a 5 ms 1600 Hz separator tone
    append_sinewave(freq=1600, duration_milliseconds=5)
    print("Making " + str(x) + " beep")
print("\nWriting to file this may take a while!")
save_wav("output.wav")
Thanks in advance for your help!
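I think I understand what you're trying to do. Based on your encoder script, I assume each bit translates to 10 ms in your wave file, followed by a 5 ms 1600 Hz tone as a kind of delimiter. If these durations are fixed, you can simply use scipy and numpy to slice the audio and decode each segment.
The encoder script above generates a mono output.wav of about 105 ms (7 * 15 ms) for the bit string 1001011. If we want to ignore the delimiting frequency, our goal should be to return a list with the frequency representing each bit: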
[800, 200, 200, 800, 200, 800, 800]
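The timing arithmetic above can be checked with a short sketch. It assumes the 44100 Hz sample rate and the `int()` truncation used in the encoder script; the helper name `samples_for` is made up for illustration. It also shows why a plain FFT separates the tones cleanly here: a 10 ms segment gives 100 Hz wide bins, so 200 Hz and 800 Hz each fall exactly on a bin.

```python
SAMPLE_RATE = 44100  # Hz, as in the encoder script

def samples_for(duration_ms, sample_rate=SAMPLE_RATE):
    """Number of samples the encoder emits for one tone (it truncates with int())."""
    return int(duration_ms * (sample_rate / 1000.0))

bit_samples = samples_for(10)      # the 200 Hz or 800 Hz tone
delim_samples = samples_for(5)     # the 1600 Hz delimiter (220.5 truncated)
chunk_samples = bit_samples + delim_samples
total_samples = 7 * chunk_samples  # seven bits in "1001011"

# Width of one FFT bin when transforming a single bit segment
bin_width_hz = SAMPLE_RATE / bit_samples

print(bit_samples, delim_samples, chunk_samples)  # 441 220 661
print(total_samples, 1000.0 * total_samples / SAMPLE_RATE)
print(bin_width_hz)
```

Because the delimiter is truncated to 220 samples, the file is slightly shorter than 105 ms, and each bit occupies 661 samples.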
We can use scipy to read the audio and numpy to run an FFT on each slice of the audio to get the frequency of each segment:
from scipy.io import wavfile as wav
import numpy as np

rate, data = wav.read('./output.wav')

# 15ms chunk includes delimiting 5ms 1600hz tone
duration = 0.015
# calculate the length of our chunk in the np.array using sample rate
chunk = int(rate * duration)
# length of delimiting 1600hz tone
offset = int(rate * 0.005)
# number of bits in the audio data to decode
bits = len(data) // chunk

def get_freq(bit):
    # start position of the current bit
    strt = chunk * bit
    # remove the delimiting 1600hz tone
    end = (strt + chunk) - offset
    # slice the array for each bit
    sliced = data[strt:end]
    w = np.fft.fft(sliced)
    freqs = np.fft.fftfreq(len(w))
    # Find the peak in the coefficients
    idx = np.argmax(np.abs(w))
    freq = freqs[idx]
    freq_in_hertz = abs(freq * rate)
    return freq_in_hertz

decoded_freqs = [get_freq(bit) for bit in range(bits)]
which yields
[800.0, 200.0, 200.0, 800.0, 200.0, 800.0, 800.0]
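To try the approach without writing a file, here is a minimal round-trip sketch that builds the same tone layout in memory and decodes it again. It is not the exact code from either script: the names `tone`, `encode`, `peak_freq` and `decode` are made up for illustration, it uses `np.fft.rfft` instead of `np.fft.fft` since the signal is real, and it snaps each peak to the nearer of the two tones rather than testing for equality. The fixed 10 ms / 5 ms timings are still assumed.

```python
import numpy as np

RATE = 44100                       # Hz, as in the encoder script
TONES = {"0": 200.0, "1": 800.0}   # bit value -> tone frequency
DELIM_HZ = 1600.0                  # separator tone between bits

def tone(freq, duration_ms, rate=RATE):
    """Sine wave with the same truncated sample count as the encoder."""
    n = int(duration_ms * rate / 1000)
    t = np.arange(n) / rate
    return np.sin(2 * np.pi * freq * t)

def encode(bits):
    """10 ms data tone followed by a 5 ms delimiter for every bit."""
    parts = []
    for b in bits:
        parts.append(tone(TONES[b], 10))
        parts.append(tone(DELIM_HZ, 5))
    return np.concatenate(parts)

def peak_freq(segment, rate=RATE):
    """Frequency in Hz of the strongest FFT bin of a real-valued segment."""
    spectrum = np.abs(np.fft.rfft(segment))
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / rate)
    return freqs[np.argmax(spectrum)]

def decode(signal, rate=RATE):
    chunk = int(rate * 0.015)    # data tone plus delimiter
    offset = int(rate * 0.005)   # delimiter length to drop
    out = []
    for i in range(len(signal) // chunk):
        seg = signal[i * chunk:(i + 1) * chunk - offset]
        f = peak_freq(seg, rate)
        # pick whichever known tone is closest to the measured peak
        out.append(min(TONES, key=lambda b: abs(TONES[b] - f)))
    return "".join(out)

print(decode(encode("1001011")))  # 1001011
```

Snapping to the nearest tone keeps the decoder tolerant of small floating-point differences in the peak frequency.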
Converting to bits/bytes:
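# pick the nearer tone instead of testing freq == 800, since the FFT result is a float
bitsarr = [1 if abs(freq - 800) < abs(freq - 200) else 0 for freq in decoded_freqs]
byte_array = bytearray(bitsarr)
decoded = bytes(byte_array)
print(decoded, type(decoded))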
which yields
b'\x01\x00\x00\x01\x00\x01\x01' <class 'bytes'>
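Note that the bytes object above holds one byte per bit. If the original bit string or its integer value is what you are after, something like the following may be closer. This is only a sketch: `freqs_to_bitstring` is a made-up helper, and the frequency list is copied from the output shown earlier.

```python
def freqs_to_bitstring(freqs, zero_hz=200.0, one_hz=800.0):
    """Map each frequency to '0' or '1', whichever tone it is closer to."""
    return "".join(
        "1" if abs(f - one_hz) < abs(f - zero_hz) else "0" for f in freqs
    )

decoded_freqs = [800.0, 200.0, 200.0, 800.0, 200.0, 800.0, 800.0]
bitstring = freqs_to_bitstring(decoded_freqs)
value = int(bitstring, 2)  # interpret the bits as a base-2 number

print(bitstring)  # 1001011
print(value)      # 75
```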
For more on deriving the peak frequency, see this question