How can I extract features from audio files into a dataset?
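I have a folder on my desktop containing 187 audio files in WAV format. I want to extract features from them, so I ran the following code to compute the features and save them to a CSV file. The resulting file is useless: it contains only the column headers, and len(audio_files) returns 0 when it should be 187. How can I fix this?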
from glob import glob
data_dir = './audio featur-extraction\audio-setA/'
audio_files = glob(data_dir + '*.wav')
len(audio_files)
import librosa
from librosa import feature
import numpy as np
fn_list_i = [
feature.chroma_stft,
feature.spectral_centroid,
feature.spectral_bandwidth,
feature.spectral_rolloff,
]
fn_list_ii = [
feature.zero_crossing_rate
]
def get_feature_vector(y,sr):
    # keyword arguments are accepted by older librosa releases and required by recent ones
    feat_vect_i = [ np.mean(funct(y=y, sr=sr)) for funct in fn_list_i]
    feat_vect_ii = [ np.mean(funct(y)) for funct in fn_list_ii]
    feature_vector = feat_vect_i + feat_vect_ii
    return feature_vector
#build the matrix with normal audios featurized
audios_feat = []
for file in audio_files:
    # y is the time series array of the audio file, a 1D np.ndarray
    # sr is the sampling rate, a number
    y,sr = librosa.load(file,sr=None)
    feature_vector = get_feature_vector(y, sr)
    audios_feat.append(feature_vector)
    print('.', end= " ")
print(audios_feat)
#.........................
import csv
norm_output = 'normals_00.csv'
header =[
'chroma_stft',
'spectral_centroid',
'spectral_bandwidth',
'spectral_rolloff',
'zero_crossing_rate',
]
#WARNING : this overwrites the file each time. Be aware of this because feature extraction step takes time.
with open(norm_output,'w', newline='') as f:
    csv_writer = csv.writer(f, delimiter = ',')
    csv_writer.writerow(header)
    csv_writer.writerows(audios_feat)
The error is in this line:
data_dir = './audio featur-extraction\audio-setA/'
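It is the stray forward slash. Replace it with a backslash (like the rest of the path) and you are good to go. Note that in a normal string literal each backslash has to be written as \\ (or the string marked raw), because \a on its own is an escape sequence. For the future: debug your code. Go through it line by line and find where it goes wrong. If the list of paths to process has length zero, nothing will be computed.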
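A quick check, offered as a sketch rather than as part of the original answer, shows what Python does with the literal exactly as posted. The variable names `posted` and `raw` are made up for this illustration. In a normal string literal `\a` is the escape for the BEL control character (`'\x07'`), so the posted path contains neither a backslash nor the `a` of `audio`:

```python
# The literal from the question, written exactly as posted.
posted = './audio featur-extraction\audio-setA/'

# The same text as a raw string: the backslash is kept literally.
raw = r'./audio featur-extraction\audio-setA/'

# repr() makes the difference visible: the first shows \x07, the second \\a.
print(repr(posted))
print(repr(raw))

# The raw version is one character longer, because backslash + 'a'
# collapsed into a single BEL character in the posted version.
print(len(posted), len(raw))
```

Since no directory name contains a BEL character, the glob pattern built from `posted` matches nothing, which is consistent with `len(audio_files)` being 0.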
It should be
data_dir = 'audio featur-extraction\\audio-setA\\' # for Windows
or
data_dir = 'audio featur-extraction/audio-setA/' # for Mac / Linux.
More generally, use os.path.join in your code.
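As a minimal sketch of that suggestion: the helper name `list_wav_files` is hypothetical, and the folder names are taken from the question on the assumption that the script runs from the directory that contains them. `os.path.join` inserts the separator for the current platform, so no slash or backslash has to be typed by hand.

```python
import os
from glob import glob


def list_wav_files(data_dir):
    """Return the sorted paths of the .wav files directly inside data_dir."""
    # os.path.join adds the platform's separator between the parts.
    pattern = os.path.join(data_dir, '*.wav')
    return sorted(glob(pattern))


if __name__ == '__main__':
    # Folder names from the question; adjust to where the files really are.
    data_dir = os.path.join('audio featur-extraction', 'audio-setA')
    audio_files = list_wav_files(data_dir)
    print(len(audio_files))
    if not audio_files:
        # Fail loudly instead of silently writing a header-only CSV.
        print('No .wav files matched', os.path.join(data_dir, '*.wav'))
```

Checking the length before running the feature extraction catches a wrong path immediately, before any time is spent on an empty loop.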
audio_files = glob(data_dir + '/*.wav')
Solved by adding a slash before the asterisk.