Speech recognition using a Kohonen network with MFCC features. How do I set the distance between the neurons and their weights?

Question

I don't know how to set the position of each neuron in the map. Here are the neuron and the map:

typedef struct _neuron
{
    mfcc_frame *frames;
    char *name;
    double *weights;
    int num_weights;
    int x;
    int y;
} neuron;
typedef struct _map
{
    neuron *lattice;
    int latice_size;
    double mapRadius;
    int sideX, sideY; 
    int scale;
} map;

If I have several instances of the same word, how do I compute the distance between an input pattern (a word) and my neurons?

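I'm not sure about the weights. I defined the weights as the MFCC features of a word, but during training I need to update these weights according to the distance between neurons. I am using the Euclidean distance between neurons, but I'm unsure how to update the weights. Here is the code that initializes the map and the neurons: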

void init_neuron(neuron *n, int x, int y, mfcc_frame *mfcc_frames, unsigned int n_frames, char *name){

    unsigned int i, j;
    n->frames = mfcc_frames;
    /* One weight per MFCC coefficient of every frame */
    n->num_weights = n_frames * N_MFCC;
    n->x = x;
    n->y = y;

    /* +1 leaves room for the terminating '\0' that strcpy writes */
    n->name = malloc(strlen(name) + 1);
    strcpy(n->name, name);
    n->weights = malloc(n->num_weights * sizeof(double));

    /* Flatten the frames row by row so that no coefficient is overwritten */
    for(i = 0; i < n_frames; i++)
        for(j = 0; j < N_MFCC; j++)
            n->weights[i * N_MFCC + j] = mfcc_frames[i].features[j];

    printf("%s lattice %d, %d\n", n->name, n->x, n->y);

}

Initializing the map:

map* init_map(int sideX, int sideY, int scale){
    int x, y;
    unsigned int i;
    void **word_adresses;
    unsigned int n = 0, count = 0;
    word *words = malloc(sizeof(word));

    map *_map = malloc(sizeof(map));
    _map->latice_size = sideX * sideY;
    _map->sideX       = sideX;
    _map->sideY       = sideY;
    _map->scale       = scale;
    /* calloc zeroes the lattice, so cells that receive no word are well defined */
    _map->lattice     = calloc(_map->latice_size, sizeof(neuron));
    mt_seed ();

    if ((n = get_list(words))){
        word_adresses = malloc(n * sizeof(void *));
        while (words != NULL){
            /* Place each word at a random cell; two words may land on the same cell */
            x = mt_rand() % sideX;
            y = mt_rand() % sideY;
            printf("y : %d  x: %d\n", y, x);
            init_neuron(_map->lattice + y * sideX + x, x, y, words->frames, words->n, words->name);

            word_adresses[count++] = words;
            words = words->next;
        }
        /* Release the list nodes once every word has been copied into a neuron */
        for (i = 0; i < count; i++)
            free(word_adresses[i]);
        free(word_adresses);
    }

    return _map;

}

Answer 1

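In a Kohonen SOM, the weights live in feature space, so each neuron holds one prototype vector. If the input is 12 MFCCs, each input might look like a vector of 12 doubles, which means each neuron has 12 values, one per MFCC. Given an input, you find the best matching unit, then move that neuron's 12 codebook values a small amount toward the input vector, according to the learning rate.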
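A minimal sketch of that step in C, under some assumptions: `som_node`, `DIM`, `feature_dist2`, `find_bmu` and `som_train_step` are hypothetical names and are not taken from the question's code, and each input is assumed to be a single fixed-length vector of 12 MFCCs. The distance used to pick the best matching unit is measured in feature space (input versus prototype), while the distance between neurons is measured on the grid and only decides which neighbours are updated along with the winner. The neighbourhood here is the simple "bubble" variant (every node within `radius` grid units of the winner gets the full update); a Gaussian falloff is a common alternative.

```c
#include <assert.h>
#include <stddef.h>

#define DIM 12  /* assumption: 12 MFCC coefficients per input vector */

/* Hypothetical simplified neuron: one prototype (codebook) vector
   plus its fixed position on the 2-D grid. */
typedef struct {
    double w[DIM];
    int x, y;
} som_node;

/* Squared Euclidean distance in feature space between input and prototype. */
double feature_dist2(const double *in, const double *w)
{
    double d = 0.0;
    for (int k = 0; k < DIM; k++) {
        double diff = in[k] - w[k];
        d += diff * diff;
    }
    return d;
}

/* Best matching unit: index of the node whose prototype is closest to the input. */
size_t find_bmu(const som_node *nodes, size_t n, const double *in)
{
    assert(n > 0);
    size_t best = 0;
    double best_d = feature_dist2(in, nodes[0].w);
    for (size_t i = 1; i < n; i++) {
        double d = feature_dist2(in, nodes[i].w);
        if (d < best_d) {
            best_d = d;
            best = i;
        }
    }
    return best;
}

/* One training step with a bubble neighbourhood: every node within
   `radius` grid units of the BMU moves toward the input by
   w += alpha * (in - w). Returns the index of the BMU. */
size_t som_train_step(som_node *nodes, size_t n, const double *in,
                      double alpha, double radius)
{
    size_t b = find_bmu(nodes, n, in);
    for (size_t i = 0; i < n; i++) {
        double dx = (double)(nodes[i].x - nodes[b].x);
        double dy = (double)(nodes[i].y - nodes[b].y);
        if (dx * dx + dy * dy > radius * radius)
            continue;  /* outside the neighbourhood: prototype unchanged */
        for (int k = 0; k < DIM; k++)
            nodes[i].w[k] += alpha * (in[k] - nodes[i].w[k]);
    }
    return b;
}
```

In a typical training loop, `som_train_step` would be called once per input vector while `alpha` and `radius` shrink over the iterations. Note that the grid coordinates `x` and `y` never change; only the prototype vectors move.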


c

speech-recognition

som

mfcc