Spiral problem: why does my loss increase in this neural network using Keras?

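I am trying to solve the spiral problem with Keras, using 3 spirals instead of 2, with a strategy similar to the one I used for 2 spirals. The problem is that my loss now increases exponentially, instead of decreasing as it did with 2 spirals (the network now has 3 outputs rather than a single binary one). I'm not sure what is going wrong here; can anyone help? I have already tried various numbers of epochs, learning rates, and batch sizes.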

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.optimizers import RMSprop

from Question1.utils import create_neural_network, create_test_data

EPOCHS = 250
BATCH_SIZE = 20


def main():
    df = three_spirals(1000)

    # Set-up data
    x_train = df[['x-coord', 'y-coord']].values
    y_train = df['class'].values

    # Don't need y_test, can inspect visually if it worked or not
    x_test = create_test_data()

    # Scale data
    scaler = MinMaxScaler()
    scaler.fit(x_train)
    x_train = scaler.transform(x_train)
    x_test = scaler.transform(x_test)

    relu_model = create_neural_network(layers=3,
                                       neurons=[40, 40, 40],
                                       activation='relu',
                                       optimizer=RMSprop(learning_rate=0.001),
                                       loss='categorical_crossentropy',
                                       outputs=3)

    # Train networks
    relu_model.fit(x=x_train, y=y_train, epochs=EPOCHS, verbose=1, batch_size=BATCH_SIZE)

    # Predictions on test data
    relu_predictions = relu_model.predict_classes(x_test)

    models = [relu_model]
    test_predictions = [relu_predictions]

    # Plot
    plot_data(models, test_predictions)

Here is the create_neural_network function:

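from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense


def create_neural_network(layers, neurons, activation, optimizer, loss, outputs=1):
    # One entry in `neurons` is expected per hidden layer
    if layers != len(neurons):
        raise ValueError("Number of layers doesn't match the number of neuron layers.")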

    model = Sequential()

    for i in range(layers):
        model.add(Dense(neurons[i], activation=activation))

    # Output
    if outputs == 1:
        model.add(Dense(outputs))
    else:
        model.add(Dense(outputs, activation='softmax'))

    model.compile(optimizer=optimizer,
                  loss=loss)

    return model

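I've solved it. For multi-class classification, the output data is not like binary classification, where a single column is enough. You need one column for each class you want to classify, so having y take the values 0, 1, 2 was incorrect. The correct way is to set up columns y0, y1, y2 (one-hot encoding), each holding 1 if the sample belongs to that class and 0 if it does not. This is the target format that the 'categorical_crossentropy' loss expects.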
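
The fix described above can be sketched as follows. This is a minimal illustration, not the poster's actual code: the `one_hot` helper and the sample labels are hypothetical, and it assumes the values in `df['class']` are the integers 0, 1, 2.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer labels of shape (n,) into a 0/1 matrix of shape (n, num_classes)."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Put a 1 in the column that matches each sample's class
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


# Hypothetical labels standing in for df['class'].values
y_train = np.array([0, 2, 1, 2])
y_train_one_hot = one_hot(y_train, num_classes=3)

# Rows are [1, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 1]
print(y_train_one_hot)
```

In `main()`, the encoded array would then be passed as `y` to `relu_model.fit` in place of the raw labels. Keras also provides `tensorflow.keras.utils.to_categorical` for the same conversion. Another option is to keep the integer labels and compile the model with `loss='sparse_categorical_crossentropy'`, which accepts class indices directly.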