Why is SparseCategoricalCrossentropy not working with this machine learning model?

I have a .csv database file that looks like this:

              Day   Hour  N1  N2  N3  N4  N5  ...  N14  N15  N16  N17  N18  N19  N20
0      1996-03-18  15:00   4   9  10  16  21  ...   48   62   66   68   73   76   78
1      1996-03-19  15:00   6  12  15  19  28  ...   63   64   67   69   71   75   77
2      1996-03-21  15:00   2   4   6   7  15  ...   52   54   69   72   73   75   77
3      1996-03-22  15:00   3   8  15  17  19  ...   49   60   61   64   67   68   75
4      1996-03-25  15:00   2  10  11  14  18  ...   55   60   61   66   67   75   79
...           ...    ...  ..  ..  ..  ..  ..  ...  ...  ...  ...  ...  ...  ...  ...
13596  2022-01-04  22:50  17  18  22  26  27  ...   64   65   71   72   73   76   80
13597  2022-01-05  15:00   1   5   8  14  15  ...   47   54   59   67   70   72   76
13598  2022-01-05  22:50   6   7  14  15  16  ...   54   55   59   61   70   71   80
13599  2022-01-06  15:00   9  10  11  17  28  ...   51   55   65   67   72   76   78
13600  2022-01-06  22:50   1   2   6   9  11  ...   51   52   54   67   68   73   75

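I found this article: https://machinelearningmastery.com/how-to-develop-convolutional-neural-network-models-for-time-series-forecasting/

I am trying to develop a modified version of that 1D CNN model, using a softmax function in the last layer and SparseCategoricalCrossentropy() as the loss function, and to make it different by adding new functions to that code.

Here is my code so far, including the model I am trying to build and use: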

# multivariate output 1d cnn example
import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # or any {'0', '1', '2'}
import warnings

warnings.filterwarnings('ignore')
import pandas as pd
# multivariate output 1d cnn example
from numpy import array
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import *
from tensorflow.keras.losses import *
from tensorflow.keras.layers import *
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.callbacks import ReduceLROnPlateau
from tensorflow.keras.callbacks import ModelCheckpoint


# Define the Required Callback Function
class printlearningrate(tf.keras.callbacks.Callback):
    def on_epoch_end (self, epoch, logs={}):
        optimizer = self.model.optimizer
        lr = K.eval(optimizer.lr)
        Epoch_count = epoch + 1
        print('\n', "Epoch:", Epoch_count, ', Learning Rate: {:.7f}'.format(lr))


printlr = printlearningrate()


# split a multivariate sequence into samples
def split_sequences (sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences) - 1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)


df = pd.read_csv('DrawsDB.csv')

print(df)
# df['Time'] = df[['Day', 'Hour']].agg(' '.join, axis=1)
df.insert(0, 'Time', df[['Day', 'Hour']].agg(' '.join, axis=1))
df.drop(columns=['Day', 'Hour'], inplace=True)
df.set_index('Time', inplace=True)
print(df)

numpy_array = df.to_numpy()

print(type(numpy_array))
print(numpy_array)

# choose a number of time steps
n_steps = 10
# convert into input/output
X, y = split_sequences(numpy_array, n_steps)
print(X.shape, y.shape)

# the dataset knows the number of features, e.g. 2
n_features = X.shape[2]
# Reduce learning rate when nothing happens to lower more the loss:
reduce_lr = ReduceLROnPlateau(monitor='loss', factor=0.9888888888888889,
                              patience=10, min_lr=0.0000001, verbose=1)

epochs = 10
# saving best model every epoch with ModelCheckpoint:
# forward slashes: a trailing backslash would escape the closing quote
checkpoint_filepath = 'C:/Path/To/Saved/CheckPoint/model/'
model_checkpoint_callback = ModelCheckpoint(
    filepath=checkpoint_filepath,
    monitor='loss',
    save_best_only=True,
    save_weights_only=True,
    verbose=1)

# define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation=LeakyReLU(), input_shape=(n_steps, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation=LeakyReLU()))
model.add(Dense(n_features))
model.compile(optimizer=Nadam(lr=0.09), loss=SparseCategoricalCrossentropy(),
              metrics=['accuracy', mean_squared_error, mean_absolute_error, mean_absolute_percentage_error])

# fit model
model.fit(X, y, epochs=10, verbose=2, callbacks=[printlr, reduce_lr, model_checkpoint_callback])

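The split_sequences function does what its name says: it splits the database by taking N consecutive rows as the input and using the whole row that follows them (row N+1) as the output to predict.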
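To make the resulting shapes concrete, here is a minimal sketch of the same windowing on a small made-up array. The `toy` data is hypothetical, and the loop is a condensed variant of the function above rather than a verbatim copy.

```python
import numpy as np

def split_sequences(sequences, n_steps):
    """Pair every window of n_steps rows with the single row that follows it."""
    X, y = [], []
    for i in range(len(sequences) - n_steps):
        X.append(sequences[i:i + n_steps, :])  # rows i .. i + n_steps - 1
        y.append(sequences[i + n_steps, :])    # the row right after the window
    return np.array(X), np.array(y)

# Hypothetical toy data: 6 rows, 3 columns.
toy = np.arange(18).reshape(6, 3)
X, y = split_sequences(toy, n_steps=2)
print(X.shape, y.shape)  # (4, 2, 3) (4, 3)
print(y[0])              # [6 7 8]
```

Each target is a full row, so with the real file `y` has 20 values per sample rather than a single class index.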

However, I think something is wrong, because every time I try to run the Python script I get this error:

tensorflow.python.framework.errors_impl.InvalidArgumentError:  logits and labels must have the same first dimension, got logits shape [32,20] and labels shape [640]
     [[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits
 (defined at C:\Users\UserName\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend.py:5114)
]] [Op:__inference_train_function_1228]

Do you know how I can fix this?

Thanks in advance!

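Assuming the labels are integers, their shape is not suitable for SparseCategoricalCrossentropy: that loss expects one integer class index per sample, whereas each of your targets holds 20 values. Check the docs. Try converting your y to one-hot encoded labels: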

y = tf.keras.utils.to_categorical(y, num_classes=20)

and change the loss function to CategoricalCrossentropy:

model.compile(optimizer=Nadam(lr=0.09), loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=['accuracy', mean_squared_error, mean_absolute_error, mean_absolute_percentage_error])

That should work.
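A note on the shapes, sketched with NumPy only. The 640 in the error message equals 32 × 20, which is consistent with a (32, 20) batch of targets being flattened before it is compared with the (32, 20) logits. The sketch also shows one possible encoding, a multi-hot vector of length 80. That is a swapped-in alternative, not the to_categorical call above, and it assumes each row holds 20 distinct integers between 1 and 80, as the sample rows suggest. The names `multi_hot` and `y_batch` and the random data are hypothetical.

```python
import numpy as np

# Hypothetical stand-in for one batch of targets: 32 draws of 20 distinct
# numbers in 1..80, shaped like the `y` produced by split_sequences.
rng = np.random.default_rng(0)
batch_size, n_picks, n_numbers = 32, 20, 80
y_batch = np.stack([
    np.sort(rng.choice(np.arange(1, n_numbers + 1), size=n_picks, replace=False))
    for _ in range(batch_size)
])

# A sparse categorical loss wants ONE class index per sample, i.e. shape (32,).
# Flattening the (32, 20) targets yields 640 labels, the number in the error.
flat_labels = y_batch.reshape(-1)
print(y_batch.shape, flat_labels.shape)  # (32, 20) (640,)

def multi_hot(draws, n_classes):
    """Encode each row of 1-based numbers as a 0/1 vector of length n_classes."""
    encoded = np.zeros((draws.shape[0], n_classes), dtype=np.float32)
    rows = np.arange(draws.shape[0])[:, None]
    encoded[rows, draws - 1] = 1.0  # shift 1..80 down to column indices 0..79
    return encoded

y_multi_hot = multi_hot(y_batch, n_numbers)
print(y_multi_hot.shape)  # (32, 80)
```

With targets like these, the output layer would presumably need 80 units with a sigmoid activation and a loss such as BinaryCrossentropy. That pairing is an assumption about how to frame the task, not something the question confirms. As far as I can tell, to_categorical also indexes columns by label value, so labels as large as 80 would not fit into num_classes=20.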