Why SparseCategoricalCrossentropy is not working with this machine learning model?
I have a .csv database file that looks like this:
Day Hour N1 N2 N3 N4 N5 ... N14 N15 N16 N17 N18 N19 N20
0 1996-03-18 15:00 4 9 10 16 21 ... 48 62 66 68 73 76 78
1 1996-03-19 15:00 6 12 15 19 28 ... 63 64 67 69 71 75 77
2 1996-03-21 15:00 2 4 6 7 15 ... 52 54 69 72 73 75 77
3 1996-03-22 15:00 3 8 15 17 19 ... 49 60 61 64 67 68 75
4 1996-03-25 15:00 2 10 11 14 18 ... 55 60 61 66 67 75 79
... ... ... .. .. .. .. .. ... ... ... ... ... ... ... ...
13596 2022-01-04 22:50 17 18 22 26 27 ... 64 65 71 72 73 76 80
13597 2022-01-05 15:00 1 5 8 14 15 ... 47 54 59 67 70 72 76
13598 2022-01-05 22:50 6 7 14 15 16 ... 54 55 59 61 70 71 80
13599 2022-01-06 15:00 9 10 11 17 28 ... 51 55 65 67 72 76 78
13600 2022-01-06 22:50 1 2 6 9 11 ... 51 52 54 67 68 73 75
But I am trying to develop a modified version of this 1D CNN model by using the softmax function in the last layer and SparseCategoricalCrossentropy() as the loss function, and to make it different by adding new functions to the code.
Here is my code so far and the model I am trying to build and use:
# multivariate output 1d cnn example
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' # or any {'0', '1', '2'}
import warnings
warnings.filterwarnings('ignore')
import pandas as pd
# multivariate output 1d cnn example
from numpy import array
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import *
from tensorflow.keras.losses import *
from tensorflow.keras.layers import *
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.callbacks import ReduceLROnPlateau
from tensorflow.keras.callbacks import ModelCheckpoint
# Define the Required Callback Function
class printlearningrate(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        optimizer = self.model.optimizer
        lr = K.eval(optimizer.lr)
        Epoch_count = epoch + 1
        print('\n', "Epoch:", Epoch_count, ', Learning Rate: {:.7f}'.format(lr))
printlr = printlearningrate()
# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences) - 1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)
df = pd.read_csv('DrawsDB.csv')
print(df)
# df['Time'] = df[['Day', 'Hour']].agg(' '.join, axis=1)
df.insert(0, 'Time', df[['Day', 'Hour']].agg(' '.join, axis=1))
df.drop(columns=['Day', 'Hour'], inplace=True)
df.set_index('Time', inplace=True)
print(df)
numpy_array = df.to_numpy()
print(type(numpy_array))
print(numpy_array)
# choose a number of time steps
n_steps = 10
# convert into input/output
X, y = split_sequences(numpy_array, n_steps)
print(X.shape, y.shape)
# the dataset knows the number of features, e.g. 2
n_features = X.shape[2]
# Reduce learning rate when nothing happens to lower more the loss:
reduce_lr = ReduceLROnPlateau(monitor='loss', factor=0.9888888888888889,
                              patience=10, min_lr=0.0000001, verbose=1)
epochs = 10
# saving best model every epoch with ModelCheckpoint:
checkpoint_filepath = 'C:\\Path\\To\\Saved\\CheckPoint\\model\\'
model_checkpoint_callback = ModelCheckpoint(
    filepath=checkpoint_filepath,
    monitor='loss',
    save_best_only=True,
    save_weights_only=True,
    verbose=1)
# define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation=LeakyReLU(), input_shape=(n_steps, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation=LeakyReLU()))
model.add(Dense(n_features))
model.compile(optimizer=Nadam(lr=0.09), loss=SparseCategoricalCrossentropy(),
              metrics=['accuracy', mean_squared_error, mean_absolute_error, mean_absolute_percentage_error])
# fit model
model.fit(X, y, epochs=10, verbose=2, callbacks=[printlr, reduce_lr, model_checkpoint_callback])
The split_sequences function does what its name suggests: it splits the database by taking only N rows from it as input and trying to predict the entire N+1-th row from them as output.
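For illustration only, here is what split_sequences (as defined above) returns on a tiny made-up array; the numbers are arbitrary and only the shapes matter:
import numpy as np

# Toy data: 6 rows (time steps) x 3 columns, purely for illustration.
toy = np.arange(18).reshape(6, 3)
X_demo, y_demo = split_sequences(toy, n_steps=2)
print(X_demo.shape)  # (4, 2, 3) -> 4 samples, each with 2 time steps of 3 features
print(y_demo.shape)  # (4, 3)    -> the target of each sample is the full next row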
However, I think there is a problem, because every time I try to run the Python script I get this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [32,20] and labels shape [640]
[[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits
(defined at C:\Users\UserName\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend.py:5114)
]] [Op:__inference_train_function_1228]
Do you have any idea how to solve this problem?
Thanks in advance!
Assuming the labels are integers, their shape does not fit SparseCategoricalCrossentropy, which expects a single integer class index per sample, whereas each row of your y holds 20 values. Check the docs.
Try converting your y into one-hot encoded labels:
y = tf.keras.utils.to_categorical(y, num_classes=20)
and change the loss function to CategoricalCrossentropy:
model.compile(optimizer=Nadam(lr=0.09), loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=['accuracy', mean_squared_error, mean_absolute_error, mean_absolute_percentage_error])
It should work.
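As a side note, here is a minimal, self-contained sketch of what tf.keras.utils.to_categorical does to integer class indices; the toy labels and num_classes below are made up for illustration and are not taken from the dataset above:
import numpy as np
import tensorflow as tf

# Toy integer class indices (values 0..4), purely for illustration.
labels = np.array([0, 2, 4, 1])

# One-hot encode: each integer becomes a row vector with a single 1.
one_hot = tf.keras.utils.to_categorical(labels, num_classes=5)
print(one_hot.shape)  # (4, 5)
print(one_hot[0])     # [1. 0. 0. 0. 0.]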