How to reshape my input to feed it into 1D Convolutional layer for sequence classification?
I have a CSV file with 339732 rows and two columns:
- the first holds the 29 feature values, i.e. X
- the second is the binary label value, i.e. Y
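import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
dataframe = pd.read_csv("features.csv", header=None)
dataset = dataframe.values
X = dataset[:, 0:29].astype(float)  # 29 feature columns
Y = dataset[:, 29]                  # binary label column
# train_test_split returns the splits in the order X_train, X_test, y_train, y_test
X_train, X_test, y_train, y_test = train_test_split(X, Y, random_state=42)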
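I am trying to train a 1D convolutional network on it:
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, GlobalAveragePooling1D, Dropout, Dense
model = Sequential()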
model.add(Conv1D(64, 3, activation='relu', input_shape=(X_train.shape[0], 29)))
model.add(Conv1D(64, 3, activation='relu'))
model.add(MaxPooling1D(3))
model.add(Conv1D(128, 3, activation='relu'))
model.add(Conv1D(128, 3, activation='relu'))
model.add(GlobalAveragePooling1D())
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=16, epochs=2)
score = model.evaluate(X_test, y_test, batch_size=16)
Since the Conv1D layer expects 3-D input, I transform my input as follows:
X_train = np.reshape(X_train, (1, X_train.shape[0], X_train.shape[1]))
X_test = np.reshape(X_test, (1, X_test.shape[0], X_test.shape[1]))
However, this still throws an error:
ValueError: Negative dimension size caused by subtracting 3 from 1 for 'conv1d_1/convolution/Conv2D' (op: 'Conv2D') with input shapes: [?,1,1,29], [1,3,29,64].
Is there any way to feed my input in properly?
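As far as I know, a 1D convolutional layer accepts input of the form Batchsize x Width x Channels. You are reshaping with
X_train = np.reshape(X_train, (1, X_train.shape[0], X_train.shape[1]))
but X_train.shape[0] is your batch size, I guess. I think the problem is somewhere in there. What is the shape of X_train before the reshape?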
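To make the batch-axis point concrete, here is a minimal NumPy-only sketch. The 8-row array is a made-up stand-in for the real training split (an assumption for illustration), and the second reshape is one possible layout, not necessarily the right one for this dataset:

```python
import numpy as np

# Synthetic stand-in for the training split: 8 samples, 29 features each.
X_train = np.zeros((8, 29), dtype=float)
y_train = np.zeros(8, dtype=int)

# The reshape from the question folds every sample into a single batch entry.
wrong = np.reshape(X_train, (1, X_train.shape[0], X_train.shape[1]))
print(wrong.shape)   # (1, 8, 29)

# Keeping the sample axis first gives one 29-step, 1-channel sequence per label.
right = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))
print(right.shape)   # (8, 29, 1)

# Only the second layout has as many batch entries as there are labels.
print(wrong.shape[0] == y_train.shape[0], right.shape[0] == y_train.shape[0])  # False True
```

With the first layout the network would see a single sample, while `y_train` still holds one label per row.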
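You have to consider whether your data has some kind of progression, either across the 339732 entries or across the 29 features, that is, whether the order matters. If it does not, I don't think a CNN is suitable for this case.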
If the 29 features indicate the progression of something:
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1],1))
If the 29 features are independent, then they are just like the channels of an image, but convolving over a width of only 1 makes no sense:
X_train = X_train.reshape((X_train.shape[0],1, X_train.shape[1]))
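Why a width of 1 fails can be checked with a little arithmetic. The sketch below assumes the Keras defaults used in the question's model ('valid' padding, stride 1 for Conv1D, pooling stride equal to the pool size); the helper names are made up for illustration and are not part of Keras:

```python
def conv1d_valid_length(steps, kernel_size):
    """Output length of a stride-1 convolution with 'valid' padding."""
    return steps - kernel_size + 1


def maxpool1d_valid_length(steps, pool_size):
    """Output length of non-overlapping pooling (stride == pool_size), 'valid' padding."""
    return (steps - pool_size) // pool_size + 1


def trace_lengths(steps):
    """Follow the sequence length through the conv/pool stack from the question."""
    layers = [("conv", 3), ("conv", 3), ("pool", 3), ("conv", 3), ("conv", 3)]
    lengths = [steps]
    for kind, size in layers:
        fn = conv1d_valid_length if kind == "conv" else maxpool1d_valid_length
        steps = fn(steps, size)
        lengths.append(steps)
        if steps <= 0:
            break  # the framework would raise a negative-dimension error here
    return lengths


print(trace_lengths(29))  # [29, 27, 25, 8, 6, 4]
print(trace_lengths(1))   # [1, -1]
```

A width of 29 survives every layer, while a width of 1 already goes negative at the first kernel of size 3, which matches the "subtracting 3 from 1" in the error message.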
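If you want to take the 339732 entries in blocks where the order matters (clip the 339732 rows or add zero padding so that the count is divisible by timesteps):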
X_train = X_train.reshape((int(X_train.shape[0]/timesteps),timesteps, X_train.shape[1],1))
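The clipping and zero-padding step can be sketched as follows. `to_blocks` is a hypothetical helper, not a library function, and it returns a 3-D array of shape (blocks, timesteps, features); a trailing singleton axis like the one in the line above could be added afterwards with `[..., np.newaxis]` if needed:

```python
import numpy as np


def to_blocks(X, timesteps, pad=False):
    """Group consecutive rows into blocks of `timesteps` rows.

    Leftover rows are clipped by default, or completed with zero rows
    when pad=True, so the row count is divisible by `timesteps`.
    """
    n_rows, n_features = X.shape
    remainder = n_rows % timesteps
    if remainder and pad:
        padding = np.zeros((timesteps - remainder, n_features), dtype=X.dtype)
        X = np.concatenate([X, padding], axis=0)
    elif remainder:
        X = X[:n_rows - remainder]
    return X.reshape((-1, timesteps, n_features))


# Small synthetic example: 10 rows of 29 features, blocks of 4 timesteps.
X = np.arange(10 * 29, dtype=float).reshape(10, 29)
print(to_blocks(X, timesteps=4).shape)            # (2, 4, 29)
print(to_blocks(X, timesteps=4, pad=True).shape)  # (3, 4, 29)
```

The labels would then need to be reduced to one value per block as well; how to do that depends on the data and is not specified here.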