Keras 1D CNN: How to specify dimension correctly?

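So, what I am trying to do is classify exoplanets and non-exoplanets using the Kepler data obtained here. The data is a time series with shape (num_of_samples, 3197). I found that this can be done with a 1D convolutional layer in Keras, but I keep messing up the dimensions and get the following error: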

Error when checking model input: expected conv1d_1_input to have shape (None, 3197, 1) but got array with shape (1, 570, 3197)

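So, the questions are:

1. Does the data (training_set and test_set) need to be converted into a 3D tensor? If so, what is the correct shape?

2. What is the correct input shape? I know I have 3197 timesteps with 1 feature, but the documentation doesn't specify whether it assumes the TF or the Theano backend, so I'm still getting headaches.

By the way, I am using the TF backend. Any help is much appreciated! Thank you!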

"""
Created on Wed May 17 18:23:31 2017

@author: Amajid Sinar
"""

import matplotlib.pyplot as plt
import pandas as pd
plt.style.use("ggplot")
import numpy as np

#Importing training set
training_set = pd.read_csv("exoTrain.csv")
X_train = training_set.iloc[:,1:].values
y_train = training_set.iloc[:,0:1].values

#Importing test set
test_set = pd.read_csv("exoTest.csv")
X_test = test_set.iloc[:,1:].values
y_test = test_set.iloc[:,0:1].values

#Scale the data
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.fit_transform(X_test)

#Convert data into 3d tensor
X_train = np.reshape(X_train,(1,X_train.shape[0],X_train.shape[1]))
X_test = np.reshape(X_test,(1,X_test.shape[0],X_test.shape[1]))


#Importing convolutional layers
from keras.models import Sequential
from keras.layers import Convolution1D
from keras.layers import MaxPooling1D
from keras.layers import Flatten
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers.normalization import BatchNormalization

#Convolution steps
#1.Convolution
#2.Max Pooling
#3.Flattening
#4.Full Connection

#Initialising the CNN
classifier = Sequential()

#Input shape must be explicitly defined, DO NOT USE (None,shape)!!!
#1.Multiple convolution and max pooling
classifier.add(Convolution1D(filters=8, kernel_size=11, activation="relu", input_shape=(3197,1)))
classifier.add(MaxPooling1D(strides=4))
classifier.add(BatchNormalization())
classifier.add(Convolution1D(filters=16, kernel_size=11, activation='relu'))
classifier.add(MaxPooling1D(strides=4))
classifier.add(BatchNormalization())
classifier.add(Convolution1D(filters=32, kernel_size=11, activation='relu'))
classifier.add(MaxPooling1D(strides=4))
classifier.add(BatchNormalization())
#classifier.add(Convolution1D(filters=64, kernel_size=11, activation='relu'))
#classifier.add(MaxPooling1D(strides=4))


#2.Flattening
classifier.add(Flatten())


#3.Full Connection
classifier.add(Dropout(0.5))
classifier.add(Dense(64, activation='relu'))
classifier.add(Dropout(0.25))
classifier.add(Dense(64, activation='relu'))
classifier.add(Dense(1, activation='sigmoid'))

#Configure the learning process
classifier.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

#Train!
classifier.fit_generator(X_train, steps_per_epoch=X_train.shape[0], epochs=1, validation_data=(X_test,y_test))

score = classifier.evaluate(X_test, y_test)
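  1. Yes, your dataset should be a 3D tensor.

  2. The correct input shape (for the TensorFlow backend) is (sample_number, sample_size, channel_number). You can check this against the error message, which says "expected conv1d_1_input to have shape (None, 3197, 1)".

'None' refers to a dimension of arbitrary size, because it stands for the number of samples used in training.

So in your case the correct shape is (570, 3197, 1).

If you happen to use the Theano backend, you should put the channel dimension first: (sample_number, channel_number, sample_size), or in your particular case

(570, 1, 3197)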
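
The two layouts above can be sketched with NumPy alone. This is only an illustration under stated assumptions: the zero-filled array stands in for the scaled data, and the names `wrong`, `X_train_tf` and `X_train_th` are made up for the example. The sizes 570 and 3197 are taken from the error message in the question.

```python
import numpy as np

# Stand-in for the scaled 2D data: 570 samples, 3197 timesteps each.
num_samples, num_timesteps = 570, 3197
X_train = np.zeros((num_samples, num_timesteps), dtype=np.float32)

# What the question's reshape does: a single "sample" with
# 570 timesteps and 3197 channels, which is not what the model expects.
wrong = np.reshape(X_train, (1, X_train.shape[0], X_train.shape[1]))

# Channels-last layout (TensorFlow backend): (samples, timesteps, channels).
X_train_tf = X_train.reshape(num_samples, num_timesteps, 1)

# Channels-first layout (Theano-style): (samples, channels, timesteps).
X_train_th = X_train.reshape(num_samples, 1, num_timesteps)

print(wrong.shape)       # (1, 570, 3197)
print(X_train_tf.shape)  # (570, 3197, 1)
print(X_train_th.shape)  # (570, 1, 3197)
```

With the channels-last array, each sample has shape (3197, 1), which matches the input_shape=(3197,1) already passed to the first Convolution1D layer. Separately, fit_generator expects a generator rather than an array, so the reshaped arrays would normally be passed to classifier.fit together with the labels.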

Suppose the shape of your data is

>>> data.shape
(m, n)

Then you should add a new axis to serve as the channel axis:

>>> data = data[..., np.newaxis]
>>> data.shape
(m, n, 1)
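
As a small check of the step above, here is a sketch comparing three ways of adding the trailing channel axis. The sizes m = 4 and n = 6 are hypothetical, chosen only to keep the example small; the real data would have n = 3197.

```python
import numpy as np

# Hypothetical small sizes; only the number of axes matters here.
m, n = 4, 6
data = np.arange(m * n, dtype=np.float32).reshape(m, n)

# Three ways to append a channel axis of length 1.
a = data[..., np.newaxis]
b = np.expand_dims(data, axis=-1)
c = data.reshape(m, n, 1)

print(a.shape, b.shape, c.shape)  # (4, 6, 1) (4, 6, 1) (4, 6, 1)

# All three hold the same values in the same positions.
same_values = np.array_equal(a, b) and np.array_equal(a, c)
print(same_values)  # True

# Indexing with np.newaxis returns a view, so the data is not copied.
is_view = np.shares_memory(a, data)
print(is_view)  # True
```

Any of the three forms should work for the training and test arrays. The indexing form has the advantage of not needing m and n spelled out.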