NN in Keras - expected dense_2 to have 3 dimensions, but got array with shape (10980, 3)
I want to train a neural network for multi-class sentiment analysis of tweets using word embeddings.
Here is my code:
import pandas as pd
import numpy as np
import re
from nltk.corpus import stopwords
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from tensorflow.python.keras.preprocessing.text import Tokenizer
from tensorflow.python.keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM, GRU
from keras.layers.embeddings import Embedding
from keras.wrappers.scikit_learn import KerasClassifier
from keras.utils import np_utils
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.pipeline import Pipeline
Import the data
df = pd.DataFrame()
df = pd.read_csv('Tweets.csv', encoding='utf-8')
Clean the tweets
def remove_mentions(input_text):
    return re.sub(r'@\w+', '', input_text)

def remove_stopwords(input_text):
    stopwords_list = stopwords.words('english')
    whitelist = ["n't", "not", "no"]
    words = input_text.split()
    clean_words = [word for word in words if (word not in stopwords_list or word in whitelist) and len(word) > 1]
    return " ".join(clean_words)
df.text = df.text.apply(remove_stopwords).apply(remove_mentions)
df.text = [tweet for tweet in df.text if type(tweet) is str]
X = df['text']
y = df['airline_sentiment']
Split my data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=0.25, random_state=37)
One-hot encode the "Sentiment" field
Initially the labels are strings: 'neutral', 'positive', 'negative'. So I first convert them to integers and then apply one-hot encoding:
le = LabelEncoder()
y_train_num = le.fit_transform(y_train.values)
y_test_num = le.fit_transform(y_test.values)
nb_classes = 3
y_train = np_utils.to_categorical(y_train_num, nb_classes)
y_test = np_utils.to_categorical(y_test_num, nb_classes)
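For reference, the encoding step above can be approximated in plain NumPy to make the target shape explicit. This is only a sketch: `fit_label_mapping` and `one_hot` are hypothetical helpers, not part of scikit-learn or Keras, and the label lists are made-up samples.

```python
import numpy as np

def fit_label_mapping(labels):
    # Sort the distinct labels so each class gets a stable integer index.
    classes = sorted(set(labels))
    return {label: index for index, label in enumerate(classes)}

def one_hot(labels, mapping):
    # Look up each label's index and pick the matching row of an identity matrix.
    indices = np.array([mapping[label] for label in labels])
    return np.eye(len(mapping), dtype="float32")[indices]

train_labels = ["neutral", "positive", "negative", "negative"]
test_labels = ["positive", "negative"]

# The mapping is built once from the training labels and reused for the test labels.
mapping = fit_label_mapping(train_labels)
y_train_demo = one_hot(train_labels, mapping)
y_test_demo = one_hot(test_labels, mapping)

print(y_train_demo.shape)  # (4, 3)
print(y_test_demo.shape)   # (2, 3)
```

Each target row has one entry per class, so the target array is 2D with shape (number of samples, 3). Reusing the mapping built from the training labels, rather than fitting a second one on the test labels, keeps the class indices consistent between the two sets.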
Prepare the word embeddings
tokenizer_obj = Tokenizer()
tokenizer_obj.fit_on_texts(X)
max_length = max([len(tweet.split()) for tweet in X])
print("max_length=%s" % (max_length))
vocab_size = len(tokenizer_obj.word_index) + 1
print("vocab_size=%s" % (vocab_size))
X_train_tokenized = tokenizer_obj.texts_to_sequences(X_train)
X_test_tokenized = tokenizer_obj.texts_to_sequences(X_test)
X_train_pad = pad_sequences(X_train_tokenized, maxlen=max_length, padding='post')
X_test_pad = pad_sequences(X_test_tokenized, maxlen=max_length, padding='post')
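To illustrate what the padding step produces, here is a NumPy-only sketch of post-padding. `pad_post` is a hypothetical helper written for this illustration; it assumes zero as the padding value and drops leading tokens when a sequence is longer than `maxlen`, which is my understanding of the defaults used by `pad_sequences`.

```python
import numpy as np

def pad_post(sequences, maxlen):
    # Start from an all-zero matrix; zeros act as the padding value.
    padded = np.zeros((len(sequences), maxlen), dtype="int32")
    for row, seq in enumerate(sequences):
        kept = seq[-maxlen:]             # keep the last maxlen tokens if too long
        padded[row, :len(kept)] = kept   # real tokens first, padding after
    return padded

demo = pad_post([[5, 2], [7, 1, 4, 9], [3]], maxlen=3)
print(demo.shape)  # (3, 3)
```

Every row ends up with the same length, so the network input is a 2D integer array of shape (number of tweets, max_length).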
Define and fit my neural network model
EMBEDDING_DIM = 100
model = Sequential()
model.add(Embedding(vocab_size, EMBEDDING_DIM, input_length=max_length))
model.add(Dense(8, input_dim=4, activation='relu'))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train_pad, y_train, batch_size=128, epochs=25, validation_data=(X_test_pad, y_test), verbose=2)
I chose 3 output units for the last layer because this is a multi-class classification task with 3 classes.
Here is the model summary:
Layer (type)                 Output Shape              Param #
=================================================================
embedding_1 (Embedding)      (None, 23, 100)           1488200
_________________________________________________________________
dense_1 (Dense)              (None, 23, 8)             808
_________________________________________________________________
dense_2 (Dense)              (None, 23, 3)             27
=================================================================
Total params: 1,489,035
Trainable params: 1,489,035
Non-trainable params: 0
_________________________________________________________________
When the code reaches model.fit(), I get the following error:
ValueError: Error when checking target: expected dense_2 to have 3 dimensions, but got array with shape (10980, 3)
What am I doing wrong?
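As you can see in the output of model.summary(), the model's output shape is (None, 23, 3), whereas you want it to be (None, 3). This happens because the Dense layer is applied to the last axis of its input and does not automatically flatten that input (if it has more than 2 dimensions). Therefore, one way to fix this is to add a Flatten layer after the Embedding layer: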
from keras.layers import Flatten  # not included in the imports in the question

model.add(Embedding(vocab_size, EMBEDDING_DIM, input_length=max_length))
model.add(Flatten())
This way the output of the Embedding layer is flattened, and the Dense layers that follow produce 2D outputs.
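The shape behaviour can be checked without Keras. The sketch below mimics a Dense layer as a matrix product over the last axis (bias and activation omitted), using random arrays as stand-ins for the Embedding output and the layer kernels; the batch size of 4 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, timesteps, embedding_dim, units = 4, 23, 100, 8

# Stand-in for the Embedding output: one 100-d vector per token position.
embedded = rng.normal(size=(batch, timesteps, embedding_dim))

# Dense applied directly: the kernel only spans the last axis,
# so the timestep axis is carried through unchanged.
kernel_3d = rng.normal(size=(embedding_dim, units))
dense_on_3d = embedded @ kernel_3d
print(dense_on_3d.shape)  # (4, 23, 8)

# Flatten first: each sample becomes one long vector of 23 * 100 values.
flattened = embedded.reshape(batch, -1)
kernel_2d = rng.normal(size=(timesteps * embedding_dim, units))
dense_on_flat = flattened @ kernel_2d
print(dense_on_flat.shape)  # (4, 8)
```

In the first case the timestep axis survives, which is why the model expected 3D targets. In the second case the output has one row per sample, which matches one-hot targets of shape (number of samples, 3).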
As a bonus (!), if you use an LSTM layer after the Embedding layer, you may get better accuracy:
model.add(Embedding(vocab_size, EMBEDDING_DIM, input_length=max_length))
model.add(LSTM(32))
model.add(Dense(8, activation='relu'))
model.add(Dense(3, activation='softmax'))
However, this is not guaranteed. You will have to experiment and tune your model properly.
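An LSTM layer also resolves the shape mismatch, because by default it returns only its final hidden state, one vector per sample. The following is a rough NumPy sketch of a standard LSTM forward pass with random, untrained weights; `lstm_last_state` is a hypothetical helper meant to show the output shape, not the Keras implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_last_state(inputs, units, seed=0):
    """Run a randomly initialised LSTM over `inputs` and return the final hidden state."""
    rng = np.random.default_rng(seed)
    batch, timesteps, features = inputs.shape
    # One weight matrix per source, holding the four gates side by side.
    w_x = rng.normal(scale=0.1, size=(features, 4 * units))
    w_h = rng.normal(scale=0.1, size=(units, 4 * units))
    bias = np.zeros(4 * units)
    h = np.zeros((batch, units))  # hidden state
    c = np.zeros((batch, units))  # cell state
    for t in range(timesteps):
        gates = inputs[:, t, :] @ w_x + h @ w_h + bias
        i, f, g, o = np.split(gates, 4, axis=1)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h  # only the state after the last timestep

embedded = np.ones((5, 23, 100))  # stand-in for the Embedding output
last_state = lstm_last_state(embedded, units=32)
print(last_state.shape)  # (5, 32)
```

The timestep axis is consumed by the loop, so the Dense layers that follow receive 2D input and the final output has shape (number of samples, 3).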
As mentioned in the previous answer, I would also suggest using an LSTM layer. Try this.
EMBEDDING_DIM = 100
model = Sequential()
model.add(Embedding(vocab_size, EMBEDDING_DIM, input_length=max_length))
model.add(LSTM(32))
model.add(Dense(8, activation='relu'))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train_pad, y_train, batch_size=128, epochs=25, validation_data=(X_test_pad, y_test), verbose=2)
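Also, for hidden layers we do not need to specify input_shape or input_dim in a Keras Sequential model. And yes, compared with plain Dense layers, an LSTM will train much more slowly, but it is worth the time.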
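Because each layer's input size is inferred from the layer before it, the parameter counts can be worked out by hand. The sketch below uses the usual formulas for Embedding, Dense and LSTM layers with biases; the helper names are illustrative, and the vocabulary size of 14882 is inferred from the 1,488,200 embedding parameters in the question's summary.

```python
def embedding_params(vocab_size, embedding_dim):
    # One embedding vector per vocabulary entry.
    return vocab_size * embedding_dim

def dense_params(input_size, units):
    # Kernel plus one bias per unit.
    return input_size * units + units

def lstm_params(input_size, units):
    # Four gates, each with an input kernel, a recurrent kernel and a bias.
    return 4 * (input_size * units + units * units + units)

vocab_size, embedding_dim = 14882, 100

# Model from the question: Embedding -> Dense(8) -> Dense(3)
original_total = (
    embedding_params(vocab_size, embedding_dim)
    + dense_params(embedding_dim, 8)
    + dense_params(8, 3)
)

# Model from this answer: Embedding -> LSTM(32) -> Dense(8) -> Dense(3)
lstm_total = (
    embedding_params(vocab_size, embedding_dim)
    + lstm_params(embedding_dim, 32)
    + dense_params(32, 8)
    + dense_params(8, 3)
)

print(original_total)  # 1489035
print(lstm_total)      # 1505515
```

The first total matches the summary in the question. The LSTM version adds 17,024 recurrent parameters, which together with the sequential processing of timesteps is part of why it trains more slowly.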