Training a FF Neural Language Model
Consider the character 3-grams of the sentence "The cat is upstairs", where each word is separated from the others by the @ and ~ symbols.
trigrams = ['@th', 'the', 'he~', '@ca', 'cat', 'at~', '@is', 'is~',
'@up', 'ups', 'pst', 'sta', 'tai', 'air', 'irs', 'rs~']
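For reference, a list like the one above could be produced with a small helper. This is only a sketch: the function name char_trigrams is hypothetical, and it assumes lowercasing and whitespace tokenization, which the question does not spell out.

```python
def char_trigrams(sentence):
    """Return the character 3-grams of each word, padded with '@' (start) and '~' (end)."""
    grams = []
    for word in sentence.lower().split():
        padded = "@" + word + "~"
        # Slide a window of width 3 over the padded word
        grams.extend(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


trigrams = char_trigrams("The cat is upstairs")
print(len(trigrams))  # 16
```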
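I want to use this sentence to train a character-based feed-forward neural language model, but I can't get the X and y arguments to fit correctly.
My code is as follows: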
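import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dense
from tensorflow.keras.utils import to_categorical
# trigrams encoded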
d = dict([(y,x+1) for x,y in enumerate(sorted(set(trigrams)))])
trigrams_encoded = [d[x] for x in trigrams]
# trigrams_encoded = [3, 15, 8, 1, 7, 6, 2, 10, 4, 16, 11, 13, 14, 5, 9, 12]
# x_train
x_train = [] # list of lists, each list contains 3 encoded trigrams
for i in range(len(trigrams_encoded)-3):
    lst = trigrams_encoded[i:i+3]
    x_train.append(lst)
x_train = np.array(x_train) # x_train shape is (13,3)
# y_train
y_train = trigrams_encoded[3:]
data = np.array(y_train)
y_onehot = to_categorical(data) # y_onehot shape is (13,17)
y_onehot = np.delete(y_onehot, 0, 1) # now shape is (13,16)
# define model
model = Sequential()
model.add(Embedding(len(d), 10, input_length=3)) #len(d) = 16
model.add(Flatten())
model.add(Dense(10, activation='relu'))
model.add(Dense(len(d), activation='softmax'))
# compile the model
# i have set sparse_categorical_crossentropy here, but not sure if this is correct. feel free to change it
model.compile(loss="sparse_categorical_crossentropy", optimizer='adam', metrics=['accuracy'])
# train the model
model.fit(x_train, y_onehot, epochs=1, verbose=0)
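My initial idea was that, since input_length=3, the model would take triples of the listed n-grams as input, each labelled with the next n-gram in the list. But this seems to fail. (Should it fail?)
The code above raises the following error, which I don't know how to resolve: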
"InvalidArgumentError: Graph execution error:
Detected at node 'sequential/embedding/embedding_lookup' defined at (most recent call last):
(... many lines...)
Node: 'sequential/embedding/embedding_lookup'
indices[5,1] = 16 is not in [0, 16)"
Could you please help me choose X and y correctly here?
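Your code runs fine when you use categorical_crossentropy as the loss function, because you are using one-hot encoded labels. The Embedding layer also needs input_dim=len(d) + 1: the encoded indices run from 1 to 16, while an Embedding with input_dim=16 only accepts indices 0 to 15, which is exactly what the InvalidArgumentError reports. With both changes applied: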
import numpy as np
import tensorflow as tf
trigrams = ['@th', 'the', 'he~', '@ca', 'cat', 'at~', '@is', 'is~',
'@up', 'ups', 'pst', 'sta', 'tai', 'air', 'irs', 'rs~']
# trigrams encoded
d = dict([(y,x+1) for x,y in enumerate(sorted(set(trigrams)))])
trigrams_encoded = [d[x] for x in trigrams]
# trigrams_encoded = [3, 15, 8, 1, 7, 6, 2, 10, 4, 16, 11, 13, 14, 5, 9, 12]
# x_train
x_train = [] # list of lists, each list contains 3 encoded trigrams
for i in range(len(trigrams_encoded)-3):
    lst = trigrams_encoded[i:i+3]
    x_train.append(lst)
x_train = np.array(x_train) # x_train shape is (13,3)
# y_train
y_train = trigrams_encoded[3:]
data = np.array(y_train)
y_onehot = tf.keras.utils.to_categorical(data) # y_onehot shape is (13,17)
y_onehot = np.delete(y_onehot, 0, 1) # now shape is (13,16)
# define model
model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(len(d) + 1, 10, input_length=3)) #len(d) = 16
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(10, activation='relu'))
model.add(tf.keras.layers.Dense(len(d), activation='softmax'))
model.compile(loss="categorical_crossentropy", optimizer='adam', metrics=['accuracy'])
# train the model
model.fit(x_train, y_onehot, epochs=5, verbose=1)
sparse_categorical_crossentropy only works with integer class labels (one class index per sample), not with one-hot vectors.
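If you would rather keep sparse_categorical_crossentropy, the labels can stay as integers instead of being one-hot encoded. The following is a minimal sketch of the data preparation only, assuming the same 1-based encoding as above; the names window and y_sparse are illustrative and not part of the original code.

```python
import numpy as np

trigrams = ['@th', 'the', 'he~', '@ca', 'cat', 'at~', '@is', 'is~',
            '@up', 'ups', 'pst', 'sta', 'tai', 'air', 'irs', 'rs~']

# Same 1-based encoding as in the code above
d = {t: i + 1 for i, t in enumerate(sorted(set(trigrams)))}
encoded = np.array([d[t] for t in trigrams])

window = 3  # assumed context size, matching input_length=3

# Each row of x_train holds 3 consecutive encoded trigrams
x_train = np.stack([encoded[i:i + window] for i in range(len(encoded) - window)])

# Integer class labels shifted to 0..15 so they index the 16 softmax outputs directly
y_sparse = encoded[window:] - 1

print(x_train.shape, y_sparse.shape)  # (13, 3) (13,)
print(x_train.max())                  # 16, hence input_dim = len(d) + 1
```

With labels in this form, passing y_sparse to model.fit and compiling with loss="sparse_categorical_crossentropy" should work with the same model, still using len(d) + 1 as the Embedding input dimension.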