LSTM Followed by Mean Pooling
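I am using Keras 1.0. My problem is the same as this one, but the answer there does not seem sufficient to me.
I want to implement this network:
The following code does not work: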
sequence = Input(shape=(max_sent_len,), dtype='int32')
embedded = Embedding(vocab_size, word_embedding_size)(sequence)
lstm = LSTM(hidden_state_size, activation='sigmoid', inner_activation='hard_sigmoid', return_sequences=True)(embedded)
pool = AveragePooling1D()(lstm)
output = Dense(1, activation='sigmoid')(pool)
If I don't set return_sequences=True, I get this error when I call AveragePooling1D():
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/PATH/keras/engine/topology.py", line 462, in __call__
    self.assert_input_compatibility(x)
  File "/PATH/keras/engine/topology.py", line 382, in assert_input_compatibility
    str(K.ndim(x)))
Exception: ('Input 0 is incompatible with layer averagepooling1d_6: expected ndim=3', ' found ndim=2')
Otherwise, I get this error when I call Dense():
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/PATH/keras/engine/topology.py", line 456, in __call__
    self.build(input_shapes[0])
  File "/fs/clip-arqat/mossaab/trec/liveqa/cmu/venv/lib/python2.7/site-packages/keras/layers/core.py", line 512, in build
    assert len(input_shape) == 2
AssertionError
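Both errors come down to tensor rank. The following is a rough NumPy sketch of the shapes involved, not Keras code: `avg_pool_1d` is a hypothetical stand-in for a non-overlapping 1D average pooling layer with a pool length of 2, and the sizes are made up for illustration.

```python
import numpy as np

batch_size, timesteps, hidden = 4, 6, 3

# Stand-in for the LSTM output with return_sequences=True: one vector per timestep.
seq_out = np.random.rand(batch_size, timesteps, hidden)
# With return_sequences=False only the last timestep is returned, so ndim == 2.
last_out = seq_out[:, -1, :]


def avg_pool_1d(x, pool_length=2):
    """Non-overlapping average pooling over the time axis (stride == pool_length)."""
    if x.ndim != 3:
        raise ValueError("expected ndim=3, found ndim=%d" % x.ndim)
    b, t, f = x.shape
    n = t // pool_length
    return x[:, :n * pool_length, :].reshape(b, n, pool_length, f).mean(axis=2)


pooled = avg_pool_1d(seq_out)       # (4, 3, 3): still 3D, so a 2D-only Dense rejects it
global_mean = seq_out.mean(axis=1)  # (4, 3): the mean over all timesteps

try:
    avg_pool_1d(last_out)           # 2D input: the first error above
    rejected_2d = False
except ValueError:
    rejected_2d = True

print(last_out.shape, pooled.shape, global_mean.shape, rejected_2d)
```

Pooling with a short window only shortens the sequence; to get one vector per sample, the average has to run over the whole time axis.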
Adding TimeDistributed(Dense(1)) helped:
sequence = Input(shape=(max_sent_len,), dtype='int32')
embedded = Embedding(vocab_size, word_embedding_size)(sequence)
lstm = LSTM(hidden_state_size, activation='sigmoid', inner_activation='hard_sigmoid', return_sequences=True)(embedded)
distributed = TimeDistributed(Dense(1))(lstm)
pool = AveragePooling1D()(distributed)
output = Dense(1, activation='sigmoid')(pool)
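I think the accepted answer is basically wrong. A solution was found here:
https://github.com/fchollet/keras/issues/2151
However, it only works with the theano backend. I have modified the code so that it supports both theano and tensorflow.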
from keras.engine.topology import Layer, InputSpec
from keras import backend as T


class TemporalMeanPooling(Layer):
    """
    This is a custom Keras layer. This pooling layer accepts the temporal
    sequence output by a recurrent layer and performs temporal pooling,
    looking at only the non-masked portion of the sequence. The pooling
    layer converts the entire variable-length hidden vector sequence
    into a single hidden vector, and then feeds its output to the Dense
    layer.

    input shape: (nb_samples, nb_timesteps, nb_features)
    output shape: (nb_samples, nb_features)
    """

    def __init__(self, **kwargs):
        super(TemporalMeanPooling, self).__init__(**kwargs)
        self.supports_masking = True
        self.input_spec = [InputSpec(ndim=3)]

    def get_output_shape_for(self, input_shape):
        return (input_shape[0], input_shape[2])

    def call(self, x, mask=None):  # mask: (nb_samples, nb_timesteps)
        if mask is None:
            # No mask given: treat every timestep as valid.
            mask = T.mean(T.ones_like(x), axis=-1)
        mask = T.cast(mask, T.floatx())
        # Zero out masked timesteps so they do not contribute to the sum.
        ssum = T.sum(x * T.expand_dims(mask), axis=-2)  # (nb_samples, nb_features)
        # Number of valid timesteps per sample.
        rcnt = T.sum(mask, axis=-1, keepdims=True)  # (nb_samples, 1)
        return ssum / rcnt

    def compute_mask(self, input, mask):
        # The time axis is gone after pooling, so no mask is propagated.
        return None
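To make the intended computation concrete, here is a minimal NumPy sketch of masked temporal mean pooling. It is an illustration only: `masked_temporal_mean` and the toy arrays are made-up names and data, not part of Keras.

```python
import numpy as np


def masked_temporal_mean(x, mask=None):
    """Average x over the time axis, counting only timesteps where mask == 1.

    x:    (nb_samples, nb_timesteps, nb_features)
    mask: (nb_samples, nb_timesteps) or None (all timesteps valid)
    """
    if mask is None:
        mask = np.ones(x.shape[:2], dtype=x.dtype)
    mask = mask.astype(x.dtype)
    ssum = (x * mask[:, :, None]).sum(axis=1)  # (nb_samples, nb_features)
    rcnt = mask.sum(axis=1, keepdims=True)     # (nb_samples, 1)
    return ssum / rcnt


# One sample, three timesteps, two features; the last timestep is padding.
x = np.array([[[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]])
mask = np.array([[1, 1, 0]])

masked = masked_temporal_mean(x, mask)  # averages the first two timesteps only
unmasked = masked_temporal_mean(x)      # averages all three timesteps
print(masked, unmasked)
```

With the mask, the padded timestep is ignored and the divisor is the number of valid timesteps rather than the padded sequence length.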
Thanks, I ran into this problem too, but I don't think the TimeDistributed layer works the way you want. You can try Luke Guye's TemporalMeanPooling layer; it works for me. Here is an example:
sequence = Input(shape=(max_sent_len,), dtype='int32')
embedded = Embedding(vocab_size, word_embedding_size)(sequence)
lstm = LSTM(hidden_state_size, return_sequences=True)(embedded)
pool = TemporalMeanPooling()(lstm)
output = Dense(1, activation='sigmoid')(pool)
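I just tried to implement the same model as the original poster, and I am using Keras 2.0.3. Mean pooling after the LSTM worked when I used GlobalAveragePooling1D; just make sure to set return_sequences=True in the LSTM layer. Give it a try!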
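For reference, the model from the question might look like this with GlobalAveragePooling1D. This is a sketch written against the tf.keras API rather than Keras 2.0.3, and the sizes (`max_sent_len`, `vocab_size`, and so on) are arbitrary values chosen for illustration.

```python
import numpy as np
import tensorflow as tf

# Illustrative sizes only; they are not taken from the question.
max_sent_len, vocab_size = 10, 100
word_embedding_size, hidden_state_size = 8, 16

sequence = tf.keras.Input(shape=(max_sent_len,), dtype="int32")
embedded = tf.keras.layers.Embedding(vocab_size, word_embedding_size)(sequence)
# return_sequences=True keeps one hidden vector per timestep (3D output).
lstm = tf.keras.layers.LSTM(hidden_state_size, return_sequences=True)(embedded)
# Averages over the time axis: (batch, timesteps, features) -> (batch, features).
pool = tf.keras.layers.GlobalAveragePooling1D()(lstm)
output = tf.keras.layers.Dense(1, activation="sigmoid")(pool)
model = tf.keras.Model(sequence, output)

# Two random token-id sequences, just to check the output shape.
tokens = np.random.randint(0, vocab_size, size=(2, max_sent_len))
predictions = model.predict(tokens, verbose=0)
print(predictions.shape)
```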
Quite late to the party, but tf.keras.layers.AveragePooling1D with a suitable pool_size parameter also seems to return the correct result.
Working with the example shared by bobchennan on this issue.
import numpy as np

# create sample data: 2 samples, 5 timesteps, 3 features
A = np.array([[1, 2, 3], [4, 5, 6], [0, 0, 0], [0, 0, 0], [0, 0, 0]])
B = np.array([[1, 3, 0], [4, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 0]])
C = np.array([A, B]).astype("float32")
# expected answer (for temporal mean)
np.mean(C, axis=1)
The output is:
array([[1. , 1.4, 1.8],
       [1. , 0.6, 0.2]], dtype=float32)
Now using AveragePooling1D:
import tensorflow as tf

# pool_size=5 matches the number of timesteps in C
model = tf.keras.models.Sequential([
    tf.keras.layers.AveragePooling1D(pool_size=5)
])
model.predict(C)
The output is:
array([[[1. , 1.4, 1.8]],
       [[1. , 0.6, 0.2]]], dtype=float32)
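A few points to consider:
- pool_size should be equal to the number of steps/timesteps of the recurrent layer.
- The output shape is (batch_size, downsampled_steps, features), which contains an extra downsampled_steps dimension. This will always be 1 if you set pool_size equal to the number of timesteps in the recurrent layer.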
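The extra dimension of size 1 can be dropped before feeding a layer that expects 2D input. Here is a small NumPy sketch of the idea; the mean with keepdims=True is used as a stand-in for pooling with pool_size equal to the number of timesteps. Inside a Keras model, a Flatten layer should serve the same purpose.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [0, 0, 0], [0, 0, 0], [0, 0, 0]])
B = np.array([[1, 3, 0], [4, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 0]])
C = np.array([A, B]).astype("float32")  # (batch_size=2, timesteps=5, features=3)

# Stand-in for average pooling with pool_size == timesteps:
# the time axis is kept, but with length 1.
pooled = C.mean(axis=1, keepdims=True)  # (2, 1, 3)

# Drop the length-1 downsampled_steps axis.
squeezed = np.squeeze(pooled, axis=1)   # (2, 3)
print(pooled.shape, squeezed.shape)
```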