How to prepare data for a many-to-one binary classification LSTM?
I have a time-series dataset of 38,000 different patients, containing 48 hours of physiological data with 30 features. Each patient therefore has 48 rows (one per hour) and a single binary outcome (0/1) recorded only at hour 48, so the full training set is 38,000 × 48 = 1,824,000 rows.
As I understand it, this is a many-to-one LSTM binary classification,
so my input shape should be (38000, 48, 30) (sample_size, time_steps, features),
and return_sequences should be set to False so that only the hidden output of the last time step is returned?
Could someone review my understanding of this?
Thanks.
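To make the shapes concrete, here is a minimal sketch of turning the flat 1,824,000-row table into the 3-D tensor an LSTM expects. It assumes the rows are already sorted by patient and then by hour; the zero arrays are placeholders for the real data.

```python
import numpy as np

# Assumed starting point: a flat table of 38,000 * 48 = 1,824,000 rows,
# sorted first by patient and then by hour, with 30 feature columns.
n_patients, time_steps, n_features = 38_000, 48, 30
flat = np.zeros((n_patients * time_steps, n_features), dtype=np.float32)  # placeholder data

# Stack each patient's 48 hourly rows into one sequence:
# (1_824_000, 30) -> (38_000, 48, 30)
X = flat.reshape(n_patients, time_steps, n_features)

# One 0/1 label per patient (taken from hour 48 only), shaped (patients, 1)
y = np.zeros((n_patients, 1), dtype=np.int32)  # placeholder labels

print(X.shape)  # (38000, 48, 30)
print(y.shape)  # (38000, 1)
```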
Yes, you basically have it right:
- input shape = (patients, 48, 30)
- target shape = (patients, 1)
You should use return_sequences=False in the last LSTM layer. (If there are more recurrent layers before the last LSTM, keep return_sequences=True in those.)
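The stacking rule above can be sketched with a small model; the layer sizes here (64 and 32 units) are illustrative choices, not anything prescribed by the question.

```python
from keras.models import Sequential
from keras.layers import Input, LSTM, Dense

# Every recurrent layer except the last keeps return_sequences=True so the
# next LSTM still receives a full sequence; the final LSTM returns only its
# last hidden state, which is what a many-to-one classifier needs.
model = Sequential([
    Input(shape=(48, 30)),             # (time_steps, features); batch size is implicit
    LSTM(64, return_sequences=True),   # emits (batch, 48, 64) for the next layer
    LSTM(32, return_sequences=False),  # emits (batch, 32): last time step only
    Dense(1, activation='sigmoid'),    # one probability for the 0/1 outcome
])
```

The model's output shape is (None, 1): one probability per patient, regardless of batch size.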
Yes, you are mostly on the right track. See the code below for a better understanding.
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Bidirectional

# number of physiological features per hour
total_features = 30
no_of_patients = 38_000
time_steps = 48

model = Sequential()

# You can also use a Bidirectional layer to speed up learning and reduce
# training time; in that case keep return_sequences=True here:
# model.add(Bidirectional(LSTM(
#     units=100,
#     input_shape=(time_steps, total_features),
#     return_sequences=True
# )))

# return_sequences should be False if there is only one LSTM layer.
# With multiple layers, only the last one has return_sequences=False.
# Note: input_shape excludes the batch dimension (no_of_patients).
model.add(LSTM(
    units=100,
    input_shape=(time_steps, total_features),
    return_sequences=False
))

# A single sigmoid unit pairs with binary_crossentropy for a 0/1 outcome
model.add(Dense(1, activation='sigmoid'))

model.compile(
    loss='binary_crossentropy',
    optimizer='rmsprop',
    metrics=['accuracy']
)
Let me know if anything in the code above is confusing or needs more explanation.
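For completeness, here is an end-to-end run on tiny dummy data (a real run would use X of shape (38000, 48, 30) and y of shape (38000, 1)); the sizes and hyperparameters here are illustrative placeholders.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Input, LSTM, Dense

# Tiny dummy dataset standing in for (38000, 48, 30) / (38000, 1)
n_patients, time_steps, n_features = 64, 48, 30
X = np.random.rand(n_patients, time_steps, n_features).astype('float32')
y = np.random.randint(0, 2, size=(n_patients, 1))

model = Sequential([
    Input(shape=(time_steps, n_features)),
    LSTM(16, return_sequences=False),   # many-to-one: last hidden state only
    Dense(1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

model.fit(X, y, epochs=1, batch_size=16, verbose=0)
probs = model.predict(X, verbose=0)  # shape (64, 1), probabilities in (0, 1)
```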