input/output/recurrent dropout layers in BiLSTM_Classifier and how they affect the model and prediction
I would like some understanding of, or information on, the input/output/recurrent dropout layers in BiLSTM_Classifier and how they affect the model and its predictions.
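# Imports are assumed here (the original snippet omitted them).
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dropout, Dense

# vocab_size, embedding_dim, maxlen and embedding_matrix are defined elsewhere.

# Output dropout: a separate Dropout layer applied to the BiLSTM output
model_out_dp = Sequential()
model_out_dp.add(Embedding(vocab_size, embedding_dim, input_length=maxlen, weights=[embedding_matrix], trainable=False))
model_out_dp.add(Bidirectional(LSTM(64)))
model_out_dp.add(Dropout(0.5))
model_out_dp.add(Dense(8, activation='softmax'))

# Input dropout: the LSTM `dropout` argument, applied to the transformation of the inputs
model_input_dp = Sequential()
model_input_dp.add(Embedding(vocab_size, embedding_dim, input_length=maxlen, weights=[embedding_matrix], trainable=False))
model_input_dp.add(Bidirectional(LSTM(64, dropout=0.5)))
model_input_dp.add(Dense(8, activation='softmax'))

# Recurrent dropout: the LSTM `recurrent_dropout` argument, applied to the transformation of the recurrent state
model_rec_dp = Sequential()
model_rec_dp.add(Embedding(vocab_size, embedding_dim, input_length=maxlen, weights=[embedding_matrix], trainable=False))
model_rec_dp.add(Bidirectional(LSTM(64, recurrent_dropout=0.5)))
model_rec_dp.add(Dense(8, activation='softmax'))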
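First, we split the 'S' and 'A' rows into groups according to the rule: each S starts a new, unique group and is followed by any number of A rows (including none). We also number the elements within each group in order of appearance: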
df['group'] = (df['First']=='S').cumsum()
df['el'] = df.groupby('group').cumcount()
This looks like:
    First    Second                                               group    el
--  -------  -------------------------------------------------  -------  ----
 0  S        Keeping the Secret of Genetic Testing                    1     0
 1  S        What is genetic risk ?                                   2     0
 2  S        Genetic risk refers more to your chance of inh...        3     0
 3  A        3 4|||Rloc-||||||REQUIRED|||-NONE-|||0                   3     1
 4  S        People get certain disease because of genetic ...        4     0
 5  A        1 2|||Wci|||develop|||REQUIRED|||-NONE-|||0              4     1
 6  A        3 4|||Nn|||diseases|||REQUIRED|||-NONE-|||0              4     2
 7  S        How much a genetic change tells us about your ...        5     0
 8  S        If your genetic results indicate that you have...        6     0
 9  A        8 8|||ArtOrDet|||the|||REQUIRED|||-NONE-|||0             6     1
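The grouping step can be tried on a small, made-up frame. This is only a sketch: the column names `First` and `Second` follow the answer, but the row contents are hypothetical stand-ins for the real sentences and annotations.

```python
import pandas as pd

# Hypothetical miniature of the data: 'S' rows are sentences,
# 'A' rows are annotations belonging to the preceding sentence.
df = pd.DataFrame({
    "First":  ["S", "S", "A", "S", "A", "A"],
    "Second": ["sentence 1", "sentence 2", "annotation 2a",
               "sentence 3", "annotation 3a", "annotation 3b"],
})

# The comparison is True on every 'S' row, so the cumulative sum goes up by
# one at each 'S' and stays constant over the 'A' rows that follow it.
df["group"] = (df["First"] == "S").cumsum()

# cumcount numbers the rows inside each group, starting at 0.
df["el"] = df.groupby("group").cumcount()

print(df)
```

Every 'S' row ends up with `el == 0`, and its annotations get `el` values 1, 2, and so on, which is what makes the later reshaping possible.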
Now we set a MultiIndex on 'group' and 'el' and then unstack 'el' into the column headers:
df.set_index(['group','el'])['Second'].unstack(level=1)
which looks like:
  group  0                                                  1                                             2
-------  -------------------------------------------------  --------------------------------------------  -------------------------------------------
      1  Keeping the Secret of Genetic Testing              nan                                           nan
      2  What is genetic risk ?                             nan                                           nan
      3  Genetic risk refers more to your chance of inh...  3 4|||Rloc-||||||REQUIRED|||-NONE-|||0        nan
      4  People get certain disease because of genetic ...  1 2|||Wci|||develop|||REQUIRED|||-NONE-|||0   3 4|||Nn|||diseases|||REQUIRED|||-NONE-|||0
      5  How much a genetic change tells us about your ...  nan                                           nan
      6  If your genetic results indicate that you have...  8 8|||ArtOrDet|||the|||REQUIRED|||-NONE-|||0  nan
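A minimal, self-contained sketch of the unstack step, again on hypothetical data with the `group` and `el` columns already filled in:

```python
import pandas as pd

# Hypothetical long-format frame; 'group' and 'el' are assumed to have been
# computed already (one group per sentence, 'el' numbering rows in the group).
df = pd.DataFrame({
    "Second": ["sentence 1", "sentence 2", "annotation 2a",
               "sentence 3", "annotation 3a", "annotation 3b"],
    "group":  [1, 2, 2, 3, 3, 3],
    "el":     [0, 0, 1, 0, 1, 2],
})

# Index by (group, el), keep only the text column, then pivot the inner
# index level ('el') into columns. Missing combinations become NaN.
wide = df.set_index(["group", "el"])["Second"].unstack(level=1)

print(wide)
```

Note that `unstack` needs each (group, el) pair to be unique, which `cumcount` guarantees; groups with fewer annotations than the longest one are padded with NaN.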
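The unstacked result looks a lot like what you want, except that you may want to change the column names with .rename(columns={...}) and, if you want to replace the NaN values with 0s, apply .fillna(0).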
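As an illustration of those last two touches, here is a small sketch. The new column names (`sentence`, `annotation_1`) are made up for the example; pick whatever names suit your data.

```python
import numpy as np
import pandas as pd

# Hypothetical wide frame shaped like the unstacked result:
# integer column labels 0, 1 and NaN where a group has no annotation.
wide = pd.DataFrame(
    {0: ["sentence 1", "sentence 2"], 1: [np.nan, "annotation 2a"]},
    index=pd.Index([1, 2], name="group"),
)

# Map the integer column labels to descriptive (made-up) names.
renamed = wide.rename(columns={0: "sentence", 1: "annotation_1"})

# Replace every NaN with 0, as suggested in the answer.
filled = renamed.fillna(0)

print(filled)
```

Both calls return new frames rather than modifying `wide` in place, so they can be chained directly after the `unstack` call.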