Custom dropout in tensorflow
I am training a DNN model on some data, and I hope to analyze the learned weights to learn something about the true system I am studying (signaling cascades in biology). I suppose one could say I am using artificial neural networks to learn about biological neural networks.
For each of my training examples, I deleted a single gene that is responsible for signaling in the top layer.
When I model this signaling cascade as a neural network, and delete a node in the first hidden layer, I realize I am doing a real-life version of dropout.
I would therefore like to use dropout to train my model, but the implementations of dropout I have seen online seem to drop nodes at random. What I need is a way to specify, for each training example, which node to drop.
Any advice on how to implement this? I am open to any package, but everything I have done so far is in Tensorflow, so I would appreciate a solution that uses that framework.
For those who prefer the details spelled out:
I have 10 input variables that are fully connected to 32 relu nodes in the first layer, which are fully connected to a second layer (relu), which is in turn fully connected to the output (linear, because I am doing regression).
In addition to the 10 input variables, I also happen to know which of the 28 nodes should be dropped out.
Is there a way to specify this during training?
Here is the code I am currently using:
import tflearn

num_stresses = 10
num_kinase = 32
num_transcription_factors = 200
num_genes = 6692
# Build neural network
# Input variables (10)
# Which Node to dropout (32)
stress = tflearn.input_data(shape=[None, num_stresses])
kinase_deletion = tflearn.input_data(shape=[None, num_kinase])
# This is the layer that I want to perform selective dropout on,
# I should be able to specify which of the 32 nodes should output zero
# based on a 1X32 vector of ones and zeros.
kinase = tflearn.fully_connected(stress, num_kinase, activation='relu')
transcription_factor = tflearn.fully_connected(kinase, num_transcription_factors, activation='relu')
gene = tflearn.fully_connected(transcription_factor, num_genes, activation='linear')
adam = tflearn.Adam(learning_rate=0.00001, beta1=0.99)
regression = tflearn.regression(gene, optimizer=adam, loss='mean_square', metric='R2')
# Define model
model = tflearn.DNN(regression, tensorboard_verbose=1)
I would feed in your input variables along with a vector of equal size that is all ones, except for a zero at the position of the node you want to drop.
The first operation should then be a multiplication that zeroes out the gene to be dropped. From there on, it should be exactly the same as what you have now.
You can either do the multiplication (zero out the gene) before passing the data to tensorflow, or add another placeholder and feed the mask into the graph via the feed_dict, just like you do with your variables. The latter is probably better; a minimal sketch of that approach follows below.
If you need to drop out a hidden node (in layer 2), it is just another vector of ones and zeros.
Let me know if that works or if you need more help.
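Here is a minimal plain-tensorflow sketch of the placeholder-plus-feed_dict approach, assuming a TF 1.x-style graph; the names x and drop_mask and the index 3 are illustrative only, not taken from your code:
import numpy as np
import tensorflow as tf

num_inputs = 10
x = tf.placeholder(tf.float32, shape=[None, num_inputs])
drop_mask = tf.placeholder(tf.float32, shape=[None, num_inputs])  # 1 = keep, 0 = drop

x_dropped = tf.multiply(x, drop_mask)  # zeroes out the chosen node, per example
# ... build the rest of the network on x_dropped ...

with tf.Session() as sess:
    batch = np.random.rand(4, num_inputs).astype('float32')
    mask = np.ones_like(batch)
    mask[:, 3] = 0  # drop input node 3 for every example in this batch
    print(sess.run(x_dropped, feed_dict={x: batch, drop_mask: mask}))
Because the mask is fed per example, each row of the batch can drop a different node.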
Edit:
OK, so I haven't really worked with tflearn much (I have just done regular tensorflow), but I think you can combine tensorflow and tflearn. Basically, I added tf.multiply. You may have to add another tflearn.input_data(shape=[num_stresses]) and tflearn.input_data(shape=[num_kinase]) to give you placeholders for stresses_dropout_vector and kinase_dropout_vector. And of course, you can change the number and location of the zeros in those two vectors.
import tensorflow as tf ###### New ######
import tflearn
num_stresses = 10
num_kinase = 32
num_transcription_factors = 200
num_genes = 6692
stresses_dropout_vector = [1] * num_stresses ###### NEW ######
stresses_dropout_vector[desired_node_to_drop] = 0 ###### NEW ###### (you must set this index yourself)
kinase_dropout_vector = [1] * num_kinase ###### NEW ######
kinase_dropout_vector[desired_hidden_node_to_drop] = 0 ###### NEW ###### (you must set this index yourself)
# Build neural network
# Input variables (10)
# Which Node to dropout (32)
stress = tflearn.input_data(shape=[None, num_stresses])
kinase_deletion = tflearn.input_data(shape=[None, num_kinase])
# This is the layer that I want to perform selective dropout on,
# I should be able to specify which of the 32 nodes should output zero
# based on a 1X32 vector of ones and zeros.
stress_dropout = tf.multiply(stress, stresses_dropout_vector) ###### NEW ###### Drops out an input
kinase = tflearn.fully_connected(stress_dropout, num_kinase, activation='relu') ### changed stress to stress_dropout
kinase_dropout = tf.multiply(kinase, kinase_dropout_vector) ###### NEW ###### Drops out a hidden node
transcription_factor = tflearn.fully_connected(kinase_dropout, num_transcription_factors, activation='relu') ### changed kinase to kinase_dropout
gene = tflearn.fully_connected(transcription_factor, num_genes, activation='linear')
adam = tflearn.Adam(learning_rate=0.00001, beta1=0.99)
regression = tflearn.regression(gene, optimizer=adam, loss='mean_square', metric='R2')
# Define model
model = tflearn.DNN(regression, tensorboard_verbose=1)
If the tensorflow one doesn't work, you just have to find a regular old tflearn multiply function that does an element-wise multiplication of two given tensors/vectors; a guess at what that might look like is sketched below.
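From memory (so double-check against your tflearn version), tflearn's merge op with mode='elemwise_mul' should do it:
import tflearn

# kinase and kinase_deletion as defined in the snippet above
kinase_dropout = tflearn.merge([kinase, kinase_deletion], mode='elemwise_mul')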
Hope that helps.
For completeness, here is my final implementation:
import numpy as np
import pandas as pd
import tflearn
import tensorflow as tf
meta = pd.read_csv('../../input/nn/meta.csv')
experiments = meta["Unnamed: 0"]
del meta["Unnamed: 0"]
stress_one_hot = pd.get_dummies(meta["train"])
kinase_deletion = pd.get_dummies(meta["Strain"])
kinase_one_hot = 1 - kinase_deletion  # invert the one-hot deletion indicator into a keep-mask (1 = keep, 0 = dropped)
expression = pd.read_csv('../../input/nn/data.csv')
genes = expression["Unnamed: 0"]
del expression["Unnamed: 0"] # This holds the gene names just so you know...
expression = expression.transpose()
# Set up data for tensorflow
# Gene expression
target = np.array(expression, dtype='float32')
target_mean = target.mean(axis=0, keepdims=True)
target_std = target.std(axis=0, keepdims=True)
target = target - target_mean
target = target / target_std
# Stress information
data1 = np.array(stress_one_hot, dtype='float32')
data_mean = data1.mean(axis=0, keepdims=True)
data_std = data1.std(axis=0, keepdims=True)
data1 = data1 - data_mean
data1 = data1 / data_std
# Kinase information
data2 = np.array(kinase_one_hot, dtype='float32')
# For Reference
# data1.shape
# #(301, 10)
# data2.shape
# #(301, 29)
# Build the Neural Network
num_stresses = 10
num_kinase = 29
num_transcription_factors = 200
num_genes = 6692
# Build neural network
# Input variables (10)
# Which Node to dropout (29)
stress = tflearn.input_data(shape=[None, num_stresses])
kinase_deletion = tflearn.input_data(shape=[None, num_kinase])
# This is the layer that I want to perform selective dropout on,
# I should be able to specify which of the 29 nodes should output zero
# based on a 1x29 vector of ones and zeros.
kinase = tflearn.fully_connected(stress, num_kinase, activation='relu')
kinase_dropout = tf.multiply(kinase, kinase_deletion)  # tf.mul in pre-1.0 TensorFlow
transcription_factor = tflearn.fully_connected(kinase_dropout, num_transcription_factors, activation='relu')
gene = tflearn.fully_connected(transcription_factor, num_genes, activation='linear')
adam = tflearn.Adam(learning_rate=0.00001, beta1=0.99)
regression = tflearn.regression(gene, optimizer=adam, loss='mean_square', metric='R2')
# Define model
model = tflearn.DNN(regression, tensorboard_verbose=1)
# Start training (apply gradient descent algorithm)
model.fit([data1, data2], target, n_epoch=20000, show_metric=True, shuffle=True)#,validation_set=0.05)
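One note on using the trained model: because the deletion mask is just a second input, predicting for a strain with no kinase deleted amounts to feeding an all-ones mask. A small sketch, reusing the variable names from the code above (the random stress profile is placeholder data):
import numpy as np

new_stress = np.random.rand(1, num_stresses).astype('float32')  # placeholder input
no_deletion = np.ones((1, num_kinase), dtype='float32')  # 1 = keep every hidden node

prediction = model.predict([new_stress, no_deletion])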