RNN sequence learning

I am new to RNN prediction with TensorFlow. I am trying to use an RNN with BasicLSTMCell to predict sequences such as

1,2,3,4,5 ->6
3,4,5,6,7 ->8
35,36,37,38,39 ->40

My code runs without errors, but the output seems to be the same for every batch, and the cost does not appear to decrease during training.

When I divide all the training data by 100

0.01,0.02,0.03,0.04,0.05 ->0.06
0.03,0.04,0.05,0.06,0.07 ->0.08 
0.35,0.36,0.37,0.38,0.39 ->0.40

the results are quite good, and the predicted values correlate highly with the actual values (0.9998).

I suspect the problem has to do with integers versus floats, but I cannot explain why. Can anyone help? Many thanks!!

Here is the code:

library(tensorflow)
start=sample(1:1000, 100000, T)
start1= start+1
start2=start1+1
start3= start2+1
start4=start3+1
start5= start4+1
start6=start5+1
label=start6+1
data=data.frame(start, start1, start2, start3, start4, start5, start6, label)
data=as.matrix(data)
n = nrow(data)
trainIndex = sample(1:n, size = round(0.7*n), replace=FALSE)
train = data[trainIndex ,]
test = data[-trainIndex ,]
train_data= train[,1:7]
train_label= train[,8]
means=apply(train_data, 2, mean)
sds= apply(train_data, 2, sd)
train_data = sweep(sweep(train_data, 2, means, "-"), 2, sds, "/")  # sweep applies column stats correctly; plain matrix - vector would recycle down the columns
test_data=test[,1:7]
test_data = sweep(sweep(test_data, 2, means, "-"), 2, sds, "/")    # scale test data with the training-set statistics
test_label=test[,8]
batch_size = 50L            
n_inputs = 1L               # one value per time step
n_steps = 7L                # time steps
n_hidden_units = 10L        # neurons in hidden layer
n_outputs = 1L              # single regression output
x = tf$placeholder(tf$float32, shape(NULL, n_steps, n_inputs))
y = tf$placeholder(tf$float32, shape(NULL, 1L))
weights_in= tf$Variable(tf$random_normal(shape(n_inputs, n_hidden_units)))
weights_out= tf$Variable(tf$random_normal(shape(n_hidden_units, 1L)))
biases_in=tf$Variable(tf$constant(0.1, shape= shape(n_hidden_units )))
biases_out = tf$Variable(tf$constant(0.1, shape=shape(1L)))
RNN=function(X, weights_in, weights_out, biases_in, biases_out)
{
    X = tf$reshape(X, shape=shape(-1, n_inputs))
    X_in = tf$sigmoid (tf$matmul(X, weights_in) + biases_in)
    X_in = tf$reshape(X_in, shape=shape(-1, n_steps, n_hidden_units))
    lstm_cell = tf$contrib$rnn$BasicLSTMCell(n_hidden_units, forget_bias=1.0, state_is_tuple=T)
    init_state = lstm_cell$zero_state(batch_size, dtype=tf$float32)
    outputs_final_state = tf$nn$dynamic_rnn(lstm_cell, X_in, initial_state=init_state, time_major=F)
    outputs= tf$unstack(tf$transpose(outputs_final_state[[1]], shape(1,0,2)))
    results =  tf$matmul(outputs[[length(outputs)]], weights_out) + biases_out
    return(results)
}
pred = RNN(x, weights_in, weights_out, biases_in, biases_out)
cost = tf$losses$mean_squared_error(pred, y)
train_op = tf$contrib$layers$optimize_loss(loss=cost, global_step=tf$contrib$framework$get_global_step(), learning_rate=0.05, optimizer="SGD")
init <- tf$global_variables_initializer()
sess <- tf$Session()
sess$run(init)
step = 0
while (step < 1000)
{
  train_data2 = train_data[(step*batch_size+1):(step*batch_size+batch_size), ]
  train_label2 = train_label[(step*batch_size+1):(step*batch_size+batch_size)]
  batch_xs <- sess$run(tf$reshape(train_data2, shape(batch_size, n_steps, n_inputs))) # reshape to (batch, steps, inputs)
  batch_ys = matrix(train_label2, ncol=1)
  sess$run(train_op, feed_dict = dict(x = batch_xs, y = batch_ys))
  mycost <- sess$run(cost, feed_dict = dict(x = batch_xs, y = batch_ys))
  print(mycost)
  test_data2 = test_data[(0*batch_size+1):(0*batch_size+batch_size), ]
  test_label2 = test_label[(0*batch_size+1):(0*batch_size+batch_size)]
  batch_xs <- sess$run(tf$reshape(test_data2, shape(batch_size, n_steps, n_inputs))) # reshape test batch
  batch_ys = matrix(test_label2, ncol=1)
  step = step + 1
}

First of all, it is always useful to normalize your network inputs (there are different approaches: dividing by the maximum, subtracting the mean and dividing by the standard deviation, and so on). This helps your optimizer a great deal.
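A minimal base-R sketch of mean/sd standardization, using made-up data (the matrix shapes and `set.seed` value here are arbitrary, not from the question). The key detail is that the test set must be scaled with the training set's statistics:

```r
set.seed(42)
# Hypothetical data: 10 sequences of 7 steps each
train_data <- matrix(sample(1:1000, 70, replace = TRUE), ncol = 7)
test_data  <- matrix(sample(1:1000, 70, replace = TRUE), ncol = 7)

# Column-wise statistics from the TRAINING data only
means <- apply(train_data, 2, mean)
sds   <- apply(train_data, 2, sd)

# Standardize both sets with the same statistics;
# sweep() applies the vector column-by-column, avoiding R's recycling pitfall
train_scaled <- sweep(sweep(train_data, 2, means, "-"), 2, sds, "/")
test_scaled  <- sweep(sweep(test_data,  2, means, "-"), 2, sds, "/")

colMeans(train_scaled)          # ~0 in every column
apply(train_scaled, 2, sd)      # ~1 in every column
```

After this, every input column is on a comparable scale near zero, which is where gradient-based optimizers (and saturating activations) behave best.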

Second, and actually most important in your case, is the sigmoid your network applies to the input projection (`tf$sigmoid(tf$matmul(X, weights_in) + biases_in)`). If you check the graph of the sigmoid function, you will see that it squashes every input into the range (0, 1) and saturates for large values: raw integers like 35, 36, 37 all come out as essentially 1, so the LSTM cannot tell them apart and produces the same output for every batch. That is exactly why dividing by 100 helps: the scaled inputs stay in the sigmoid's sensitive region. For the same reason, you should not use any activation function on the output layer of a regression problem, since a sigmoid there would cap every prediction at 1 no matter how large the target is.
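To see the saturation concretely, here is a small base-R sketch, independent of TensorFlow (the sample values are taken from the question's third sequence): the raw integers 35–39 all map to practically the same activation, while their /100-scaled versions remain distinguishable.

```r
# Sigmoid squashes everything into (0, 1) and saturates for large inputs
sigmoid <- function(x) 1 / (1 + exp(-x))

raw    <- c(35, 36, 37, 38, 39)   # unnormalized integer inputs
scaled <- raw / 100               # the inputs that "worked"

sigmoid(raw)     # all ~1: the LSTM sees five nearly identical values
sigmoid(scaled)  # ~0.587 .. 0.596: still distinguishable

# Spread of the activations after the sigmoid:
diff(range(sigmoid(raw)))     # numerically zero
diff(range(sigmoid(scaled)))  # ~0.01, a usable signal
```

With zero spread at the input projection, every sequence in a batch looks identical to the LSTM, which matches the symptom of a constant output and a cost that never decreases.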

Hope this helps.