Getting loss value as 0 when training a neural network
I am not sure if I should paste the whole code here, but here it is:
import tensorflow as tf
import numpy as np
import requests
from sklearn.model_selection import train_test_split
BATCH_SIZE = 20
#Get data
birthdata_url = 'http://springer.bme.gatech.edu/Ch17.Logistic/Logisticdat/lowbwt.dat'
birth_file = requests.get(birthdata_url)
birth_data = birth_file.text.split('\r\n')[5:]
birth_data = np.array([[x for x in y.split(' ') if len(x)>=1] for y in birth_data[1:] if len(y)>=1])
#Get x and y vals
y_vals = np.array([x[1] for x in birth_data]).reshape((-1,1))
x_vals = np.array([x[2:10] for x in birth_data])
#Split data
x_train, x_test, y_train, y_test = train_test_split(x_vals,y_vals,test_size=0.3)
#Placeholders
x_data = tf.placeholder(dtype=tf.float32,shape=[None,8])
y_data = tf.placeholder(dtype=tf.float32,shape=[None,1])
#Define our Neural Network
def init_weight(shape):
    return tf.Variable(tf.truncated_normal(shape=shape,stddev=0.1))
def init_bias(shape):
    return tf.Variable(tf.constant(0.1,shape=shape))
def fully_connected(inp_layer,weights,biases):
    return tf.nn.relu(tf.matmul(inp_layer,weights)+biases)
def nn(x):
    w1 = init_weight([8,25])
    b1 = init_bias([25])
    layer1 = fully_connected(x,w1,b1)
    w2 = init_weight([25,10])
    b2 = init_bias([10])
    layer2 = fully_connected(layer1,w2,b2)
    w3 = init_weight([10,3])
    b3 = init_bias([3])
    layer3 = fully_connected(layer2,w3,b3)
    w4 = init_weight([3,1])
    b4 = init_bias([1])
    final_output = fully_connected(layer3,w4,b4)
    return final_output
#Predicted values.
y_ = nn(x_data)
#Loss and training step.
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_data,logits=y_))
train_step = tf.train.AdamOptimizer(0.1).minimize(loss)
#Initialize session and global variables
sess = tf.Session()
sess.run(tf.global_variables_initializer())
#Accuracy
def get_accuracy(logits,labels):
    batch_predictions = np.argmax(logits,axis=1)
    num_correct = np.sum(np.equal(batch_predictions,labels))
    return(100*num_correct/batch_predictions.shape[0])
loss_vec = []
for i in range(500):
    #Get random indexes and create batches.
    rand_index = np.random.choice(len(x_train),size=BATCH_SIZE)
    #x and y batch.
    rand_x = x_train[rand_index]
    rand_y = y_train[rand_index]
    #Run the training step.
    sess.run(train_step,feed_dict={x_data:rand_x,y_data:rand_y})
    #Get the current loss.
    temp_loss = sess.run(loss,feed_dict={x_data:x_test,y_data:y_test})
    loss_vec.append(temp_loss)
    if (i+1)%20 == 0:
        print("Current Step is: {}, Loss: {}".format((i+1), temp_loss))
        #print("-----Test Accuracy: {}-----".format(get_accuracy(logits=sess.run(y_,feed_dict={x_data:x_test}),labels=y_test)))
When I run the program, the loss is always 0 during training. I am not sure where the problem is, but I have a few ideas.
1) Could it be the way I create the batches? It seems like an unusual approach, but as far as I can tell it should work, since it picks random indices with rand_index = np.random.choice(len(x_train),size=BATCH_SIZE).
2) This one does not really make sense, but could it be because the data is "small data"?
3) Is there a simple mistake somewhere in the code?
4) Or do I really have a loss of 0? (This is the least likely case.)
I would also appreciate it if you could point out anything in the code above that I should avoid doing.
Thank you.
I ran your code, and these are the problems I found:
- Your input data appears to be strings. You should convert it to floats.
- Do not use relu on the last layer. The output should feed straight into the loss function, with no nonlinearity.
- You should use the sigmoid_cross_entropy_with_logits function instead of the softmax one. Sigmoid is for binary classification, softmax is for multi-class classification. With a single output unit, softmax is always 1, so the cross-entropy is identically 0, which is exactly the loss you are seeing.
- Your learning rate may also be too high. I would try a lower one (all four fixes are applied in the sketch below).
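For reference, here is a minimal sketch of the question's script with those four fixes applied, assuming the same TF 1.x API and data layout as in the question. The 0.01 learning rate is only a starting guess, not a tuned value:

import tensorflow as tf
import numpy as np
import requests
from sklearn.model_selection import train_test_split

BATCH_SIZE = 20

birthdata_url = 'http://springer.bme.gatech.edu/Ch17.Logistic/Logisticdat/lowbwt.dat'
birth_file = requests.get(birthdata_url)
birth_data = birth_file.text.split('\r\n')[5:]
birth_data = np.array([[x for x in y.split(' ') if len(x)>=1] for y in birth_data[1:] if len(y)>=1])

#Fix 1: the parsed fields are strings, so cast them to floats.
y_vals = np.array([x[1] for x in birth_data]).astype(np.float32).reshape((-1,1))
x_vals = np.array([x[2:10] for x in birth_data]).astype(np.float32)

x_train, x_test, y_train, y_test = train_test_split(x_vals, y_vals, test_size=0.3)

x_data = tf.placeholder(dtype=tf.float32, shape=[None,8])
y_data = tf.placeholder(dtype=tf.float32, shape=[None,1])

def init_weight(shape):
    return tf.Variable(tf.truncated_normal(shape=shape, stddev=0.1))
def init_bias(shape):
    return tf.Variable(tf.constant(0.1, shape=shape))
def fully_connected(inp_layer, weights, biases):
    return tf.nn.relu(tf.matmul(inp_layer, weights) + biases)

def nn(x):
    layer1 = fully_connected(x, init_weight([8,25]), init_bias([25]))
    layer2 = fully_connected(layer1, init_weight([25,10]), init_bias([10]))
    layer3 = fully_connected(layer2, init_weight([10,3]), init_bias([3]))
    #Fix 2: no relu on the output layer; return the raw logits.
    return tf.matmul(layer3, init_weight([3,1])) + init_bias([1])

y_ = nn(x_data)
#Fix 3: sigmoid cross-entropy for a single binary label.
loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y_data, logits=y_))
#Fix 4: a lower learning rate (0.01 is an untuned guess).
train_step = tf.train.AdamOptimizer(0.01).minimize(loss)

sess = tf.Session()
sess.run(tf.global_variables_initializer())

for i in range(500):
    rand_index = np.random.choice(len(x_train), size=BATCH_SIZE)
    sess.run(train_step, feed_dict={x_data: x_train[rand_index], y_data: y_train[rand_index]})
    if (i+1) % 20 == 0:
        temp_loss = sess.run(loss, feed_dict={x_data: x_test, y_data: y_test})
        print("Step: {}, Loss: {}".format(i+1, temp_loss))

With the relu and the single-output softmax removed, the loss starts at a nonzero value and can actually decrease during training.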