Program hangs on Estimator.evaluate in TensorFlow 1.6
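As a learning exercise, I am trying to do something simple.
I have two training CSV files:
One file has 36 columns of 0s and 1s (3500 records). I think of each row as a flattened 6x6 matrix.
The other CSV file has 1 column of ground-truth values, 0 or 1 (3500 records), indicating whether at least 4 of the 6 elements on the diagonal of the 6x6 matrix are 1.
I also have two test CSV files with the same structure as the training files, except that each has 500 records.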
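The labelling rule described above can be sketched in plain Python. This is only an illustration, not the script that produced the actual CSV files: it assumes the 36 columns hold the matrix in row-major order and that "the diagonal" means the main diagonal. The names diagonal_label and make_records are hypothetical.

```python
import random

N_DIMS = 6  # side length of the 6x6 matrix from the question


def diagonal_label(row, n_dims=N_DIMS, threshold=4):
    """Return 1 if at least `threshold` main-diagonal entries are 1.

    `row` is the flattened matrix (assumed row-major), so the diagonal
    entry of matrix row i sits at index i * n_dims + i.
    """
    diagonal = [row[i * n_dims + i] for i in range(n_dims)]
    return 1 if sum(diagonal) >= threshold else 0


def make_records(n_records, seed=0):
    """Generate random 0/1 rows and their labels (hypothetical data)."""
    rng = random.Random(seed)
    xs = [[rng.randint(0, 1) for _ in range(N_DIMS ** 2)]
          for _ in range(n_records)]
    ys = [diagonal_label(row) for row in xs]
    return xs, ys
```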
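When I step through the program with a debugger, it seems that...
estimator.train(
    input_fn=lambda: get_inputs(x_paths=[x_train_file], y_paths=[y_train_file], batch_size=32), steps=100)
...runs fine. I see files in the checkpoint directory and a plot of the loss function in TensorBoard.
But when the program reaches...
eval_result = estimator.evaluate(
    input_fn=lambda: get_inputs(x_paths=[x_test_file], y_paths=[y_test_file], batch_size=32))
...it just hangs.
I have checked the test files, and I also tried running estimator.evaluate with the training files. It still hangs.
I am using TensorFlow 1.6 and Python 3.6.
The full code follows: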
import tensorflow as tf
import os
import numpy as np

x_train_file = os.path.join('D:', 'Diag', '6x6_train.csv')
y_train_file = os.path.join('D:', 'Diag', 'HasDiag_train.csv')
x_test_file = os.path.join('D:', 'Diag', '6x6_test.csv')
y_test_file = os.path.join('D:', 'Diag', 'HasDiag_test.csv')
model_chkpt = os.path.join('D:', 'Diag', "checkpoints")


def get_inputs(
        count=None, shuffle=True, buffer_size=1000, batch_size=32,
        num_parallel_calls=8, x_paths=[x_train_file], y_paths=[y_train_file]):
    """
    Get x, y inputs.

    Args:
        count: number of epochs. None indicates infinite epochs.
        shuffle: whether or not to shuffle the dataset
        buffer_size: used in shuffle
        batch_size: size of batch. See outputs below
        num_parallel_calls: used in map. Note if > 1, intra-batch ordering
            will be shuffled
        x_paths: list of paths to x-value files.
        y_paths: list of paths to y-value files.

    Returns:
        x: (batch_size, 6, 6) tensor
        y: (batch_size, 2) tensor of 1-hot labels
    """
    def x_map(line):
        n_dims = 6
        columns = [str(i1) for i1 in range(n_dims**2)]
        # Decode the line into its fields
        fields = tf.decode_csv(line, record_defaults=[[0]] * (n_dims ** 2))
        # Pack the result into a dictionary
        features = dict(zip(columns, fields))
        return features

    def y_map(line):
        y_row = tf.string_to_number(line, out_type=tf.int32)
        return y_row

    def xy_map(x, y):
        return x_map(x), y_map(y)

    x_ds = tf.data.TextLineDataset(x_train_file)
    y_ds = tf.data.TextLineDataset(y_train_file)
    combined = tf.data.Dataset.zip((x_ds, y_ds))
    combined = combined.repeat(count=count)
    if shuffle:
        combined = combined.shuffle(buffer_size)
    combined = combined.map(xy_map, num_parallel_calls=num_parallel_calls)
    combined = combined.batch(batch_size)
    x, y = combined.make_one_shot_iterator().get_next()
    return x, y


columns = [str(i1) for i1 in range(6 ** 2)]
feature_columns = [
    tf.feature_column.numeric_column(name)
    for name in columns]
estimator = tf.estimator.DNNClassifier(feature_columns=feature_columns,
                                       hidden_units=[18, 9],
                                       activation_fn=tf.nn.relu,
                                       n_classes=2,
                                       model_dir=model_chkpt)
estimator.train(
    input_fn=lambda: get_inputs(x_paths=[x_train_file], y_paths=[y_train_file], batch_size=32), steps=100)
eval_result = estimator.evaluate(
    input_fn=lambda: get_inputs(x_paths=[x_test_file], y_paths=[y_test_file], batch_size=32))
print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result))
Two parameters combine to cause this:
tf.data.Dataset.repeat has a count parameter:
"count: (Optional.) A tf.int64 scalar tf.Tensor, representing the number of times the dataset should be repeated. The default behavior (if count is None or -1) is for the dataset be repeated indefinitely."
In your case, count is always None, so the dataset repeats indefinitely.
tf.estimator.Estimator.evaluate has a steps parameter:
"steps: Number of steps for which to evaluate model. If None, evaluates until input_fn raises an end-of-input exception."
You set steps for training but not for evaluation, so the estimator keeps running until input_fn raises an end-of-input exception, which, as described above, never happens.
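The interaction of these two defaults can be mimicked without TensorFlow. The sketch below is only an analogy built on Python iterators; repeat_dataset and evaluate are hypothetical stand-ins, not the library's implementation.

```python
import itertools


def repeat_dataset(records, count=None):
    """Stand-in for Dataset.repeat: None or -1 means repeat forever."""
    if count is None or count == -1:
        return itertools.cycle(records)
    return itertools.chain.from_iterable(itertools.repeat(records, count))


def evaluate(batches, steps=None):
    """Stand-in for Estimator.evaluate: count the batches consumed.

    Stops after `steps` batches or when the input ends, whichever comes
    first. With steps=None and an endless input it never returns.
    """
    return sum(1 for _ in itertools.islice(batches, steps))


records = ["a", "b", "c"]
print(evaluate(repeat_dataset(records, count=1)))  # finite input: prints 3
print(evaluate(repeat_dataset(records), steps=5))  # bounded by steps: prints 5
# evaluate(repeat_dataset(records))  # endless input and no bound: hangs
```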
You should set one of the two, either count or steps. I think count=1 is the most reasonable choice for evaluation.
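Either fix might look like the sketch below. The wrappers evaluate_one_pass and evaluate_fixed_steps are hypothetical helpers, introduced here only to put the two options side by side; they take the question's estimator and get_inputs as arguments. Passing shuffle=False is an extra assumption that shuffling is not needed during evaluation.

```python
import math


def evaluate_one_pass(estimator, get_inputs, x_path, y_path, batch_size=32):
    """Option 1: make the input finite with count=1, so evaluate() stops
    when input_fn signals end of input."""
    return estimator.evaluate(
        input_fn=lambda: get_inputs(
            count=1, shuffle=False, batch_size=batch_size,
            x_paths=[x_path], y_paths=[y_path]))


def evaluate_fixed_steps(estimator, get_inputs, x_path, y_path,
                         n_records, batch_size=32):
    """Option 2: leave the input repeating, but cap the number of batches."""
    # Enough batches to cover every record once: 500 records at 32 -> 16.
    steps = math.ceil(n_records / batch_size)
    return estimator.evaluate(
        input_fn=lambda: get_inputs(
            batch_size=batch_size, x_paths=[x_path], y_paths=[y_path]),
        steps=steps)
```

With the question's objects, the first option would be called as evaluate_one_pass(estimator, get_inputs, x_test_file, y_test_file). The second option reads 16 batches of 32, that is 512 records from a 500-record file, so some records are counted twice; that is one reason to prefer count=1. Separately, get_inputs as posted opens x_train_file and y_train_file and never uses x_paths or y_paths, so evaluation would still read the training data until those two TextLineDataset lines use the arguments instead.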