Using a TensorFlow input pipeline with skflow/tf learn

Question

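I have followed the TensorFlow Reading Data guide to get my application's data in the form of TFRecords, and I am using a TFRecordReader in my input pipeline to read this data.

I am now reading the guides on using skflow/tf.learn to build a simple regressor, but I can't see how to use my input data with these tools.

In the following code, the application fails on the regressor.fit(..) call with ValueError: setting an array element with a sequence.

Error: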

Traceback (most recent call last):
  File ".../tf.py", line 138, in <module>
    run()
  File ".../tf.py", line 86, in run
    regressor.fit(x, labels)
  File ".../site-packages/tensorflow/contrib/learn/python/learn/estimators/base.py", line 218, in fit
    self.batch_size)
  File ".../site-packages/tensorflow/contrib/learn/python/learn/io/data_feeder.py", line 99, in setup_train_data_feeder
    return data_feeder_cls(X, y, n_classes, batch_size)
  File ".../site-packages/tensorflow/contrib/learn/python/learn/io/data_feeder.py", line 191, in __init__
    self.X = check_array(X, dtype=x_dtype)
  File ".../site-packages/tensorflow/contrib/learn/python/learn/io/data_feeder.py", line 161, in check_array
    array = np.array(array, dtype=dtype, order=None, copy=False)

ValueError: setting an array element with a sequence.

Code:

import tensorflow as tf
import tensorflow.contrib.learn as learn

def inputs():
    with tf.name_scope('input'):
        filename_queue = tf.train.string_input_producer([filename])

        reader = tf.TFRecordReader()
        _, serialized_example = reader.read(filename_queue)

        features = tf.parse_single_example(serialized_example, feature_spec)
        labels = features.pop('actual')
        some_feature = features['some_feature']

        features_batch, labels_batch = tf.train.shuffle_batch(
            [some_feature, labels], batch_size=batch_size, capacity=capacity,
            min_after_dequeue=min_after_dequeue)

        return features_batch, labels_batch


def run():
    with tf.Graph().as_default():
        x, labels = inputs()

        # regressor = learn.TensorFlowDNNRegressor(hidden_units=[10, 20, 10])
        regressor = learn.TensorFlowLinearRegressor()

        regressor.fit(x, labels)
        ...

It looks like the check_array call expects a real array, not a tensor. Is there anything I can do to massage my data into the right shape?
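The last frame of the traceback is a plain np.array(...) call with a numeric dtype, so the same class of error can be reproduced with NumPy alone. The sketch below is only an analogy: it does not run the TensorFlow code path, and the helper name to_float_array is hypothetical. It shows that NumPy raises this ValueError whenever it is asked to pack something that is not a regular numeric array into a numeric dtype.

```python
import numpy as np


def to_float_array(data):
    """Convert data to a float32 array, roughly as a data feeder would.

    Hypothetical helper for illustration; it is not part of TensorFlow.
    """
    return np.asarray(data, dtype=np.float32)


# A regular nested list converts without trouble.
print(to_float_array([[1.0, 2.0], [3.0, 4.0]]).shape)

# An input that NumPy cannot lay out as a rectangular numeric array
# triggers the same kind of ValueError seen in the traceback.
try:
    to_float_array([1.0, [2.0, 3.0]])
except ValueError as err:
    print("ValueError:", err)
```

The fit(x, labels) form therefore expects data that is already in memory as arrays, not symbolic tensors produced by a queue-based pipeline.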

Answer 1

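The API you are using appears to be deprecated. If you use the more modern tf.contrib.learn.LinearRegressor (I think >= 1.0), you are supposed to specify an input_fn, which basically produces the inputs and labels. I think in your example it would be as simple as changing your run function to: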

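def run():
    with tf.Graph().as_default():
        # LinearRegressor requires feature_columns; the key 'some_feature'
        # must match a key in the dict returned by my_input_fn.
        columns = [tf.contrib.layers.real_valued_column('some_feature')]
        regressor = tf.contrib.learn.LinearRegressor(feature_columns=columns)
        regressor.fit(input_fn=my_input_fn)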

You then define an input function called my_input_fn. From the docs, this input function takes the following form:

def my_input_fn():

    # Preprocess your data here...

    # ...then return 1) a mapping of feature columns to Tensors with
    # the corresponding feature data, and 2) a Tensor containing labels
    return feature_cols, labels

I think the docs can get you the rest of the way. It is hard for me to say from here how you should proceed without seeing your data.
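As a rough illustration only, the pipeline from the question could be moved into such an input function more or less unchanged. The sketch below assumes TensorFlow 1.x (where tf.contrib is available). The file name, the feature_spec shapes and dtypes, and the batching parameters are all placeholders, since the real values are not shown in the question.

```python
def my_input_fn():
    """Input function for a tf.contrib.learn estimator (TensorFlow 1.x)."""
    # Imported here so that defining the function does not itself need
    # TensorFlow; the import runs when the estimator calls my_input_fn.
    import tensorflow as tf

    # Placeholder values (assumptions): substitute your own file and spec.
    filename = 'data.tfrecords'
    feature_spec = {
        'some_feature': tf.FixedLenFeature([1], tf.float32),
        'actual': tf.FixedLenFeature([], tf.float32),
    }

    # Same queue-based reader as in the question.
    filename_queue = tf.train.string_input_producer([filename])
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)

    features = tf.parse_single_example(serialized_example, feature_spec)
    labels = features.pop('actual')

    # Batching parameters are placeholders as well.
    features_batch, labels_batch = tf.train.shuffle_batch(
        [features['some_feature'], labels],
        batch_size=32, capacity=2000, min_after_dequeue=1000)

    # A dict keyed by feature name, plus the label tensor.
    return {'some_feature': features_batch}, labels_batch
```

The function returns tensors rather than arrays, so it is passed as regressor.fit(input_fn=my_input_fn) instead of through the x and y arguments. Because string_input_producer cycles through the file indefinitely by default, it is probably worth passing steps=... to fit as well so that training stops.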
