Creating TfRecords from a list of strings and feeding a Graph in tensorflow after decoding

Question

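The goal is to create a TfRecords database. Given: I have 23 folders, each containing 7500 images, and 23 text files, each with 7500 lines describing the features of the 7500 images in the corresponding folder.

I created the database with this code: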

import tensorflow as tf
import numpy as np
from PIL import Image

def _Float_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def create_image_annotation_data():
    # Code to read images and features.
    # images is a list of numpy arrays (one per image), and features_labels is a list of strings
    # where each string holds the whole set of features for one image.
    return images, features_labels

# This is the starting point of the program.
# Now I have the images stored as list of numpy array, and the features as list of strings.
images, annotations = create_image_annotation_data()

tfrecords_filename = "database.tfrecords"
writer = tf.python_io.TFRecordWriter(tfrecords_filename)

for img, ann in zip(images, annotations):

    # Note that the height and width are needed to reconstruct the original image.
    height = img.shape[0]
    width = img.shape[1]

    # This is how data is converted into binary
    img_raw = img.tobytes()  # tostring() is a deprecated alias of tobytes()
    example = tf.train.Example(features=tf.train.Features(feature={
        'height': _int64_feature(height),
        'width': _int64_feature(width),
        'image_raw': _bytes_feature(img_raw),
        'annotation_raw': _bytes_feature(tf.compat.as_bytes(ann))
    }))

    writer.write(example.SerializeToString())

writer.close()

reconstructed_images = []

record_iterator = tf.python_io.tf_record_iterator(path=tfrecords_filename)

for string_record in record_iterator:
    example = tf.train.Example()
    example.ParseFromString(string_record)

    height = int(example.features.feature['height']
                 .int64_list
                 .value[0])

    width = int(example.features.feature['width']
                .int64_list
                .value[0])

    img_string = (example.features.feature['image_raw']
                  .bytes_list
                  .value[0])

    annotation_string = (example.features.feature['annotation_raw']
                         .bytes_list
                         .value[0])

    img_1d = np.frombuffer(img_string, dtype=np.uint8)  # np.fromstring is deprecated for binary data
    reconstructed_img = img_1d.reshape((height, width, -1))
    annotation_reconstructed = annotation_string.decode('utf-8')

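So, after converting the images and text into tfRecords, and being able to read them back and convert the images to numpy arrays and the binary text to Python strings, I tried to go one step further by using a filename_queue with a reader. The purpose is to feed the graph a batch of data rather than one example at a time. In addition, the example queue should be enqueued and dequeued by different threads, which makes training the network faster.

Therefore, I used the following code: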

import tensorflow as tf
import numpy as np
import time

image_file_list = ["database.tfrecords"]
batch_size = 16

# Make a queue of file names including all the JPEG images files in the relative
# image directory.
filename_queue = tf.train.string_input_producer(image_file_list, num_epochs=1, shuffle=False)

reader = tf.TFRecordReader()

# Read a whole file from the queue, the first returned value in the tuple is the
# filename which we are ignoring.
_, serialized_example = reader.read(filename_queue)

features = tf.parse_single_example(
      serialized_example,
      # Defaults are not specified since both keys are required.
      features={
          'height': tf.FixedLenFeature([], tf.int64),
          'width': tf.FixedLenFeature([], tf.int64),
          'image_raw': tf.FixedLenFeature([], tf.string),
          'annotation_raw': tf.FixedLenFeature([], tf.string)
      })

image = tf.decode_raw(features['image_raw'], tf.uint8)
annotation = tf.decode_raw(features['annotation_raw'], tf.float32)

height = tf.cast(features['height'], tf.int32)
width = tf.cast(features['width'], tf.int32)

image = tf.reshape(image, [height, width, 3])

# Note that the minimum after dequeue is needed to make sure that the queue is not empty after dequeuing so that
# we don't run into errors
'''
min_after_dequeue = 100
capacity = min_after_dequeue + 3 * batch_size
ann, images_batch = tf.train.batch([annotation, image],
                                   shapes=[[1], [112, 112, 3]],
                                   batch_size=batch_size,
                                   capacity=capacity,
                                   num_threads=1)
'''

# Start a new session to show example output.
with tf.Session() as sess:
    merged = tf.summary.merge_all()
    train_writer = tf.summary.FileWriter('C:/Users/user/Documents/tensorboard_logs/New_Runs', sess.graph)

    # Required to get the filename matching to run.
    tf.global_variables_initializer().run()

    # Coordinate the loading of image files.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    for steps in range(16):
        t1 = time.time()
        annotation_string, batch, summary = sess.run([annotation, image, merged])
        t2 = time.time()
        print('time to fetch 16 faces:', (t2 - t1))
        print(annotation_string)
        tf.summary.image("image_batch", image)
        train_writer.add_summary(summary, steps)

    # Finish off the filename queue coordinator.
    coord.request_stop()
    coord.join(threads)

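Finally, after running the code above, I got the following error: OutOfRangeError (see above for traceback): FIFOQueue '_0_input_producer' is closed and has insufficient elements (requested 1, current size 0) [[Node: ReaderReadV2 = ReaderReadV2[_device="/job:localhost/replica:0/task:0/cpu:0"](TFRecordReaderV2, input_producer)]]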

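Other questions:

How can I decode the binary database (tfrecords) to retrieve the stored features "as python string data structure"?
How can I use tf.train.batch to create a batch of examples to feed the network?

Thank you!! Any help is much appreciated.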

Answer 1

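To solve this problem, the coordinator and the queue runners both have to be started inside a Session. In addition, because the number of epochs is controlled internally, its counter is not a global variable but a local variable. That local variable therefore has to be initialized before the queue runners are told to start enqueuing file names into the queue. The code in the question only runs tf.global_variables_initializer(), so the epoch counter stays uninitialized, the filename queue is closed without ever being filled, and the reader fails with OutOfRangeError. Here is the corrected code: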

# string_input_producer expects a list (or 1-D string tensor) of file names.
filename_queue = tf.train.string_input_producer([tfrecords_filename], num_epochs=num_epoch, shuffle=False, name='queue')
reader = tf.TFRecordReader()

key, serialized_example = reader.read(filename_queue)
features = tf.parse_single_example(
    serialized_example,
    # Defaults are not specified since both keys are required.
    features={
        'height': tf.FixedLenFeature([], tf.int64),
        'width': tf.FixedLenFeature([], tf.int64),
        'image_raw': tf.FixedLenFeature([], tf.string),
        'annotation_raw': tf.FixedLenFeature([], tf.string)
    })
...
init_op = tf.group(tf.local_variables_initializer(),
                   tf.global_variables_initializer())
with tf.Session() as sess:
    sess.run(init_op)

    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

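It should work now.

Now, to collect a batch of images before feeding them into the network, we can use either tf.train.shuffle_batch or tf.train.batch. Both work, and the difference is simple: one shuffles the examples and the other does not. Note, however, that using tf.train.batch with more than one thread may still change the order of the samples, because the threads race with each other when enqueuing. In any case, the following code should be inserted directly after the queue and reader are set up: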

min_after_dequeue = 100
num_threads = 1
capacity = min_after_dequeue + num_threads * batch_size
label_batch, images_batch = tf.train.batch([annotation, image],
                                           shapes=[[], [112, 112, 3]],
                                           batch_size=batch_size,
                                           capacity=capacity,
                                           num_threads=num_threads)

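Note that the shapes of your tensors may be different. In this case the reader happens to decode color images of size [112, 112, 3], and each annotation has shape [], meaning a single scalar string per image (there is no particular reason for that; it is just this specific case).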
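To make the `shapes` argument more concrete, here is a NumPy-only analogy (not TensorFlow's implementation): batching amounts to stacking the per-example tensors along a new leading axis, which is only possible when every example has the same shape. The `make_batch` helper and the sample data are hypothetical and exist only for this illustration.

```python
import numpy as np

batch_size = 16


def make_batch(examples):
    """Stack equally shaped examples along a new leading batch axis."""
    return np.stack(examples, axis=0)


# Hypothetical stand-ins for decoded examples: 112x112 RGB images and
# one annotation string per image (a scalar, i.e. shape []).
images = [np.zeros((112, 112, 3), dtype=np.uint8) for _ in range(batch_size)]
annotations = [np.array("0.1, 0.2, 0.3, 0.9") for _ in range(batch_size)]

images_batch = make_batch(images)
label_batch = make_batch(annotations)

print(images_batch.shape)  # (16, 112, 112, 3)
print(label_batch.shape)   # (16,)
```

Stacking images of different sizes raises a ValueError here, which mirrors why tf.train.batch needs fully defined per-example shapes: every image has to be brought to one common size before batching.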

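Finally, we can treat the tf.string data type as a string. In fact, after evaluating the annotation tensor, we find that it is handled as a binary string (this is how TensorFlow actually represents it). This presumably means taking features['annotation_raw'] directly as the annotation, rather than passing it through tf.decode_raw as in the question. In my case, that string is just the set of features associated with that particular image. So, to extract specific features, here is an example: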

# The output of string_split is not a dense tensor but a SparseTensor; its `values` property holds the actual strings as a tensor.
label_batch_splitted = tf.string_split(label_batch, delimiter=', ')
label_batch_values = tf.reshape(label_batch_splitted.values, [batch_size, -1])
# string_to_number converts the feature strings into float32, as I need them.
label_batch_numbers = tf.string_to_number(label_batch_values, out_type=tf.float32)
# tf.slice extracts the feature I am looking for (column 3 of every row).
confidences = tf.slice(label_batch_numbers, begin=[0, 3], size=[-1, 1])
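As a rough cross-check of what the ops above compute, the same split, convert and slice steps can be mimicked in plain Python on the values a session would return. When a tf.string tensor is fetched, each element arrives as a Python `bytes` object, so `decode('utf-8')` turns it into a regular string. The sample annotations and the `parse_annotation` helper are invented for illustration; the real feature layout is not given in the question.

```python
def parse_annotation(raw):
    """Decode one fetched annotation (bytes) and parse its numeric features."""
    text = raw.decode('utf-8')  # bytes -> Python str
    # Split on commas, skip empty tokens; float() ignores surrounding spaces.
    return [float(tok) for tok in text.split(',') if tok.strip()]


# Hypothetical fetched batch: one bytes object per image.
fetched = [b'12, 40, 88, 0.97', b'15, 42, 90, 0.63']

parsed = [parse_annotation(raw) for raw in fetched]
# Counterpart of tf.slice(..., begin=[0, 3], size=[-1, 1]): keep column 3 only.
confidences = [[row[3]] for row in parsed]

print(parsed[0])    # [12.0, 40.0, 88.0, 0.97]
print(confidences)  # [[0.97], [0.63]]
```

Like the tf.reshape call above, this only yields a rectangular result when every annotation contains the same number of features.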

I hope this answer helps.

Tags: python, binary, batch-processing, string-decoding, tensorflow