如何将 jpeg 图像目录转换为 tensorflow 中的 TFRecords 文件?
How do I convert a directory of jpeg images to TFRecords file in tensorflow?
我有一个训练数据,它是一个 jpeg 图像目录和一个包含文件名和相关类别标签的相应文本文件。我正在尝试将此训练数据转换为 tfrecords 文件,如 tensorflow 文档中所述。我花了很多时间试图让它工作,但在 tensorflow 中没有示例演示如何使用任何阅读器读取 jpeg 文件并使用 tfrecordwriter
将它们添加到 tfrecord
希望对您有所帮助:
filename_queue = tf.train.string_input_producer(['/Users/HANEL/Desktop/tf.png']) # list of files to read
reader = tf.WholeFileReader()
key, value = reader.read(filename_queue)
my_img = tf.image.decode_png(value) # use decode_png or decode_jpeg decoder based on your files.
init_op = tf.initialize_all_variables()
with tf.Session() as sess:
sess.run(init_op)
# Start populating the filename queue.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord)
for i in range(1): #length of your filename list
image = my_img.eval() #here is your image Tensor :)
print(image.shape)
Image.show(Image.fromarray(np.asarray(image)))
coord.request_stop()
coord.join(threads)
要将所有图像作为张量数组获取,请使用以下代码示例。
更新:
在之前的回答中我只是说了如何读取TF格式的图片,但没有将其保存在TFRecords中。为此你应该使用:
def _int64_feature(value):
return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
def _bytes_feature(value):
return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
# images and labels array as input
def convert_to(images, labels, name):
num_examples = labels.shape[0]
if images.shape[0] != num_examples:
raise ValueError("Images size %d does not match label size %d." %
(images.shape[0], num_examples))
rows = images.shape[1]
cols = images.shape[2]
depth = images.shape[3]
filename = os.path.join(FLAGS.directory, name + '.tfrecords')
print('Writing', filename)
writer = tf.python_io.TFRecordWriter(filename)
for index in range(num_examples):
image_raw = images[index].tostring()
example = tf.train.Example(features=tf.train.Features(feature={
'height': _int64_feature(rows),
'width': _int64_feature(cols),
'depth': _int64_feature(depth),
'label': _int64_feature(int(labels[index])),
'image_raw': _bytes_feature(image_raw)}))
writer.write(example.SerializeToString())
更多信息here
然后你这样读取数据:
# Remember to generate a file name queue of you 'train.TFRecord' file path
def read_and_decode(filename_queue):
reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)
features = tf.parse_single_example(
serialized_example,
dense_keys=['image_raw', 'label'],
# Defaults are not specified since both keys are required.
dense_types=[tf.string, tf.int64])
# Convert from a scalar string tensor (whose single string has
image = tf.decode_raw(features['image_raw'], tf.uint8)
image = tf.reshape(image, [my_cifar.n_input])
image.set_shape([my_cifar.n_input])
# OPTIONAL: Could reshape into a 28x28 image and apply distortions
# here. Since we are not applying any distortions in this
# example, and the next step expects the image to be flattened
# into a vector, we don't bother.
# Convert from [0, 255] -> [-0.5, 0.5] floats.
image = tf.cast(image, tf.float32)
image = tf.cast(image, tf.float32) * (1. / 255) - 0.5
# Convert label from a scalar uint8 tensor to an int32 scalar.
label = tf.cast(features['label'], tf.int32)
return image, label
Tensorflow 的初始模型有一个文件build_image_data.py,假设每个子目录代表一个标签,它可以完成同样的事情。
我也有同样的问题
这就是我如何获取我自己的 jpeg 文件的 tfrecords 文件
编辑: 添加 sol 1 - 更好更快的方法
更新:Jan/5/2020
(推荐)方案一:TFRecordWriter
看到这个Tfrecords Guidepost
方案二:
来自tensorflow官方github:How to Construct a New Dataset for Retraining, use official python script build_image_data.py directly and bazel是一个更好的主意。
这是说明:
To run build_image_data.py
, you can run the following command line:
# location to where to save the TFRecord data.
OUTPUT_DIRECTORY=$HOME/my-custom-data/
# build the preprocessing script.
bazel build inception/build_image_data
# convert the data.
bazel-bin/inception/build_image_data \
--train_directory="${TRAIN_DIR}" \
--validation_directory="${VALIDATION_DIR}" \
--output_directory="${OUTPUT_DIRECTORY}" \
--labels_file="${LABELS_FILE}" \
--train_shards=128 \
--validation_shards=24 \
--num_threads=8
where the $OUTPUT_DIRECTORY
is the location of the sharded
TFRecords
. The $LABELS_FILE
will be a text file that is read by
the script that provides a list of all of the labels.
那么,它应该可以解决问题。
ps。 Google制作的bazel,把代码转成makefile.
解决方案 3:
首先,我参考了@capitalistpug 的说明并检查了shell 脚本文件
(shell 由 Google 提供的脚本文件: download_and_preprocess_flowers.sh)
其次,我还找到了NVIDIA的一个mini inception-v3训练教程
(NVIDIA官方SPEED UP TRAINING WITH GPU-ACCELERATED TENSORFLOW)
注意,下面的步骤ps需要在Bazel WORKSAPCE环境下执行
所以 Bazel 构建文件可以 运行 成功
第一步,我注释掉我已经下载的imagenet数据集的部分
以及我不需要的其余部分 download_and_preprocess_flowers.sh
第二步,切换目录到tensorflow/models/inception
这里是Bazel环境,之前是Bazel构建的
$ cd tensorflow/models/inception
可选:如果之前没有构建,在cmd中输入以下代码
$ bazel build inception/download_and_preprocess_flowers
下图中的内容你需要自己猜
最后一步,输入以下代码:
$ bazel-bin/inception/download_and_preprocess_flowers $Your/own/image/data/path
然后,它将开始调用 build_image_data.py 并创建 tfrecords 文件
请注意,图像将作为未压缩的张量保存在 TFRecord 中,可能会将大小增加大约 5 倍。这会浪费存储空间 space,并且由于需要处理的数据量,速度可能会相当慢需要阅读。
最好只将文件名保存在 TFRecord 中,然后按需读取文件。新的 Dataset
API 效果很好, the documentation 有这个例子:
# Reads an image from a file, decodes it into a dense tensor, and resizes it
# to a fixed shape.
def _parse_function(filename, label):
image_string = tf.read_file(filename)
image_decoded = tf.image.decode_jpeg(image_string)
image_resized = tf.image.resize_images(image_decoded, [28, 28])
return image_resized, label
# A vector of filenames.
filenames = tf.constant(["/var/data/image1.jpg", "/var/data/image2.jpg", ...])
# `labels[i]` is the label for the image in `filenames[i].
labels = tf.constant([0, 37, ...])
dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
dataset = dataset.map(_parse_function)
在Kamil指定的Link中提及代码,这样即使Link被破坏,代码也可用。
"""Converts image data to TFRecords file format with Example protos.
If your data set involves bounding boxes, please look at build_imagenet_data.py.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from datetime import datetime
import os
import random
import sys
import threading
import numpy as np
import tensorflow as tf
tf.app.flags.DEFINE_string('train_directory', '/tmp/',
'Training data directory')
tf.app.flags.DEFINE_string('validation_directory', '/tmp/',
'Validation data directory')
tf.app.flags.DEFINE_string('output_directory', '/tmp/',
'Output data directory')
tf.app.flags.DEFINE_integer('train_shards', 2,
'Number of shards in training TFRecord files.')
tf.app.flags.DEFINE_integer('validation_shards', 2,
'Number of shards in validation TFRecord files.')
tf.app.flags.DEFINE_integer('num_threads', 2,
'Number of threads to preprocess the images.')
# The labels file contains a list of valid labels are held in this file.
# Assumes that the file contains entries as such:
# dog
# cat
# flower
# where each line corresponds to a label. We map each label contained in
# the file to an integer corresponding to the line number starting from 0.
tf.app.flags.DEFINE_string('labels_file', '', 'Labels file')
FLAGS = tf.app.flags.FLAGS
def _int64_feature(value):
"""Wrapper for inserting int64 features into Example proto."""
if not isinstance(value, list):
value = [value]
return tf.train.Feature(int64_list=tf.train.Int64List(value=value))
def _bytes_feature(value):
"""Wrapper for inserting bytes features into Example proto."""
return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
def _convert_to_example(filename, image_buffer, label, text, height, width):
"""Build an Example proto for an example.
Args:
filename: string, path to an image file, e.g., '/path/to/example.JPG'
image_buffer: string, JPEG encoding of RGB image
label: integer, identifier for the ground truth for the network
text: string, unique human-readable, e.g. 'dog'
height: integer, image height in pixels
width: integer, image width in pixels
Returns:
Example proto
"""
colorspace = 'RGB'
channels = 3
image_format = 'JPEG'
example = tf.train.Example(features=tf.train.Features(feature={
'image/height': _int64_feature(height),
'image/width': _int64_feature(width),
'image/colorspace': _bytes_feature(tf.compat.as_bytes(colorspace)),
'image/channels': _int64_feature(channels),
'image/class/label': _int64_feature(label),
'image/class/text': _bytes_feature(tf.compat.as_bytes(text)),
'image/format': _bytes_feature(tf.compat.as_bytes(image_format)),
'image/filename': _bytes_feature(tf.compat.as_bytes(os.path.basename(filename))),
'image/encoded': _bytes_feature(tf.compat.as_bytes(image_buffer))}))
return example
class ImageCoder(object):
"""Helper class that provides TensorFlow image coding utilities."""
def __init__(self):
# Create a single Session to run all image coding calls.
self._sess = tf.Session()
# Initializes function that converts PNG to JPEG data.
self._png_data = tf.placeholder(dtype=tf.string)
image = tf.image.decode_png(self._png_data, channels=3)
self._png_to_jpeg = tf.image.encode_jpeg(image, format='rgb', quality=100)
# Initializes function that decodes RGB JPEG data.
self._decode_jpeg_data = tf.placeholder(dtype=tf.string)
self._decode_jpeg = tf.image.decode_jpeg(self._decode_jpeg_data, channels=3)
def png_to_jpeg(self, image_data):
return self._sess.run(self._png_to_jpeg,
feed_dict={self._png_data: image_data})
def decode_jpeg(self, image_data):
image = self._sess.run(self._decode_jpeg,
feed_dict={self._decode_jpeg_data: image_data})
assert len(image.shape) == 3
assert image.shape[2] == 3
return image
def _is_png(filename):
"""Determine if a file contains a PNG format image.
Args:
filename: string, path of the image file.
Returns:
boolean indicating if the image is a PNG.
"""
return '.png' in filename
def _process_image(filename, coder):
"""Process a single image file.
Args:
filename: string, path to an image file e.g., '/path/to/example.JPG'.
coder: instance of ImageCoder to provide TensorFlow image coding utils.
Returns:
image_buffer: string, JPEG encoding of RGB image.
height: integer, image height in pixels.
width: integer, image width in pixels.
"""
# Read the image file.
with tf.gfile.FastGFile(filename, 'rb') as f:
image_data = f.read()
# Convert any PNG to JPEG's for consistency.
if _is_png(filename):
print('Converting PNG to JPEG for %s' % filename)
image_data = coder.png_to_jpeg(image_data)
# Decode the RGB JPEG.
image = coder.decode_jpeg(image_data)
# Check that image converted to RGB
assert len(image.shape) == 3
height = image.shape[0]
width = image.shape[1]
assert image.shape[2] == 3
return image_data, height, width
def _process_image_files_batch(coder, thread_index, ranges, name, filenames,
texts, labels, num_shards):
"""Processes and saves list of images as TFRecord in 1 thread.
Args:
coder: instance of ImageCoder to provide TensorFlow image coding utils.
thread_index: integer, unique batch to run index is within [0, len(ranges)).
ranges: list of pairs of integers specifying ranges of each batches to
analyze in parallel.
name: string, unique identifier specifying the data set
filenames: list of strings; each string is a path to an image file
texts: list of strings; each string is human readable, e.g. 'dog'
labels: list of integer; each integer identifies the ground truth
num_shards: integer number of shards for this data set.
"""
# Each thread produces N shards where N = int(num_shards / num_threads).
# For instance, if num_shards = 128, and the num_threads = 2, then the first
# thread would produce shards [0, 64).
num_threads = len(ranges)
assert not num_shards % num_threads
num_shards_per_batch = int(num_shards / num_threads)
shard_ranges = np.linspace(ranges[thread_index][0],
ranges[thread_index][1],
num_shards_per_batch + 1).astype(int)
num_files_in_thread = ranges[thread_index][1] - ranges[thread_index][0]
counter = 0
for s in range(num_shards_per_batch):
# Generate a sharded version of the file name, e.g. 'train-00002-of-00010'
shard = thread_index * num_shards_per_batch + s
output_filename = '%s-%.5d-of-%.5d' % (name, shard, num_shards)
output_file = os.path.join(FLAGS.output_directory, output_filename)
writer = tf.python_io.TFRecordWriter(output_file)
shard_counter = 0
files_in_shard = np.arange(shard_ranges[s], shard_ranges[s + 1], dtype=int)
for i in files_in_shard:
filename = filenames[i]
label = labels[i]
text = texts[i]
try:
image_buffer, height, width = _process_image(filename, coder)
except Exception as e:
print(e)
print('SKIPPED: Unexpected eror while decoding %s.' % filename)
continue
example = _convert_to_example(filename, image_buffer, label,
text, height, width)
writer.write(example.SerializeToString())
shard_counter += 1
counter += 1
if not counter % 1000:
print('%s [thread %d]: Processed %d of %d images in thread batch.' %
(datetime.now(), thread_index, counter, num_files_in_thread))
sys.stdout.flush()
writer.close()
print('%s [thread %d]: Wrote %d images to %s' %
(datetime.now(), thread_index, shard_counter, output_file))
sys.stdout.flush()
shard_counter = 0
print('%s [thread %d]: Wrote %d images to %d shards.' %
(datetime.now(), thread_index, counter, num_files_in_thread))
sys.stdout.flush()
def _process_image_files(name, filenames, texts, labels, num_shards):
"""Process and save list of images as TFRecord of Example protos.
Args:
name: string, unique identifier specifying the data set
filenames: list of strings; each string is a path to an image file
texts: list of strings; each string is human readable, e.g. 'dog'
labels: list of integer; each integer identifies the ground truth
num_shards: integer number of shards for this data set.
"""
assert len(filenames) == len(texts)
assert len(filenames) == len(labels)
# Break all images into batches with a [ranges[i][0], ranges[i][1]].
spacing = np.linspace(0, len(filenames), FLAGS.num_threads + 1).astype(np.int)
ranges = []
for i in range(len(spacing) - 1):
ranges.append([spacing[i], spacing[i + 1]])
# Launch a thread for each batch.
print('Launching %d threads for spacings: %s' % (FLAGS.num_threads, ranges))
sys.stdout.flush()
# Create a mechanism for monitoring when all threads are finished.
coord = tf.train.Coordinator()
# Create a generic TensorFlow-based utility for converting all image codings.
coder = ImageCoder()
threads = []
for thread_index in range(len(ranges)):
args = (coder, thread_index, ranges, name, filenames,
texts, labels, num_shards)
t = threading.Thread(target=_process_image_files_batch, args=args)
t.start()
threads.append(t)
# Wait for all the threads to terminate.
coord.join(threads)
print('%s: Finished writing all %d images in data set.' %
(datetime.now(), len(filenames)))
sys.stdout.flush()
def _find_image_files(data_dir, labels_file):
"""Build a list of all images files and labels in the data set.
Args:
data_dir: string, path to the root directory of images.
Assumes that the image data set resides in JPEG files located in
the following directory structure.
data_dir/dog/another-image.JPEG
data_dir/dog/my-image.jpg
where 'dog' is the label associated with these images.
labels_file: string, path to the labels file.
The list of valid labels are held in this file. Assumes that the file
contains entries as such:
dog
cat
flower
where each line corresponds to a label. We map each label contained in
the file to an integer starting with the integer 0 corresponding to the
label contained in the first line.
Returns:
filenames: list of strings; each string is a path to an image file.
texts: list of strings; each string is the class, e.g. 'dog'
labels: list of integer; each integer identifies the ground truth.
"""
print('Determining list of input files and labels from %s.' % data_dir)
unique_labels = [l.strip() for l in tf.gfile.FastGFile(
labels_file, 'r').readlines()]
labels = []
filenames = []
texts = []
# Leave label index 0 empty as a background class.
label_index = 1
# Construct the list of JPEG files and labels.
for text in unique_labels:
jpeg_file_path = '%s/%s/*' % (data_dir, text)
matching_files = tf.gfile.Glob(jpeg_file_path)
labels.extend([label_index] * len(matching_files))
texts.extend([text] * len(matching_files))
filenames.extend(matching_files)
if not label_index % 100:
print('Finished finding files in %d of %d classes.' % (
label_index, len(labels)))
label_index += 1
# Shuffle the ordering of all image files in order to guarantee
# random ordering of the images with respect to label in the
# saved TFRecord files. Make the randomization repeatable.
shuffled_index = list(range(len(filenames)))
random.seed(12345)
random.shuffle(shuffled_index)
filenames = [filenames[i] for i in shuffled_index]
texts = [texts[i] for i in shuffled_index]
labels = [labels[i] for i in shuffled_index]
print('Found %d JPEG files across %d labels inside %s.' %
(len(filenames), len(unique_labels), data_dir))
return filenames, texts, labels
def _process_dataset(name, directory, num_shards, labels_file):
"""Process a complete data set and save it as a TFRecord.
Args:
name: string, unique identifier specifying the data set.
directory: string, root path to the data set.
num_shards: integer number of shards for this data set.
labels_file: string, path to the labels file.
"""
filenames, texts, labels = _find_image_files(directory, labels_file)
_process_image_files(name, filenames, texts, labels, num_shards)
def main(unused_argv):
assert not FLAGS.train_shards % FLAGS.num_threads, (
'Please make the FLAGS.num_threads commensurate with FLAGS.train_shards')
assert not FLAGS.validation_shards % FLAGS.num_threads, (
'Please make the FLAGS.num_threads commensurate with '
'FLAGS.validation_shards')
print('Saving results to %s' % FLAGS.output_directory)
# Run it!
_process_dataset('validation', FLAGS.validation_directory,
FLAGS.validation_shards, FLAGS.labels_file)
_process_dataset('train', FLAGS.train_directory,
FLAGS.train_shards, FLAGS.labels_file)
if __name__ == '__main__':
tf.app.run()
如果您使用直接读取字节的 tfrecord 文件太大。
这个link显示了它。
你用这个函数直接读取字节
img_bytes = open(path,'rb').read()
引用
您可以在此处使用 Kubeflow 管道进行转换:
https://aihub.cloud.google.com/u/0/p/products%2Fded3e5e5-d2e8-4d65-9b9f-5ffaa9a27ea1
点击下载link(创建一个Kubeflow集群到运行管道)
试试这个脚本:
(与VOC分割数据集一起使用:http://host.robots.ox.ac.uk/pascal/VOC/voc2012/)
import numpy as np
import tensorflow as tf
import scipy.io # to read .mat files
from PIL import Image # to read image files
def get_image(path):
jpg = Image.open(path).convert('RGB')
return np.array(jpg)
def get_label_png(path):
png = Image.open(path) # image is saved as palettised png.
arr = np.array(png)
return arr[..., None]
def get_example(image, label):
feature = {
'height': tf.train.Feature(int64_list=tf.train.Int64List(value=[image.shape[0]])),
'width': tf.train.Feature(int64_list=tf.train.Int64List(value=[image.shape[1]])),
'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image.tobytes()])),
'label': tf.train.Feature(bytes_list=tf.train.BytesList(value=[label.tobytes()]))
}
return tf.train.Example(features=tf.train.Features(feature=feature))
## Paths ======================================
images_folder = 'data/images/' #images folder
labels_folder = 'data/labels/' #label folder
train_file = 'data/train.txt'
val_file = 'data/val.txt'
TRAIN = 'data/train.tfrecords'
VAL = 'data/val.tfrecords'
## write train dataset
with tf.io.TFRecordWriter(TRAIN) as writer:
with open(train_file) as file:
filenames = [s.rstrip('\n') for s in file.readlines()]
for name in filenames:
image = utils.get_image(images_folder+name+'.jpg')
label = utils.get_label_png(labels_folder+name+'.png')
writer.write(utils.get_example(image, label).SerializeToString())
## write validation dataset
with tf.io.TFRecordWriter(VAL) as writer:
with open(val_file) as file:
filenames = [s.rstrip('\n') for s in file.readlines()]
for name in filenames:
image = utils.get_image(images_folder+name+'.jpg')
label = utils.get_label_png(labels_folder+name+'.png')
writer.write(utils.get_example(image, label).SerializeToString())
我有一个训练数据,它是一个 jpeg 图像目录和一个包含文件名和相关类别标签的相应文本文件。我正在尝试将此训练数据转换为 tfrecords 文件,如 tensorflow 文档中所述。我花了很多时间试图让它工作,但在 tensorflow 中没有示例演示如何使用任何阅读器读取 jpeg 文件并使用 tfrecordwriter
将它们添加到 tfrecord希望对您有所帮助:
filename_queue = tf.train.string_input_producer(['/Users/HANEL/Desktop/tf.png']) # list of files to read
reader = tf.WholeFileReader()
key, value = reader.read(filename_queue)
my_img = tf.image.decode_png(value) # use decode_png or decode_jpeg decoder based on your files.
init_op = tf.initialize_all_variables()
with tf.Session() as sess:
sess.run(init_op)
# Start populating the filename queue.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord)
for i in range(1): #length of your filename list
image = my_img.eval() #here is your image Tensor :)
print(image.shape)
Image.show(Image.fromarray(np.asarray(image)))
coord.request_stop()
coord.join(threads)
要将所有图像作为张量数组获取,请使用以下代码示例。
更新:
在之前的回答中我只是说了如何读取TF格式的图片,但没有将其保存在TFRecords中。为此你应该使用:
def _int64_feature(value):
return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
def _bytes_feature(value):
return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
# images and labels array as input
def convert_to(images, labels, name):
num_examples = labels.shape[0]
if images.shape[0] != num_examples:
raise ValueError("Images size %d does not match label size %d." %
(images.shape[0], num_examples))
rows = images.shape[1]
cols = images.shape[2]
depth = images.shape[3]
filename = os.path.join(FLAGS.directory, name + '.tfrecords')
print('Writing', filename)
writer = tf.python_io.TFRecordWriter(filename)
for index in range(num_examples):
image_raw = images[index].tostring()
example = tf.train.Example(features=tf.train.Features(feature={
'height': _int64_feature(rows),
'width': _int64_feature(cols),
'depth': _int64_feature(depth),
'label': _int64_feature(int(labels[index])),
'image_raw': _bytes_feature(image_raw)}))
writer.write(example.SerializeToString())
更多信息here
然后你这样读取数据:
# Remember to generate a file name queue of you 'train.TFRecord' file path
def read_and_decode(filename_queue):
reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)
features = tf.parse_single_example(
serialized_example,
dense_keys=['image_raw', 'label'],
# Defaults are not specified since both keys are required.
dense_types=[tf.string, tf.int64])
# Convert from a scalar string tensor (whose single string has
image = tf.decode_raw(features['image_raw'], tf.uint8)
image = tf.reshape(image, [my_cifar.n_input])
image.set_shape([my_cifar.n_input])
# OPTIONAL: Could reshape into a 28x28 image and apply distortions
# here. Since we are not applying any distortions in this
# example, and the next step expects the image to be flattened
# into a vector, we don't bother.
# Convert from [0, 255] -> [-0.5, 0.5] floats.
image = tf.cast(image, tf.float32)
image = tf.cast(image, tf.float32) * (1. / 255) - 0.5
# Convert label from a scalar uint8 tensor to an int32 scalar.
label = tf.cast(features['label'], tf.int32)
return image, label
Tensorflow 的初始模型有一个文件build_image_data.py,假设每个子目录代表一个标签,它可以完成同样的事情。
我也有同样的问题
这就是我如何获取我自己的 jpeg 文件的 tfrecords 文件
编辑: 添加 sol 1 - 更好更快的方法 更新:Jan/5/2020
(推荐)方案一:TFRecordWriter
看到这个Tfrecords Guidepost
方案二:
来自tensorflow官方github:How to Construct a New Dataset for Retraining, use official python script build_image_data.py directly and bazel是一个更好的主意。
这是说明:
To run
build_image_data.py
, you can run the following command line:# location to where to save the TFRecord data. OUTPUT_DIRECTORY=$HOME/my-custom-data/ # build the preprocessing script. bazel build inception/build_image_data # convert the data. bazel-bin/inception/build_image_data \ --train_directory="${TRAIN_DIR}" \ --validation_directory="${VALIDATION_DIR}" \ --output_directory="${OUTPUT_DIRECTORY}" \ --labels_file="${LABELS_FILE}" \ --train_shards=128 \ --validation_shards=24 \ --num_threads=8
where the
$OUTPUT_DIRECTORY
is the location of the shardedTFRecords
. The$LABELS_FILE
will be a text file that is read by the script that provides a list of all of the labels.
那么,它应该可以解决问题。
ps。 Google制作的bazel,把代码转成makefile.
解决方案 3:
首先,我参考了@capitalistpug 的说明并检查了shell 脚本文件
(shell 由 Google 提供的脚本文件: download_and_preprocess_flowers.sh)
其次,我还找到了NVIDIA的一个mini inception-v3训练教程
(NVIDIA官方SPEED UP TRAINING WITH GPU-ACCELERATED TENSORFLOW)
注意,下面的步骤ps需要在Bazel WORKSAPCE环境下执行
所以 Bazel 构建文件可以 运行 成功
第一步,我注释掉我已经下载的imagenet数据集的部分
以及我不需要的其余部分 download_and_preprocess_flowers.sh
第二步,切换目录到tensorflow/models/inception
这里是Bazel环境,之前是Bazel构建的
$ cd tensorflow/models/inception
可选:如果之前没有构建,在cmd中输入以下代码
$ bazel build inception/download_and_preprocess_flowers
下图中的内容你需要自己猜
最后一步,输入以下代码:
$ bazel-bin/inception/download_and_preprocess_flowers $Your/own/image/data/path
然后,它将开始调用 build_image_data.py 并创建 tfrecords 文件
请注意,图像将作为未压缩的张量保存在 TFRecord 中,可能会将大小增加大约 5 倍。这会浪费存储空间 space,并且由于需要处理的数据量,速度可能会相当慢需要阅读。
最好只将文件名保存在 TFRecord 中,然后按需读取文件。新的 Dataset
API 效果很好, the documentation 有这个例子:
# Reads an image from a file, decodes it into a dense tensor, and resizes it
# to a fixed shape.
def _parse_function(filename, label):
image_string = tf.read_file(filename)
image_decoded = tf.image.decode_jpeg(image_string)
image_resized = tf.image.resize_images(image_decoded, [28, 28])
return image_resized, label
# A vector of filenames.
filenames = tf.constant(["/var/data/image1.jpg", "/var/data/image2.jpg", ...])
# `labels[i]` is the label for the image in `filenames[i].
labels = tf.constant([0, 37, ...])
dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
dataset = dataset.map(_parse_function)
在Kamil指定的Link中提及代码,这样即使Link被破坏,代码也可用。
"""Converts image data to TFRecords file format with Example protos.
If your data set involves bounding boxes, please look at build_imagenet_data.py.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from datetime import datetime
import os
import random
import sys
import threading
import numpy as np
import tensorflow as tf
tf.app.flags.DEFINE_string('train_directory', '/tmp/',
'Training data directory')
tf.app.flags.DEFINE_string('validation_directory', '/tmp/',
'Validation data directory')
tf.app.flags.DEFINE_string('output_directory', '/tmp/',
'Output data directory')
tf.app.flags.DEFINE_integer('train_shards', 2,
'Number of shards in training TFRecord files.')
tf.app.flags.DEFINE_integer('validation_shards', 2,
'Number of shards in validation TFRecord files.')
tf.app.flags.DEFINE_integer('num_threads', 2,
'Number of threads to preprocess the images.')
# The labels file contains a list of valid labels are held in this file.
# Assumes that the file contains entries as such:
# dog
# cat
# flower
# where each line corresponds to a label. We map each label contained in
# the file to an integer corresponding to the line number starting from 0.
tf.app.flags.DEFINE_string('labels_file', '', 'Labels file')
FLAGS = tf.app.flags.FLAGS
def _int64_feature(value):
"""Wrapper for inserting int64 features into Example proto."""
if not isinstance(value, list):
value = [value]
return tf.train.Feature(int64_list=tf.train.Int64List(value=value))
def _bytes_feature(value):
"""Wrapper for inserting bytes features into Example proto."""
return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
def _convert_to_example(filename, image_buffer, label, text, height, width):
"""Build an Example proto for an example.
Args:
filename: string, path to an image file, e.g., '/path/to/example.JPG'
image_buffer: string, JPEG encoding of RGB image
label: integer, identifier for the ground truth for the network
text: string, unique human-readable, e.g. 'dog'
height: integer, image height in pixels
width: integer, image width in pixels
Returns:
Example proto
"""
colorspace = 'RGB'
channels = 3
image_format = 'JPEG'
example = tf.train.Example(features=tf.train.Features(feature={
'image/height': _int64_feature(height),
'image/width': _int64_feature(width),
'image/colorspace': _bytes_feature(tf.compat.as_bytes(colorspace)),
'image/channels': _int64_feature(channels),
'image/class/label': _int64_feature(label),
'image/class/text': _bytes_feature(tf.compat.as_bytes(text)),
'image/format': _bytes_feature(tf.compat.as_bytes(image_format)),
'image/filename': _bytes_feature(tf.compat.as_bytes(os.path.basename(filename))),
'image/encoded': _bytes_feature(tf.compat.as_bytes(image_buffer))}))
return example
class ImageCoder(object):
"""Helper class that provides TensorFlow image coding utilities."""
def __init__(self):
# Create a single Session to run all image coding calls.
self._sess = tf.Session()
# Initializes function that converts PNG to JPEG data.
self._png_data = tf.placeholder(dtype=tf.string)
image = tf.image.decode_png(self._png_data, channels=3)
self._png_to_jpeg = tf.image.encode_jpeg(image, format='rgb', quality=100)
# Initializes function that decodes RGB JPEG data.
self._decode_jpeg_data = tf.placeholder(dtype=tf.string)
self._decode_jpeg = tf.image.decode_jpeg(self._decode_jpeg_data, channels=3)
def png_to_jpeg(self, image_data):
return self._sess.run(self._png_to_jpeg,
feed_dict={self._png_data: image_data})
def decode_jpeg(self, image_data):
image = self._sess.run(self._decode_jpeg,
feed_dict={self._decode_jpeg_data: image_data})
assert len(image.shape) == 3
assert image.shape[2] == 3
return image
def _is_png(filename):
"""Determine if a file contains a PNG format image.
Args:
filename: string, path of the image file.
Returns:
boolean indicating if the image is a PNG.
"""
return '.png' in filename
def _process_image(filename, coder):
"""Process a single image file.
Args:
filename: string, path to an image file e.g., '/path/to/example.JPG'.
coder: instance of ImageCoder to provide TensorFlow image coding utils.
Returns:
image_buffer: string, JPEG encoding of RGB image.
height: integer, image height in pixels.
width: integer, image width in pixels.
"""
# Read the image file.
with tf.gfile.FastGFile(filename, 'rb') as f:
image_data = f.read()
# Convert any PNG to JPEG's for consistency.
if _is_png(filename):
print('Converting PNG to JPEG for %s' % filename)
image_data = coder.png_to_jpeg(image_data)
# Decode the RGB JPEG.
image = coder.decode_jpeg(image_data)
# Check that image converted to RGB
assert len(image.shape) == 3
height = image.shape[0]
width = image.shape[1]
assert image.shape[2] == 3
return image_data, height, width
def _process_image_files_batch(coder, thread_index, ranges, name, filenames,
texts, labels, num_shards):
"""Processes and saves list of images as TFRecord in 1 thread.
Args:
coder: instance of ImageCoder to provide TensorFlow image coding utils.
thread_index: integer, unique batch to run index is within [0, len(ranges)).
ranges: list of pairs of integers specifying ranges of each batches to
analyze in parallel.
name: string, unique identifier specifying the data set
filenames: list of strings; each string is a path to an image file
texts: list of strings; each string is human readable, e.g. 'dog'
labels: list of integer; each integer identifies the ground truth
num_shards: integer number of shards for this data set.
"""
# Each thread produces N shards where N = int(num_shards / num_threads).
# For instance, if num_shards = 128, and the num_threads = 2, then the first
# thread would produce shards [0, 64).
num_threads = len(ranges)
assert not num_shards % num_threads
num_shards_per_batch = int(num_shards / num_threads)
shard_ranges = np.linspace(ranges[thread_index][0],
ranges[thread_index][1],
num_shards_per_batch + 1).astype(int)
num_files_in_thread = ranges[thread_index][1] - ranges[thread_index][0]
counter = 0
for s in range(num_shards_per_batch):
# Generate a sharded version of the file name, e.g. 'train-00002-of-00010'
shard = thread_index * num_shards_per_batch + s
output_filename = '%s-%.5d-of-%.5d' % (name, shard, num_shards)
output_file = os.path.join(FLAGS.output_directory, output_filename)
writer = tf.python_io.TFRecordWriter(output_file)
shard_counter = 0
files_in_shard = np.arange(shard_ranges[s], shard_ranges[s + 1], dtype=int)
for i in files_in_shard:
filename = filenames[i]
label = labels[i]
text = texts[i]
try:
image_buffer, height, width = _process_image(filename, coder)
except Exception as e:
print(e)
print('SKIPPED: Unexpected eror while decoding %s.' % filename)
continue
example = _convert_to_example(filename, image_buffer, label,
text, height, width)
writer.write(example.SerializeToString())
shard_counter += 1
counter += 1
if not counter % 1000:
print('%s [thread %d]: Processed %d of %d images in thread batch.' %
(datetime.now(), thread_index, counter, num_files_in_thread))
sys.stdout.flush()
writer.close()
print('%s [thread %d]: Wrote %d images to %s' %
(datetime.now(), thread_index, shard_counter, output_file))
sys.stdout.flush()
shard_counter = 0
print('%s [thread %d]: Wrote %d images to %d shards.' %
(datetime.now(), thread_index, counter, num_files_in_thread))
sys.stdout.flush()
def _process_image_files(name, filenames, texts, labels, num_shards):
"""Process and save list of images as TFRecord of Example protos.
Args:
name: string, unique identifier specifying the data set
filenames: list of strings; each string is a path to an image file
texts: list of strings; each string is human readable, e.g. 'dog'
labels: list of integer; each integer identifies the ground truth
num_shards: integer number of shards for this data set.
"""
assert len(filenames) == len(texts)
assert len(filenames) == len(labels)
# Break all images into batches with a [ranges[i][0], ranges[i][1]].
spacing = np.linspace(0, len(filenames), FLAGS.num_threads + 1).astype(np.int)
ranges = []
for i in range(len(spacing) - 1):
ranges.append([spacing[i], spacing[i + 1]])
# Launch a thread for each batch.
print('Launching %d threads for spacings: %s' % (FLAGS.num_threads, ranges))
sys.stdout.flush()
# Create a mechanism for monitoring when all threads are finished.
coord = tf.train.Coordinator()
# Create a generic TensorFlow-based utility for converting all image codings.
coder = ImageCoder()
threads = []
for thread_index in range(len(ranges)):
args = (coder, thread_index, ranges, name, filenames,
texts, labels, num_shards)
t = threading.Thread(target=_process_image_files_batch, args=args)
t.start()
threads.append(t)
# Wait for all the threads to terminate.
coord.join(threads)
print('%s: Finished writing all %d images in data set.' %
(datetime.now(), len(filenames)))
sys.stdout.flush()
def _find_image_files(data_dir, labels_file):
"""Build a list of all images files and labels in the data set.
Args:
data_dir: string, path to the root directory of images.
Assumes that the image data set resides in JPEG files located in
the following directory structure.
data_dir/dog/another-image.JPEG
data_dir/dog/my-image.jpg
where 'dog' is the label associated with these images.
labels_file: string, path to the labels file.
The list of valid labels are held in this file. Assumes that the file
contains entries as such:
dog
cat
flower
where each line corresponds to a label. We map each label contained in
the file to an integer starting with the integer 0 corresponding to the
label contained in the first line.
Returns:
filenames: list of strings; each string is a path to an image file.
texts: list of strings; each string is the class, e.g. 'dog'
labels: list of integer; each integer identifies the ground truth.
"""
print('Determining list of input files and labels from %s.' % data_dir)
unique_labels = [l.strip() for l in tf.gfile.FastGFile(
labels_file, 'r').readlines()]
labels = []
filenames = []
texts = []
# Leave label index 0 empty as a background class.
label_index = 1
# Construct the list of JPEG files and labels.
for text in unique_labels:
jpeg_file_path = '%s/%s/*' % (data_dir, text)
matching_files = tf.gfile.Glob(jpeg_file_path)
labels.extend([label_index] * len(matching_files))
texts.extend([text] * len(matching_files))
filenames.extend(matching_files)
if not label_index % 100:
print('Finished finding files in %d of %d classes.' % (
label_index, len(labels)))
label_index += 1
# Shuffle the ordering of all image files in order to guarantee
# random ordering of the images with respect to label in the
# saved TFRecord files. Make the randomization repeatable.
shuffled_index = list(range(len(filenames)))
random.seed(12345)
random.shuffle(shuffled_index)
filenames = [filenames[i] for i in shuffled_index]
texts = [texts[i] for i in shuffled_index]
labels = [labels[i] for i in shuffled_index]
print('Found %d JPEG files across %d labels inside %s.' %
(len(filenames), len(unique_labels), data_dir))
return filenames, texts, labels
def _process_dataset(name, directory, num_shards, labels_file):
"""Process a complete data set and save it as a TFRecord.
Args:
name: string, unique identifier specifying the data set.
directory: string, root path to the data set.
num_shards: integer number of shards for this data set.
labels_file: string, path to the labels file.
"""
filenames, texts, labels = _find_image_files(directory, labels_file)
_process_image_files(name, filenames, texts, labels, num_shards)
def main(unused_argv):
assert not FLAGS.train_shards % FLAGS.num_threads, (
'Please make the FLAGS.num_threads commensurate with FLAGS.train_shards')
assert not FLAGS.validation_shards % FLAGS.num_threads, (
'Please make the FLAGS.num_threads commensurate with '
'FLAGS.validation_shards')
print('Saving results to %s' % FLAGS.output_directory)
# Run it!
_process_dataset('validation', FLAGS.validation_directory,
FLAGS.validation_shards, FLAGS.labels_file)
_process_dataset('train', FLAGS.train_directory,
FLAGS.train_shards, FLAGS.labels_file)
if __name__ == '__main__':
tf.app.run()
如果您使用直接读取字节的 tfrecord 文件太大。
这个link显示了它。
你用这个函数直接读取字节
img_bytes = open(path,'rb').read()
引用
您可以在此处使用 Kubeflow 管道进行转换:
https://aihub.cloud.google.com/u/0/p/products%2Fded3e5e5-d2e8-4d65-9b9f-5ffaa9a27ea1
点击下载link(创建一个Kubeflow集群到运行管道)
试试这个脚本: (与VOC分割数据集一起使用:http://host.robots.ox.ac.uk/pascal/VOC/voc2012/)
import numpy as np
import tensorflow as tf
import scipy.io # to read .mat files
from PIL import Image # to read image files
def get_image(path):
jpg = Image.open(path).convert('RGB')
return np.array(jpg)
def get_label_png(path):
png = Image.open(path) # image is saved as palettised png.
arr = np.array(png)
return arr[..., None]
def get_example(image, label):
feature = {
'height': tf.train.Feature(int64_list=tf.train.Int64List(value=[image.shape[0]])),
'width': tf.train.Feature(int64_list=tf.train.Int64List(value=[image.shape[1]])),
'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image.tobytes()])),
'label': tf.train.Feature(bytes_list=tf.train.BytesList(value=[label.tobytes()]))
}
return tf.train.Example(features=tf.train.Features(feature=feature))
## Paths ======================================
images_folder = 'data/images/' #images folder
labels_folder = 'data/labels/' #label folder
train_file = 'data/train.txt'
val_file = 'data/val.txt'
TRAIN = 'data/train.tfrecords'
VAL = 'data/val.tfrecords'
## write train dataset
with tf.io.TFRecordWriter(TRAIN) as writer:
with open(train_file) as file:
filenames = [s.rstrip('\n') for s in file.readlines()]
for name in filenames:
image = utils.get_image(images_folder+name+'.jpg')
label = utils.get_label_png(labels_folder+name+'.png')
writer.write(utils.get_example(image, label).SerializeToString())
## write validation dataset
with tf.io.TFRecordWriter(VAL) as writer:
with open(val_file) as file:
filenames = [s.rstrip('\n') for s in file.readlines()]
for name in filenames:
image = utils.get_image(images_folder+name+'.jpg')
label = utils.get_label_png(labels_folder+name+'.png')
writer.write(utils.get_example(image, label).SerializeToString())