How is data augmentation implemented in Tensorflow?
Based on the Tensorflow tutorial for ConvNet, a few points are not obvious to me:
- Are the distorted images actually added to the pool of original images?
- Or are the distorted images used instead of the originals?
- How many distorted images are produced? (i.e. what augmentation factor was defined?)
The function flow of the tutorial appears to be as follows:
cifar_10_train.py

def train():
  """Train CIFAR-10 for a number of steps."""
  with tf.Graph().as_default():
    [...]
    # Get images and labels for CIFAR-10.
    images, labels = cifar10.distorted_inputs()
    [...]
cifar10.py

def distorted_inputs():
  """Construct distorted input for CIFAR training using the Reader ops.
  Returns:
    images: Images. 4D tensor of [batch_size, IMAGE_SIZE, IMAGE_SIZE, 3] size.
    labels: Labels. 1D tensor of [batch_size] size.
  Raises:
    ValueError: If no data_dir
  """
  if not FLAGS.data_dir:
    raise ValueError('Please supply a data_dir')
  data_dir = os.path.join(FLAGS.data_dir, 'cifar-10-batches-bin')
  return cifar10_input.distorted_inputs(data_dir=data_dir,
                                        batch_size=FLAGS.batch_size)
And finally cifar10_input.py

def distorted_inputs(data_dir, batch_size):
  """Construct distorted input for CIFAR training using the Reader ops.
  Args:
    data_dir: Path to the CIFAR-10 data directory.
    batch_size: Number of images per batch.
  Returns:
    images: Images. 4D tensor of [batch_size, IMAGE_SIZE, IMAGE_SIZE, 3] size.
    labels: Labels. 1D tensor of [batch_size] size.
  """
  filenames = [os.path.join(data_dir, 'data_batch_%d.bin' % i)
               for i in xrange(1, 6)]
  for f in filenames:
    if not tf.gfile.Exists(f):
      raise ValueError('Failed to find file: ' + f)

  # Create a queue that produces the filenames to read.
  filename_queue = tf.train.string_input_producer(filenames)

  # Read examples from files in the filename queue.
  read_input = read_cifar10(filename_queue)
  reshaped_image = tf.cast(read_input.uint8image, tf.float32)

  height = IMAGE_SIZE
  width = IMAGE_SIZE

  # Image processing for training the network. Note the many random
  # distortions applied to the image.

  # Randomly crop a [height, width] section of the image.
  distorted_image = tf.random_crop(reshaped_image, [height, width, 3])

  # Randomly flip the image horizontally.
  distorted_image = tf.image.random_flip_left_right(distorted_image)

  # Because these operations are not commutative, consider randomizing
  # the order their operation.
  distorted_image = tf.image.random_brightness(distorted_image, max_delta=63)
  distorted_image = tf.image.random_contrast(distorted_image,
                                             lower=0.2, upper=1.8)

  # Subtract off the mean and divide by the variance of the pixels.
  float_image = tf.image.per_image_whitening(distorted_image)

  # Ensure that the random shuffling has good mixing properties.
  min_fraction_of_examples_in_queue = 0.4
  min_queue_examples = int(NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN *
                           min_fraction_of_examples_in_queue)
  print('Filling queue with %d CIFAR images before starting to train. '
        'This will take a few minutes.' % min_queue_examples)

  # Generate a batch of images and labels by building up a queue of examples.
  return _generate_image_and_label_batch(float_image, read_input.label,
                                         min_queue_examples, batch_size,
                                         shuffle=True)
are the images being distorted actually added to the pool of original images?
That depends on your definition of the pool. In TensorFlow you have ops, which are the basic objects in your network's graph. Here, producing data is itself an op. Consequently you do not have a finite set of training samples; instead you have a potentially infinite set of samples generated from the training set.
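To illustrate the idea (a minimal sketch in plain NumPy, not the tutorial's actual ops): because the distortion draws fresh randomness on every read, the same source image yields an independently distorted sample each time it passes through the pipeline.

```python
import numpy as np

# Hypothetical stand-in for tf.random_crop + tf.image.random_flip_left_right:
# each call draws fresh randomness, so one source image can produce a
# different training example on every read.
def random_distort(image, crop_size, rng):
    h, w, _ = image.shape
    top = rng.integers(0, h - crop_size + 1)   # random crop offset (rows)
    left = rng.integers(0, w - crop_size + 1)  # random crop offset (cols)
    crop = image[top:top + crop_size, left:left + crop_size, :]
    if rng.random() < 0.5:                     # random horizontal flip
        crop = crop[:, ::-1, :]
    return crop

rng = np.random.default_rng(0)
source = np.arange(32 * 32 * 3, dtype=np.float32).reshape(32, 32, 3)
a = random_distort(source, 24, rng)  # two independent draws from the
b = random_distort(source, 24, rng)  # same underlying image
```

Each draw is a new sample from the distribution induced by the distortions, which is exactly why the sample set is "potentially infinite" even though the underlying training set is fixed.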
or are the distorted images used instead of the originals?
As you can see from the source included above, a sample is taken from the training batch and then randomly transformed, so the chance of an unaltered image being used is very small (especially with the crop, which always modifies the image).
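For a rough sense of scale with the tutorial's settings (32x32 CIFAR-10 source images, 24x24 crop), the geometric distortions alone already give many variants per image, before brightness and contrast jitter are even considered:

```python
# CIFAR-10 images are 32x32; the tutorial crops to IMAGE_SIZE = 24.
source_size, crop_size = 32, 24
offsets = source_size - crop_size + 1   # 9 valid top-left positions per axis
crop_positions = offsets * offsets      # 9 * 9 = 81 crop windows
geometric_variants = crop_positions * 2 # times 2 for the horizontal flip
print(geometric_variants)               # 162 distinct crop/flip combinations
```

Since brightness and contrast are drawn from continuous ranges, the number of possible distorted images per source image is effectively unbounded.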
how many distorted images are being produced? (i.e. what augmentation factor was defined?)
There is no such thing: this is a never-ending process. Think of it as random access to a potentially infinite source of data, because that is what is effectively happening here. Every batch can be different from the previous one.
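A hypothetical sketch of what the input queue effectively behaves like (again in plain NumPy rather than the tutorial's queue runners): an endless generator in which each batch is sampled and distorted independently, with no augmentation factor anywhere.

```python
import numpy as np

# Endless stream of distorted batches: there is no "augmentation factor",
# the loop simply never terminates.
def distorted_batches(dataset, batch_size, rng):
    n = len(dataset)
    while True:                                    # never-ending, by design
        idx = rng.integers(0, n, size=batch_size)  # sample images
        batch = dataset[idx].copy()
        flip = rng.random(batch_size) < 0.5        # per-image random flip
        batch[flip] = batch[flip][:, :, ::-1, :]   # flip along the width axis
        yield batch

rng = np.random.default_rng(0)
data = rng.random((100, 24, 24, 3)).astype(np.float32)
stream = distorted_batches(data, 8, rng)
first, second = next(stream), next(stream)  # two independent batches
```

You could keep calling next(stream) for as many training steps as you like; the number of distorted images produced is simply steps * batch_size.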