How to write a tfrecord file and read it? The error is: truncated record at 0' failed with Read less bytes than requested [Op:IteratorGetNext]

Question

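I want to use tfrecord to handle a large number of MRI images, but I don't know how to do it. Below are my code, the error, and a link to the data. (Sorry if the code is a bit long.)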

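About the data:

484 training images, each of shape (240, 240, 155, 4); the four numbers are height, width, number of layers, and number of channels, respectively.
484 labels, each of shape (240, 240, 155).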

First, I collected and sorted the file paths of my data:

import os
import nibabel as nib
import tensorflow as tf
import matplotlib.pyplot as plt

image_data_path = './drive/MyDrive/Brain Tumour/Task01_BrainTumour/imagesTr/'
label_data_path = './drive/MyDrive/Brain Tumour/Task01_BrainTumour/labelsTr/'

image_paths = [image_data_path + name 
               for name in os.listdir(image_data_path) 
               if not name.startswith(".")]

label_paths = [label_data_path + name
               for name in os.listdir(label_data_path)
               if not name.startswith(".")]

image_paths = sorted(image_paths)
label_paths = sorted(label_paths)

Then I defined a function to load one nii file. I use nibabel.

def load_one_sample(image_path, label_path):

  image = nib.load(image_path).get_fdata()
  label = nib.load(label_path).get_fdata().astype(int)  # the original dtype is float64

  return image, label

Here I wrote some helper functions; 'float' is used for the image and 'int' for the label:

def float_feature(value):
  return tf.train.Feature(float_list = tf.train.FloatList(value = value))

def int64_feature(value):
  return tf.train.Feature(int64_list = tf.train.Int64List(value = value))

def create_example(image_path, label_path):

  image, label = load_one_sample(image_path, label_path)
  image, label = image.ravel(), label.ravel()
  feature = {'image': float_feature(image),
             'label': int64_feature(label)}
  example = tf.train.Example(features = tf.train.Features(feature = feature))

  return example

def parse_tfrecord(example):

  feature = {'image': tf.io.FixedLenFeature([240, 240, 155, 4], tf.float32),
             'label': tf.io.FixedLenFeature([240, 240, 155], tf.int64)}
  parsed_example = tf.io.parse_single_example(example, feature)

  return parsed_example

Then I convert to a tfrecord and read it back, using just one example:

test_writer = tf.io.TFRecordWriter('test.tfrecords')

example = create_example(image_paths[0], label_paths[0])
test_writer.write(example.SerializeToString())

serialised_example = tf.data.TFRecordDataset('test.tfrecords')
parsed_example = serialised_example.map(parse_tfrecord)

Finally, I tried to plot one image, but got this error message:

for features in parsed_example.take(1):
  plt.imshow(features['image'][:, :, 100, 0])

Error: truncated record at 0' failed with Read less bytes than requested [Op:IteratorGetNext]

Data link: https://drive.google.com/drive/folders/1HqEgzS8BV2c7xYNrZdEAnrHk7osJJ--2 (Task 1 - Brain Tumour)

Where did I go wrong?

Answer 1

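This error occurs because you never call close() on the writer after writing the example, so the record has not been fully written to the file by the time you read it. Here is a working example using random arrays: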

import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

def float_feature(value):
  return tf.train.Feature(float_list = tf.train.FloatList(value = value))

def int64_feature(value):
  return tf.train.Feature(int64_list = tf.train.Int64List(value = value))

def create_example():

  image, label = np.random.random((16, 16, 155, 4)), np.random.randint(20, size=(16, 16, 155))
  image, label = image.ravel(), label.ravel()
  feature = {'image': float_feature(image),
             'label': int64_feature(label)}
  example = tf.train.Example(features = tf.train.Features(feature = feature))
  return example

def parse_tfrecord(example):
  feature = {'image': tf.io.FixedLenFeature([16, 16, 155, 4], tf.float32),
             'label': tf.io.FixedLenFeature([16, 16, 155], tf.int64)}
  parsed_example = tf.io.parse_single_example(example, feature)

  return parsed_example

test_writer = tf.io.TFRecordWriter('test.tfrecords')

example = create_example()
test_writer.write(example.SerializeToString())
test_writer.close()  # close the writer so the record is fully written before reading

serialised_example = tf.data.TFRecordDataset('test.tfrecords')
parsed_example = serialised_example.map(parse_tfrecord)

for features in parsed_example.take(1):
  plt.imshow(features['image'][:, :, 100, 0])
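
As a variation on the example above, the writer can also be opened in a with block, which closes it automatically when the block exits, so the close() call cannot be forgotten. This is only a minimal sketch: the file name sketch.tfrecords, the helper names make_example and parse, and the small array shapes are assumptions chosen for illustration, not part of the original code.

```python
import numpy as np
import tensorflow as tf

def make_example(image, label):
  # Hypothetical helper: pack a flattened image/label pair into a tf.train.Example.
  feature = {
      'image': tf.train.Feature(float_list = tf.train.FloatList(value = image.ravel())),
      'label': tf.train.Feature(int64_list = tf.train.Int64List(value = label.ravel()))}
  return tf.train.Example(features = tf.train.Features(feature = feature))

# Small random arrays stand in for the MRI volumes (assumed shapes, illustration only).
image = np.random.random((4, 4, 3, 2)).astype(np.float32)
label = np.random.randint(20, size=(4, 4, 3))

# The with block closes the writer on exit, so the record is complete on disk
# before the file is read back.
with tf.io.TFRecordWriter('sketch.tfrecords') as writer:
  writer.write(make_example(image, label).SerializeToString())

def parse(serialized):
  # The shapes here must match what was written above.
  spec = {'image': tf.io.FixedLenFeature([4, 4, 3, 2], tf.float32),
          'label': tf.io.FixedLenFeature([4, 4, 3], tf.int64)}
  return tf.io.parse_single_example(serialized, spec)

records = list(tf.data.TFRecordDataset('sketch.tfrecords').map(parse))
```

When writing all 484 samples, the same with block can wrap a loop that calls writer.write once per sample.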


python

tensorflow

tfrecord

tensorflow-datasets