How to use tf.python_io.TFRecordWriter in TensorFlow 2.0

I want to convert a .csv file to TFRecords. My problem is that python_io does not exist in TensorFlow 2.0.

    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    path = os.path.join(FLAGS.image_dir)
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())
    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)

I get this error:
File "generate_tfrecord.py", line 102, in <module>
main()
File "generate_tfrecord.py", line 89, in main
writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
AttributeError: module 'tensorflow' has no attribute 'python_io'

What do I have to use instead?

According to the official documentation, the tf.python_io package has been moved to a new package named tf.io. Simply swap one for the other.

writer = tf.io.TFRecordWriter(FLAGS.output_path)
# ...
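
As a rough illustration of the same swap in a complete script, here is a minimal sketch that writes one tf.train.Example per CSV row with tf.io.TFRecordWriter. The CSV columns (filename, width, height, label), the helper names row_to_example and write_csv_to_tfrecord, and the output file name are all assumptions made up for this example; they are not the create_tf_example or split helpers from the question, which are not shown.

```python
import csv
import io
import os
import tempfile

import tensorflow as tf

# Hypothetical CSV content; the column names are assumptions for illustration.
CSV_TEXT = """filename,width,height,label
a.jpg,640,480,cat
b.jpg,800,600,dog
"""


def _bytes_feature(text):
    """Wrap a string as a single-element BytesList feature."""
    return tf.train.Feature(
        bytes_list=tf.train.BytesList(value=[text.encode("utf-8")]))


def _int64_feature(number):
    """Wrap an integer as a single-element Int64List feature."""
    return tf.train.Feature(
        int64_list=tf.train.Int64List(value=[int(number)]))


def row_to_example(row):
    """Pack one CSV row (a dict of strings) into a tf.train.Example."""
    feature = {
        "filename": _bytes_feature(row["filename"]),
        "width": _int64_feature(row["width"]),
        "height": _int64_feature(row["height"]),
        "label": _bytes_feature(row["label"]),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


def write_csv_to_tfrecord(csv_text, output_path):
    """Write one serialized Example per CSV row; return the record count."""
    count = 0
    # tf.io.TFRecordWriter replaces tf.python_io.TFRecordWriter in TF 2.x.
    with tf.io.TFRecordWriter(output_path) as writer:
        for row in csv.DictReader(io.StringIO(csv_text)):
            writer.write(row_to_example(row).SerializeToString())
            count += 1
    return count


# Write into a temporary directory so the sketch leaves the working dir alone.
output_path = os.path.join(tempfile.mkdtemp(), "example.tfrecord")
num_written = write_csv_to_tfrecord(CSV_TEXT, output_path)
print(num_written)
```

Using the writer as a context manager closes the file even if a row fails to convert, which is why the explicit writer.close() call is not needed here.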

In TF 2.0, tf.python_io.TFRecordWriter() has been renamed to tf.io.TFRecordWriter().

tf.io.TFRecordWriter is the official way to write TFRecords for small datasets. For larger CSV files, you can also try Apache Beam to parallelize the conversion from CSV to TFRecords.

import tensorflow as tf

# Optional: log which device each op is placed on.
tf.debugging.set_log_device_placement(True)

with tf.io.TFRecordWriter('data.tfrecords') as file_writer:

    x = tf.random.normal([100, 1])
    y = tf.random.normal([100, 1])

    # FloatList expects a flat sequence of floats, so flatten the [100, 1] tensors.
    feature = {
        "x": tf.train.Feature(float_list=tf.train.FloatList(value=x.numpy().ravel().tolist())),
        "y": tf.train.Feature(float_list=tf.train.FloatList(value=y.numpy().ravel().tolist())),
    }

    # Wrap the features in an Example and write its serialized bytes as one record.
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    record_bytes = example.SerializeToString()
    file_writer.write(record_bytes)
    print(record_bytes)
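
To check that a file like the one written above can be read back, the following is a minimal sketch that writes one record and then parses it with tf.data.TFRecordDataset and tf.io.parse_single_example. The temporary file path, the feature_spec dictionary, and the parse helper are assumptions for this example; the fixed length of 100 simply matches the 100 values stored per feature.

```python
import os
import tempfile

import tensorflow as tf

# Temporary location for the example file (an assumption, not the original path).
path = os.path.join(tempfile.mkdtemp(), "data.tfrecords")

# Write a single Example with two 100-value float features.
x = tf.random.normal([100, 1])
y = tf.random.normal([100, 1])
example = tf.train.Example(features=tf.train.Features(feature={
    "x": tf.train.Feature(
        float_list=tf.train.FloatList(value=x.numpy().ravel().tolist())),
    "y": tf.train.Feature(
        float_list=tf.train.FloatList(value=y.numpy().ravel().tolist())),
}))
with tf.io.TFRecordWriter(path) as writer:
    writer.write(example.SerializeToString())

# Describe what each record contains so it can be decoded again.
feature_spec = {
    "x": tf.io.FixedLenFeature([100], tf.float32),
    "y": tf.io.FixedLenFeature([100], tf.float32),
}


def parse(record_bytes):
    """Decode one serialized Example into a dict of tensors."""
    return tf.io.parse_single_example(record_bytes, feature_spec)


dataset = tf.data.TFRecordDataset(path).map(parse)
parsed = next(iter(dataset))
print(parsed["x"].shape, parsed["y"].shape)
```

The feature names and lengths in feature_spec have to match what was written; a mismatch in length raises an error when the record is parsed.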