TensorFlow Dataset.from_generator fails with pyfunc exception
I am trying to use Dataset.from_generator from the TensorFlow 1.4 nightly build to stitch together some variable-length datasets as needed. This simple code (idea from ):
import tensorflow as tf

Dataset = tf.contrib.data.Dataset
it2 = Dataset.range(5).make_one_shot_iterator()

def _dataset_generator():
    while True:
        try:
            try:
                get_next = it2.get_next()
                yield get_next
            except tf.errors.OutOfRangeError:
                continue
        except tf.errors.OutOfRangeError:
            return

# Dataset.from_generator need tensorflow > 1.3 !
das_dataset = Dataset.from_generator(_dataset_generator,
                                     output_types=(tf.float32, tf.float32))
das_dataset_it = das_dataset.make_one_shot_iterator()
with tf.Session() as sess:
    while True:
        print(sess.run(it2.get_next()))
        print(sess.run(das_dataset_it.get_next()))
fails rather cryptically:
C:\Dropbox\_\PyCharmVirtual\TF-NIGHTLY\Scripts\python.exe C:/Users/MrD/.PyCharm2017.2/config/scratches/scratch_55.py
0
2017-10-01 12:51:39.773135: W C:\tf_jenkins\home\workspace\tf-nightly-windows\M\windows\PY\tensorflow\core\framework\op_kernel.cc:1192] Invalid argument: 0-th value returned by pyfunc_0 is int32, but expects int64
[[Node: PyFunc = PyFunc[Tin=[], Tout=[DT_INT64], token="pyfunc_0"]()]]
Traceback (most recent call last):
File "C:\Dropbox\_\PyCharmVirtual\TF-NIGHTLY\lib\site-packages\tensorflow\python\client\session.py", line 1323, in _do_call
return fn(*args)
File "C:\Dropbox\_\PyCharmVirtual\TF-NIGHTLY\lib\site-packages\tensorflow\python\client\session.py", line 1302, in _run_fn
status, run_metadata)
File "C:\_\Python35\lib\contextlib.py", line 66, in __exit__
next(self.gen)
File "C:\Dropbox\_\PyCharmVirtual\TF-NIGHTLY\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 467, in raise_exception_on_not_ok_status
c_api.TF_GetCode(status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: 0-th value returned by pyfunc_0 is int32, but expects int64
[[Node: PyFunc = PyFunc[Tin=[], Tout=[DT_INT64], token="pyfunc_0"]()]]
[[Node: IteratorGetNext_1 = IteratorGetNext[output_shapes=[<unknown>, <unknown>], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](OneShotIterator_1)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/MrD/.PyCharm2017.2/config/scratches/scratch_55.py", line 24, in <module>
print(sess.run(das_dataset_it.get_next()))
File "C:\Dropbox\_\PyCharmVirtual\TF-NIGHTLY\lib\site-packages\tensorflow\python\client\session.py", line 889, in run
run_metadata_ptr)
File "C:\Dropbox\_\PyCharmVirtual\TF-NIGHTLY\lib\site-packages\tensorflow\python\client\session.py", line 1120, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Dropbox\_\PyCharmVirtual\TF-NIGHTLY\lib\site-packages\tensorflow\python\client\session.py", line 1317, in _do_run
options, run_metadata)
File "C:\Dropbox\_\PyCharmVirtual\TF-NIGHTLY\lib\site-packages\tensorflow\python\client\session.py", line 1336, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 0-th value returned by pyfunc_0 is int32, but expects int64
[[Node: PyFunc = PyFunc[Tin=[], Tout=[DT_INT64], token="pyfunc_0"]()]]
[[Node: IteratorGetNext_1 = IteratorGetNext[output_shapes=[<unknown>, <unknown>], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](OneShotIterator_1)]]
Process finished with exit code 1
Note that the generator itself works fine:
with tf.Session() as sess:
    for k in _dataset_generator():
        print(sess.run(k))
prints:
0
1
2
3
4
Traceback (most recent call last):
...
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[Node: IteratorGetNext_5 = IteratorGetNext[output_shapes=[[]], output_types=[DT_INT64], _device="/job:localhost/replica:0/task:0/cpu:0"](OneShotIterator)]]
as expected.
Is this a bug, a missing feature, or am I grossly misunderstanding something?
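The Dataset.from_generator() method is designed to connect non-TensorFlow Python code to a tf.data input pipeline. For example, you can yield simple Python objects (such as int and str objects), lists, or NumPy arrays from a generator, and they will be converted into TensorFlow values.
In your example code, however, you yield the result of it2.get_next(), which is a tf.Tensor object. This is not supported.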
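To illustrate the supported usage, here is a minimal sketch of a generator that stitches variable-length sequences together and yields only NumPy scalars. The names SOURCES, stitched_generator, and build_dataset are hypothetical and not part of the original question. The sketch also assumes a TensorFlow version that exposes tf.data (1.4 or later), where the question's code uses tf.contrib.data.

```python
import numpy as np

# Hypothetical variable-length sources standing in for the real data.
SOURCES = [[0, 1, 2], [3], [4, 5]]


def stitched_generator():
    """Yield plain NumPy scalars, never tf.Tensor objects."""
    for source in SOURCES:
        for value in source:
            # An explicit dtype keeps the yielded type identical on every platform.
            yield np.int64(value)


def build_dataset():
    """Wrap the generator in a dataset (assumes tf.data is available)."""
    # Imported lazily so the generator can be checked without TensorFlow.
    import tensorflow as tf

    # output_types must match what the generator actually yields.
    return tf.data.Dataset.from_generator(stitched_generator,
                                          output_types=tf.int64)


# The generator is ordinary Python and can be inspected on its own.
values = list(stitched_generator())
print([int(v) for v in values])
```

The point of the design is that all stitching logic lives in ordinary Python; TensorFlow only sees the values once from_generator converts them, so output_types should describe the yielded values rather than some other desired type.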
If you need to capture an iterator in a different dataset, you can use Dataset.map() over a dummy dataset, as follows:
import tensorflow as tf

Dataset = tf.contrib.data.Dataset
it2 = Dataset.range(5).make_one_shot_iterator()
das_dataset = Dataset.from_tensors(0).repeat().map(lambda _: it2.get_next())
das_dataset_it = das_dataset.make_one_shot_iterator()
with tf.Session() as sess:
    while True:
        print(sess.run(it2.get_next()))
        print(sess.run(das_dataset_it.get_next()))