How to use tf.data.Dataset.padded_batch with a nested shape?

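I am building a dataset in which each element contains two tensors, of shape [batch, width, height, 3] and [batch, class]. For simplicity, assume class = 5.

What shape do you pass to dataset.padded_batch(1000, shape) so that the image is padded along the width/height/3 axes?

I have tried the following: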

tf.TensorShape([[None,None,None,3],[None,5]])
[tf.TensorShape([None,None,None,3]),tf.TensorShape([None,5])]
[[None,None,None,3],[None,5]]
([None,None,None,3],[None,5])
(tf.TensorShape([None,None,None,3]),tf.TensorShape([None,5]))

Each of them raises a TypeError.

The docs state:

padded_shapes: A nested structure of tf.TensorShape or tf.int64 vector tensor-like objects representing the shape to which the respective component of each input element should be padded prior to batching. Any unknown dimensions (e.g. tf.Dimension(None) in a tf.TensorShape or -1 in a tensor-like object) will be padded to the maximum size of that dimension in each batch.
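To make the "unknown dimension" rule in that passage concrete, here is a minimal pure-Python sketch of how the target shape for one component could be resolved within a single batch. It is only an illustration of the documented behaviour, not TensorFlow's implementation; the helper name resolve_padded_shape and the sample shapes are hypothetical.

```python
def resolve_padded_shape(padded_shape, element_shapes):
    """Return the shape every element in one batch would be padded to.

    padded_shape: tuple where None marks an unknown dimension.
    element_shapes: concrete shapes of the elements in the batch.
    """
    resolved = []
    for axis, dim in enumerate(padded_shape):
        if dim is None:
            # Unknown dimension: use the largest size seen in this batch.
            resolved.append(max(shape[axis] for shape in element_shapes))
        else:
            # Known dimension: keep the requested size.
            resolved.append(dim)
    return tuple(resolved)


# Two hypothetical image tensors of shape [batch, width, height, 3].
image_shapes = [(2, 4, 6, 3), (3, 5, 2, 3)]
print(resolve_padded_shape((None, None, None, 3), image_shapes))  # (3, 5, 6, 3)
```

Each component of an element gets its own entry in padded_shapes, so the structure of padded_shapes has to mirror the structure of the dataset's elements.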

Relevant code:

dataset = tf.data.Dataset.from_generator(generator,tf.float32)
shapes = (tf.TensorShape([None,None,None,3]),tf.TensorShape([None,5]))
batch = dataset.padded_batch(1,shapes)

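TensorShape doesn't accept nested lists. tf.TensorShape([None, None, None, 3, None, 5]) and TensorShape(None) (note: no []) are both legal.

Combining these two tensors sounds odd to me, though. I'm not sure what you are trying to accomplish, but I'd recommend trying to do it without combining tensors of different dimensions.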

Thanks to marry for finding the solution. It turns out the types passed to from_generator have to match the number of tensors in each element.

New code:

# One dtype per tensor yielded by the generator, in the same tuple structure as shapes
dataset = tf.data.Dataset.from_generator(generator, (tf.float32, tf.float32))
shapes = (tf.TensorShape([None, None, None, 3]), tf.TensorShape([None, 5]))
batch = dataset.padded_batch(1, shapes)
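For anyone who wants to try this end to end, here is a minimal sketch assuming TensorFlow 2.x with eager execution and NumPy installed. The generator below is a hypothetical stand-in that yields (images, labels) pairs of varying size; it is not the generator from the question.

```python
import numpy as np
import tensorflow as tf


def generator():
    # Hypothetical data: each element is an (images, labels) pair whose
    # batch/width/height sizes differ from element to element.
    for n, w, h in [(2, 4, 6), (3, 5, 2)]:
        images = np.ones((n, w, h, 3), dtype=np.float32)
        labels = np.ones((n, 5), dtype=np.float32)
        yield images, labels


# Two dtypes, because every element consists of two tensors.
dataset = tf.data.Dataset.from_generator(generator, (tf.float32, tf.float32))

# padded_shapes mirrors that two-tensor structure; None means
# "pad to the largest size in the batch".
shapes = (tf.TensorShape([None, None, None, 3]), tf.TensorShape([None, 5]))
batched = dataset.padded_batch(2, shapes)

images, labels = next(iter(batched))
print(images.shape)  # (2, 3, 5, 6, 3)
print(labels.shape)  # (2, 3, 5)
```

Passing a single tf.float32 instead of the tuple declares a one-tensor element, so the two-entry shapes tuple no longer matches the dataset's structure, which is consistent with the TypeError above. Newer TensorFlow releases deprecate the positional types argument in favour of output_signature with tf.TensorSpec, but the structure-matching requirement is the same.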