Tensor split to dynamic length tensors based on continuous mask values in tensorflow?

Question

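I am trying to figure out how to split my sequential data tensor into multiple parts by partitioning a binary mask into its consecutive runs of identical values (for example, each run of 1s).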

I have read the official documentation, but I could not find any function that handles this easily. Is there a useful way to do this in Python?

I tried 'tf.ragged.boolean_mask', but it does not seem to fit my case.

A visual example of what I mean:

Input:

# both are tensors, NOT data.
data_tensor = ([3,5,6,2,6,1,3,9,5])
mask_tensor = ([0,1,1,1,0,0,1,1,0])

Expected output:

output_tensor = ([[3],[5,6,2],[6,1],[3,9],[5]])

Thanks.

Answer 1

I recently came across a very clean way to do this, from @AloneTogether:

import tensorflow as tf

data_tensor = tf.constant([3,5,6,2,6,1,3,9,5])
mask_tensor = tf.constant([0,1,1,1,0,0,1,1,0])

# Index where the mask changes.
change_idx = tf.concat([tf.where(mask_tensor[:-1] != mask_tensor[1:])[:, 0], [tf.shape(mask_tensor)[0]-1]], axis=0)

# Ranges of indices to gather.
ragged_idx = tf.ragged.range(tf.concat([[0], change_idx[:-1] + 1], axis=0), change_idx + 1)

# Gather ranges into ragged tensor.
output_tensor = tf.gather(data_tensor, ragged_idx)

print(output_tensor)

<tf.RaggedTensor [[3], [5, 6, 2], [6, 1], [3, 9], [5]]>
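If you need this in more than one place, here is a minimal sketch of the same idea wrapped in a function. It is a variation on the snippet above, not the original author's code: instead of building index ranges and calling tf.gather, it turns the run boundaries into row lengths and passes them to tf.RaggedTensor.from_row_lengths. The function name split_by_mask_runs is hypothetical, and the sketch assumes both inputs are 1-D and of equal length. Index arithmetic is kept in int64 throughout so that it matches the dtype returned by tf.where.

```python
import tensorflow as tf


def split_by_mask_runs(data, mask):
    """Hypothetical helper: one ragged row per run of equal values in `mask`.

    Assumes `data` and `mask` are 1-D and have the same length.
    """
    data = tf.convert_to_tensor(data)
    mask = tf.convert_to_tensor(mask)

    # Indices i where mask[i] != mask[i + 1]; each is the last index of a run
    # (every run except the final one). tf.where returns int64 indices.
    run_ends = tf.where(tf.not_equal(mask[:-1], mask[1:]))[:, 0]

    # The final run always ends at the last index of the tensor.
    last = tf.shape(mask, out_type=tf.int64)[0] - 1
    run_ends = tf.concat([run_ends, tf.expand_dims(last, 0)], axis=0)

    # Length of each run = its end index minus the previous run's end index,
    # using -1 as the "end" that precedes the first run.
    prev_ends = tf.concat(
        [tf.constant([-1], dtype=tf.int64), run_ends[:-1]], axis=0
    )
    row_lengths = run_ends - prev_ends

    return tf.RaggedTensor.from_row_lengths(data, row_lengths)


data_tensor = tf.constant([3, 5, 6, 2, 6, 1, 3, 9, 5])
mask_tensor = tf.constant([0, 1, 1, 1, 0, 0, 1, 1, 0])
result = split_by_mask_runs(data_tensor, mask_tensor)
print(result.to_list())
```

For the example tensors the run ends are [0, 3, 5, 7, 8], so the row lengths come out as [1, 3, 2, 2, 1], which add up to the 9 input elements.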


python

tensorflow