Understanding SparsityParameters in the TensorFlow Lite schema
I am trying to understand sparse tensors by reading the TensorFlow Lite schema, but I am finding it hard.
Fortunately, there is one JSON example built from this schema (tensorflow/lite/testdata/sparse_tensor.json).
"sparsity": {
  "traversal_order": [0, 1, 2, 3],
  "block_map": [0, 1],
  "dim_metadata": [
    {
      "format": "DENSE",
      "dense_size": 2
    },
    {
      "format": "SPARSE_CSR",
      "array_segments_type": "Uint8Vector",
      "array_segments": {"values": [0, 2, 3]},
      "array_indices_type": "Uint8Vector",
      "array_indices": {"values": [0, 1, 1]}
    },
    {
      "format": "DENSE",
      "dense_size": 2
    },
    {
      "format": "DENSE",
      "dense_size": 2
    }
  ]
}
"buffers": [
  {
  },
  {
    "data": [
      1, 0, 0, 4,
      2, 3, 0, 0,
      5, 0, 0, 6
    ]
  }
]
This is the schema I am referring to (tensorflow/lite/schema/schema.fbs).
table DimensionMetadata {
  // Whether a dimension is dense or sparse.
  format:DimensionType;
  // Index metadata used for a dimension.
  //   - If format is DimensionType.DENSE then we use the dense_size field to
  //     store the size of that dimension. Each index in that dimension is
  //     stored implicitly.
  //   - If format is DimensionType.SPARSE_CSR then we use array_segments and
  //     array_indices to encode that dimension. array_segments represents how
  //     to segment the indices array, each segment corresponds to one element
  //     in the previous dimension. array_indices represents the index of the
  //     non-zero elements within this dimension (as those in the CSR matrix
  //     format, where the first array is row pointers and the second array is
  //     column indices).
  dense_size:int;
  array_segments:SparseIndexVector;
  array_indices:SparseIndexVector;
}
// Parameters to encode a sparse TfLite tensor.
table SparsityParameters {
  // The traversal order of the dimensions defined in the `shape` field of the
  // conceptual dense tensor. For a n-dimensional tensors with dims (d0, d1,
  // ..., dn-1),
  //   - if not block sparse, the traversal_order is just a permutation of (d0,
  //     ..., dn-1). For example, a 2-D matrix stored in row-major order would
  //     have traversal_order = (d0, d1).
  //   - if block sparse with a k-dimensional block (0 <= k <= n), the
  //     traversal_order has n + k elements. The first n elements are still a
  //     permutation of (d0, ..., dn-1). The lask k elements are a permutation
  //     of (dn, ..., dn+k-1), defining how to traverse a block internally. For
  //     example, a 2-D matrix with 2-D blocks, both stored in row-major order
  //     would have traversal_order = (d0, d1, d2, d3).
  traversal_order:[int];
  // For an n-dimensional tensor with a k-dimensional block (0 <= k <= n),
  // stores how a block dimension in (dn, ..., dn+k-1) maps to the original
  // tensor dimension in (d0, ..., dn).
  // It's stored in the order of (dn, ..., dn+k-1).
  // If not block-sparse, this field is NULL.
  block_map:[int];
  // In the traversal order defined above, the metadata needed for
  // each dimension to locate the non-zero values in the original dense tensor.
  // The size of the dim_metadata array = the size of the traversal_order array
  // = n + k.
  dim_metadata:[DimensionMetadata];
}
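As you can see above, there is a buffer that holds the contents of the sparse tensor.
"buffers": [
  {
  },
  {
    "data": [
      1, 0, 0, 4,
      2, 3, 0, 0,
      5, 0, 0, 6
    ]
  }
]
AFAIK, to build a sparse tensor like the one above, I would have to write code like this.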
import tensorflow as tf
values = [1, 4, 2, 3, 5, 6]  # the non-zero entries of the buffer above
st1 = tf.compat.v1.sparse.SparseTensor(
    indices=[[0, 0], [0, 3], [1, 0], [1, 1], [2, 0], [2, 3]], values=values, dense_shape=[4, 4])
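However, the JSON example above does not match my understanding at all.
- I think array_indices should be [0, 3, 0, 1, 0, 3] rather than [0, 1, 1], and array_segments should be [0, 2, 2, 4, 4, 6] rather than [0, 2, 3].
On top of that, I do not really understand any of the comments in the schema.
- What does the metadata of a DENSE dimension represent?
{
  "format": "DENSE",
  "dense_size": 2
},
As the schema comment says, it is a field that stores the size of that dimension.
But which dimension has size 2? The shape is [4, 4]. I cannot even work out where the number 2 comes from.
- What is a "block"?
As far as I understand, a block is a box that contains non-zero values. But then I think there are many blocks in the buffer above.
Given a matrix like this,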
1 2 0
3 4 0
0 0 0
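I would say it has one 2x2 block.
But the buffer above contains six 1x1 blocks.
So how do I turn this into a block_map?
I do not understand traversal_order either, but I think I could figure it out if I understood the points above.
Could someone please help me?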
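TFLite currently uses a different sparse tensor representation from TensorFlow. The format it uses comes from TACO (the Tensor Algebra Compiler). For details, see Section 3 of this paper: http://tensor-compiler.org/kjolstad-oopsla17-tensor-compiler.pdf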
tf.SparseTensor does not apply here, because it uses the COO format.
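To make the COO versus CSR difference concrete, here is a small pure-Python sketch. It takes a list of (row, col) coordinates, which is what COO stores, and derives the two arrays that a SPARSE_CSR dimension stores. The helper name coo_to_csr and the 3x4 matrix are my own hypothetical illustration, not anything from TFLite, and the matrix is treated as a plain unblocked one.

```python
def coo_to_csr(coords, num_rows):
    """Turn COO (row, col) pairs into CSR row pointers and column indices."""
    # Count the non-zeros in each row, shifted by one slot.
    segments = [0] * (num_rows + 1)
    for row, _ in coords:
        segments[row + 1] += 1
    # A running sum turns the counts into row pointers.
    for r in range(num_rows):
        segments[r + 1] += segments[r]
    # Column indices, ordered row by row.
    cols = [col for _, col in sorted(coords)]
    return segments, cols


# Hypothetical unblocked 3x4 matrix with six non-zero entries.
coo = [(0, 0), (0, 3), (1, 0), (1, 1), (2, 0), (2, 3)]
segments, cols = coo_to_csr(coo, num_rows=3)
print(segments)  # [0, 2, 4, 6]
print(cols)      # [0, 3, 0, 1, 0, 3]
```

Row r owns the slice cols[segments[r]:segments[r + 1]], so the segments array always has one more entry than the number of rows. COO stores a full coordinate for every value, while CSR stores only the column index and recovers the row from the segments.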
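A block is an inner sub-unit of the tensor whose elements need to be stored together. It may contain zero-valued elements. The example flatbuffer describes a 2-D 4x4 tensor with 2-D 2x2 inner blocks. TFLite uses block-sparse tensors to take advantage of NEON SIMD instructions.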
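As a rough illustration of how the fields fit together, here is a minimal pure-Python sketch that expands the example into a dense tensor. The function name densify and the flattened dict layout of dim_metadata (lists instead of {"values": [...]}) are my own simplifications, not a TFLite API, and the logic is only my reading of the schema comments. A 4x4 tensor cut into 2x2 blocks is a 2x2 grid of blocks. The first two dim_metadata entries walk that grid (2 block rows, stored densely, and the block columns, stored as CSR). The last two entries walk the 2x2 interior of each block, which is where every dense_size of 2 comes from.

```python
from math import prod


def densify(shape, traversal_order, block_map, dim_metadata, data):
    """Expand a TFLite-style sparse encoding into a flat, row-major dense list."""
    n = len(shape)  # rank of the original tensor
    # Size of each block dimension, read from the level that traverses it.
    block_size = [dim_metadata[traversal_order.index(n + b)]["dense_size"]
                  for b in range(len(block_map))]
    dense = [0] * prod(shape)

    def walk(level, pos, coords):
        if level == len(dim_metadata):
            # coords[i] is the index along dimension traversal_order[i].
            by_dim = [0] * len(coords)
            for i, dim in enumerate(traversal_order):
                by_dim[dim] = coords[i]
            orig = by_dim[:n]  # block (or element) index per original dimension
            for b, d in enumerate(block_map):
                # Block index times block size, plus the offset inside the block.
                orig[d] = orig[d] * block_size[b] + by_dim[n + b]
            flat = 0
            for d in range(n):
                flat = flat * shape[d] + orig[d]
            dense[flat] = data[pos]
            return
        meta = dim_metadata[level]
        if meta["format"] == "DENSE":
            # Every index is present, so positions are computed implicitly.
            for i in range(meta["dense_size"]):
                walk(level + 1, pos * meta["dense_size"] + i, coords + [i])
        else:  # SPARSE_CSR: only the listed indices are present
            seg, idx = meta["array_segments"], meta["array_indices"]
            for p in range(seg[pos], seg[pos + 1]):
                walk(level + 1, p, coords + [idx[p]])

    walk(0, 0, [])
    return dense


dim_metadata = [
    {"format": "DENSE", "dense_size": 2},
    {"format": "SPARSE_CSR", "array_segments": [0, 2, 3], "array_indices": [0, 1, 1]},
    {"format": "DENSE", "dense_size": 2},
    {"format": "DENSE", "dense_size": 2},
]
data = [1, 0, 0, 4, 2, 3, 0, 0, 5, 0, 0, 6]
dense = densify([4, 4], [0, 1, 2, 3], [0, 1], dim_metadata, data)
for r in range(4):
    print(dense[4 * r:4 * r + 4])
# [1, 0, 2, 3]
# [0, 4, 0, 0]
# [0, 0, 5, 0]
# [0, 0, 0, 6]
```

Read this way, array_segments [0, 2, 3] says that block row 0 holds two stored blocks and block row 1 holds one, and array_indices [0, 1, 1] gives their block columns. The buffer is therefore three 2x2 blocks of four values each, zeros included, rather than three rows of a matrix.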