TensorFlow - Different bit-width quantization between layers

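Does TensorFlow support different bit-width quantization from one layer to the next, or does the same technique have to be applied across the whole model?

For example, say I apply 16-bit quantization at layer n. Can I apply 8-bit quantization at layer n+1?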
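To make the question concrete, here is a minimal pure-Python sketch of what quantizing the same values at 16 bits and at 8 bits means numerically. It does not use TensorFlow and is not how TensorFlow implements quantization; the helper names `quantize` and `dequantize`, the symmetric uniform scheme, and the sample values are assumptions chosen only for illustration.

```python
def quantize(values, bits):
    """Symmetric uniform quantization to signed integers of the given bit width."""
    qmax = 2 ** (bits - 1) - 1                    # e.g. 127 for 8 bits
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / qmax
    # Round to the nearest integer step and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate float values."""
    return [x * scale for x in q]


# Hypothetical values standing in for one layer's weights or activations.
values = [0.1234, -0.5678, 0.9, -0.0042]

q16, s16 = quantize(values, 16)  # what "layer n" would store
q8, s8 = quantize(values, 8)     # what "layer n+1" would store

err16 = max(abs(a - b) for a, b in zip(values, dequantize(q16, s16)))
err8 = max(abs(a - b) for a, b in zip(values, dequantize(q8, s8)))
print("max error at 16 bits:", err16)
print("max error at 8 bits:", err8)
```

The 8-bit version has far fewer representable levels, so its rounding error is larger; that trade-off is why one might want a different bit width per layer.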

No, there is currently no option to define different dtypes for different layers of a model.

According to the documentation for tf.keras.layers.Layer, the class that all layers inherit from:

dtype - The dtype of the layer's computations and weights (default of None means use tf.keras.backend.floatx in TensorFlow 2, or the type of the first input in TensorFlow 1).
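The default described in that passage can be checked with a short sketch, assuming TensorFlow 2 is installed. The variable name `layer` and the choice of a `Dense` layer with 4 units are only illustrative.

```python
import tensorflow as tf

# Build a layer without passing dtype, so the documented default applies.
layer = tf.keras.layers.Dense(4)

# In TensorFlow 2 the layer's dtype should match the backend float type.
print("layer dtype:  ", layer.dtype)
print("backend floatx:", tf.keras.backend.floatx())
```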