TensorFlow-Keras LSTM VAE - "Cannot convert a symbolic Tensor" error on RHEL7 - Airflow

I get the following error

{taskinstance.py:1455} ERROR - Cannot convert a symbolic Tensor (lstm_4/strided_slice:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported

Traceback (most recent call last)

when I create my LSTM-VAE model with the code below.

Configuration:

Python: 3.7.9
Tensorflow: 2.4.0
NumPy: 1.18.5

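Strangely, the same code and configuration work fine on Windows (including Windows Server) but fail on RHEL7. (I am running this under Airflow.) I tried upgrading to NumPy 1.19.5 and TensorFlow 2.4.1, but it made no difference.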

# Encoder
input_x = tensorflow.keras.layers.Input(
    shape=(time_steps, number_of_features)
)
encoder_lstm_int = tensorflow.keras.layers.LSTM(
    int_dim, return_sequences=True
)(input_x)
encoder_lstm_latent = tensorflow.keras.layers.LSTM(
    latent_dim, return_sequences=False
)(encoder_lstm_int)

z_mean = tensorflow.keras.layers.Dense(latent_dim)(encoder_lstm_latent)
z_log_sigma = tensorflow.keras.layers.Dense(latent_dim)(
    encoder_lstm_latent
)
z_encoder_output = _Sampling()([z_mean, z_log_sigma])

encoder: tensorflow.keras.models.Model = tensorflow.keras.models.Model(
    input_x, [z_mean, z_log_sigma, z_encoder_output]
)

# Decoder
decoder_input = tensorflow.keras.layers.Input(shape=(latent_dim,))
decoder_repeated = tensorflow.keras.layers.RepeatVector(time_steps)(
    decoder_input
)
decoder_lstm_int = tensorflow.keras.layers.LSTM(
    int_dim, return_sequences=True
)(decoder_repeated)
decoder_lstm = tensorflow.keras.layers.LSTM(
    number_of_features, return_sequences=True
)(decoder_lstm_int)
decoder_dense1 = tensorflow.keras.layers.TimeDistributed(
    tensorflow.keras.layers.Dense(number_of_features * 2)
)(decoder_lstm)
decoder_output = tensorflow.keras.layers.TimeDistributed(
    tensorflow.keras.layers.Dense(number_of_features)
)(decoder_dense1)
decoder: tensorflow.keras.models.Model = tensorflow.keras.models.Model(
    decoder_input, decoder_output
)

# VAE
# Connect encoder and decoder: the decoder takes the encoder's third
# output (index 2, the sampled z) as its input.
output = decoder(encoder(input_x)[2])
lstm_vae: tensorflow.keras.models.Model = tensorflow.keras.models.Model(
    input_x, output, name='lstm_vae'
)

# Loss
rec_loss = (
    tensorflow.keras.backend.mean(
        tensorflow.keras.losses.mse(input_x, output)
    )
    * number_of_features
)
kl_loss = -0.5 * tensorflow.keras.backend.mean(
    1
    + z_log_sigma
    - tensorflow.keras.backend.square(z_mean)
    - tensorflow.keras.backend.exp(z_log_sigma)
)
vae_loss = rec_loss + kl_loss

lstm_vae.add_loss(vae_loss)
lstm_vae.compile(optimizer='adam', loss='mean_squared_error')

return encoder, decoder, lstm_vae

class _Sampling(tensorflow.keras.layers.Layer):
    """Sampling for encoder output."""

    @staticmethod
    def call(args):
        """
        Does sampling from the learned mu, std latent space for Decoder.
        """
        z_mean, z_log_sigma = args
        batch_size = tensorflow.shape(z_mean)[0]
        latent_dim = tensorflow.shape(z_mean)[1]
        epsilon = tensorflow.keras.backend.random_normal(
            shape=(batch_size, latent_dim), mean=0, stddev=1
        )
        return z_mean + tensorflow.keras.backend.exp(z_log_sigma / 2) * epsilon
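
For reference, the sampling step and the KL term in the code above can be written out as follows. This is only a restatement of what the code computes, with the shorthand s for `z_log_sigma` (my notation, not the original's). The code uses exp(s / 2) as the standard deviation, so s is being treated as a log-variance, and the mean runs over both the batch and the latent dimensions, as `tensorflow.keras.backend.mean` does when no axis is given.

```latex
% Reparameterised sample produced by _Sampling.call
z = \mu + \exp\!\left(\tfrac{s}{2}\right) \odot \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)

% KL term as computed in the loss section (s = z_log_sigma, \mu = z_mean)
\mathcal{L}_{\mathrm{KL}}
  = -\tfrac{1}{2}\,\operatorname{mean}\!\left(1 + s - \mu^{2} - e^{s}\right)
```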

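There are similar questions on Stack Overflow where people were using NumPy arrays as part of tensor operations, but my model contains no NumPy arrays or NumPy operations. Another suggested fix was to downgrade NumPy from 1.20 to 1.18, but 1.18 is already the version I have. So I am at a loss.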

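Answering my own question: this happened solely because of NumPy 1.20. Even after I downgraded to NumPy 1.18.5 I still got the error, because Airflow was somehow still using the previously installed NumPy (1.20), cached either in memory or in airflow/.local, even though pip list showed 1.18.5. I had to delete numpy from Airflow's .local environment and restart the machine, and that fixed it.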
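
One way to spot this kind of mismatch is to log, from inside the failing task itself, which file the module is actually imported from. Below is a minimal sketch using only the standard library; `describe_module` is a hypothetical helper name of mine, not an Airflow or NumPy API.

```python
import importlib
import sys


def describe_module(name):
    """Report the version and file path of `name` as seen by this interpreter.

    Hypothetical helper for illustration; returns None fields if the
    module cannot be imported at all.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return {"name": name, "version": None, "file": None}
    return {
        "name": name,
        "version": getattr(module, "__version__", None),
        "file": getattr(module, "__file__", None),
    }


if __name__ == "__main__":
    # The interpreter and search path explain *why* a given copy is picked up.
    print("interpreter:", sys.executable)
    for entry in sys.path:
        print("sys.path entry:", entry)
    print(describe_module("numpy"))
```

If the reported path points into a per-user directory such as a `.local` site-packages folder, while pip was run against a different environment, that would be consistent with the behaviour described above.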

Your solution worked for me, thanks!

I just want to post some minimal example code that triggers the error and that works again once you downgrade NumPy:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM

layer = LSTM(10, input_shape=(20,5)) # I'm pretty sure these numbers can be any values you like
print(layer)  # The LSTM is created without errors
model = Sequential()
model.add(layer) # Throws error
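
To fail early with a readable message instead of the symbolic Tensor error, you could check the NumPy version before building the model. This is a rough sketch: the 1.20 threshold is taken from this thread (TensorFlow 2.4.x) and is an assumption, not an official compatibility table, and both function names are hypothetical.

```python
import re


def parse_major_minor(version):
    """Turn a version string such as '1.19.5' into the tuple (1, 19)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % (version,))
    return int(match.group(1)), int(match.group(2))


def numpy_ok_for_tf24(numpy_version):
    """True if NumPy is older than 1.20, the release blamed in this thread.

    Assumption: the cutoff comes from the reports above, for TensorFlow 2.4.x.
    """
    return parse_major_minor(numpy_version) < (1, 20)


# Possible use at the top of the task, before any Keras layers are created:
#
#     import numpy
#     if not numpy_ok_for_tf24(numpy.__version__):
#         raise RuntimeError(
#             "NumPy %s loaded from %s; expected < 1.20"
#             % (numpy.__version__, numpy.__file__)
#         )
```

Putting `numpy.__file__` in the message also shows which installed copy was imported, which is the detail that mattered in the Airflow case.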