Colab+Keras+TensorBoard FailedPreconditionError

I am trying to run a simple Keras script on Google Colab with TensorBoard. Here is my code:

import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.applications.mobilenet import MobileNet
from tensorboardcolab import TensorBoardColab, TensorBoardColabCallback

# Settings
num_classes = 10
batch_size = 16
epochs = 1

# Data setup
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

# Select model
model = MobileNet(weights=None, input_shape=x_train.shape[1:], classes=num_classes)

# Select loss, optimizer, metric
model.compile(loss='categorical_crossentropy',
              optimizer=tf.train.AdamOptimizer(0.001),
              metrics=['accuracy'])

# Train
tbc = TensorBoardColab()
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test),
          callbacks=[TensorBoardColabCallback(tbc)])

This follows the suggestion I have seen for using TensorBoard with Colab, referenced here:

However, when I add the callback, I get an error:

FailedPreconditionError: Error while reading resource variable conv_dw_8_2/depthwise_kernel from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/conv_dw_8_2/depthwise_kernel/N10tensorflow3VarE does not exist. [[Node: conv_dw_8_2/depthwise/ReadVariableOp = ReadVariableOpdtype=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]] [[Node: loss_2/mul/_147 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_6752_loss_2/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Does anyone know what I am doing wrong? This looks like a very useful way to run TensorBoard on Colab, if I can get it to work.

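This is caused by a Keras version conflict. tensorboardcolab uses the standalone keras library, while you import tf.keras, TensorFlow's implementation of the Keras API. As a result, when you fit the model you end up using two different versions of Keras: the model is built by one and the callback comes from the other.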
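One way to confirm this kind of mix-up is to look at which package defines the classes involved. The following is only a sketch: the helper names keras_flavors and is_mixed are hypothetical, and it assumes the module layout of that TensorFlow generation, where tf.keras classes live under the tensorflow package and standalone Keras classes live under the keras package. Other releases may organize their modules differently.

```python
def keras_flavors(obj):
    """Return the Keras implementations found in obj's class hierarchy.

    Assumption: tf.keras classes are defined in modules under "tensorflow",
    standalone Keras classes in modules under "keras".
    """
    flavors = set()
    # Walk the whole MRO so that subclasses defined in third-party
    # packages still reveal which Keras base class they inherit from.
    for cls in type(obj).__mro__:
        root = cls.__module__.split(".")[0]
        if root == "tensorflow":
            flavors.add("tf.keras")
        elif root == "keras":
            flavors.add("keras")
    return flavors


def is_mixed(*objs):
    """True if the given objects draw on both tf.keras and standalone keras."""
    found = set()
    for obj in objs:
        found |= keras_flavors(obj)
    return found == {"tf.keras", "keras"}
```

With the code from the question, is_mixed(model, TensorBoardColabCallback(tbc)) would be expected to return True under the assumptions above, since the model comes from tf.keras and the callback inherits from the standalone keras TensorBoard callback.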

You have a couple of options:

Use the Keras library and change the imports

import tensorflow as tf
import keras
from keras.datasets import cifar10
from keras.applications.mobilenet import MobileNet
from tensorboardcolab import TensorBoardColab, TensorBoardColabCallback

The code runs fine with these changes, but you might also consider using Keras's version of the Adam optimizer, so that you no longer need to import tensorflow explicitly.

model.compile(loss='categorical_crossentropy',
              optimizer=keras.optimizers.Adam(lr=0.001),
              metrics=['accuracy'])

Use tf.keras and patch TensorBoardColab

If you patch callbacks.py and core.py in tensorboardcolab and fix the imports there, your original code runs fine:

- from keras.callbacks import TensorBoard
+ from tensorflow.keras.callbacks import TensorBoard

Alternatively, you can use this fork, in which I have made these changes.
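If you would rather apply the import fix yourself, the rewrite can be scripted. This is a minimal sketch, not the actual patch from the fork: the function name patch_keras_imports is hypothetical, and it assumes the files only use plain "from keras... import ..." and "import keras" statements. Only the TensorBoard import is confirmed above, so check the result by hand.

```python
import re

# "from keras" at the start of a line, followed by "." or whitespace,
# so that unrelated packages such as "keras_tuner" are left untouched.
_FROM_KERAS = re.compile(r"^([ \t]*)from keras(?=[.\s])", re.MULTILINE)
# A bare "import keras" on its own line.
_IMPORT_KERAS = re.compile(r"^([ \t]*)import keras[ \t]*$", re.MULTILINE)


def patch_keras_imports(source):
    """Rewrite standalone-keras imports in source text to use tf.keras."""
    source = _FROM_KERAS.sub(r"\1from tensorflow.keras", source)
    source = _IMPORT_KERAS.sub(r"\1from tensorflow import keras", source)
    return source
```

To use it, read the text of callbacks.py and core.py from your installed copy of tensorboardcolab, pass each through patch_keras_imports, and write the result back. The function is idempotent, so running it twice leaves already patched files unchanged.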