tensor.shape returns a list of None values when using tf.keras

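I have a function that takes a tensor and computes a num_classes variable from that tensor's shape, using the expression below:

num_classes = tensor.shape[4] - 5

If I call this function on its own with random input, it works fine. However, the function is part of the logic that computes some metrics while the model is evaluated on the validation data after each epoch, and there it fails with this error: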

File "train.py", line 142, in <module>
    main()
  File "train.py", line 120, in main
    train(input_size,
  File "train.py", line 81, in train
    face_detector.fit(train_data_generator ,
  File "/home/yogeesh/yogeesh/tf2/lib/python3.8/site-packages/keras/engine/training.py", line 1215, in fit
    val_logs = self.evaluate(
  File "/home/yogeesh/yogeesh/tf2/lib/python3.8/site-packages/keras/engine/training.py", line 1501, in evaluate
    tmp_logs = self.test_function(iterator)
  File "/home/yogeesh/yogeesh/tf2/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 885, in __call__
    result = self._call(*args, **kwds)
  File "/home/yogeesh/yogeesh/tf2/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 933, in _call
    self._initialize(args, kwds, add_initializers_to=initializers)
  File "/home/yogeesh/yogeesh/tf2/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 759, in _initialize
    self._stateful_fn._get_concrete_function_internal_garbage_collected(  # pylint: disable=protected-access
  File "/home/yogeesh/yogeesh/tf2/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 3066, in _get_concrete_function_internal_garbage_collected
    graph_function, _ = self._maybe_define_function(args, kwargs)
  File "/home/yogeesh/yogeesh/tf2/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 3463, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/home/yogeesh/yogeesh/tf2/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 3298, in _create_graph_function
    func_graph_module.func_graph_from_py_func(
  File "/home/yogeesh/yogeesh/tf2/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py", line 1007, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/home/yogeesh/yogeesh/tf2/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 668, in wrapped_fn
    out = weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/home/yogeesh/yogeesh/tf2/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py", line 994, in wrapper
    raise e.ag_error_metadata.to_exception(e)
TypeError: in user code

    /home/yogeesh/yogeesh/tf2/lib/python3.8/site-packages/keras/engine/training.py:1330 test_function  *
        return step_function(self, iterator)
    /home/yogeesh/yogeesh/object_detection/Yolov3_tf2/metrics/mAP.py:102 update_state  *
        box_objects = tf_postprocessing.post_process(predictions ,
    /home/yogeesh/yogeesh/object_detection/Yolov3_tf2/postprocessing/tf_postprocessing.py:137 post_process  *
        all_gt = modify_locs(ground_truth , scale_anchors , gt = True)
    /home/yogeesh/yogeesh/object_detection/Yolov3_tf2/postprocessing/tf_postprocessing.py:35 modify_locs  *
        modified_loc = pp_utils.modify_locs_util(localizations , this_scale_anchor , ground_truth = gt)
    /home/yogeesh/yogeesh/object_detection/Yolov3_tf2/postprocessing/tf_utils.py:20 modify_locs_util  *
        num_classes = localizations.shape[4] - 5

    TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'

2021-10-21 18:33:00.783103: W tensorflow/core/kernels/data/generator_dataset_op.cc:107] Error occurred when finalizing GeneratorDataset iterator: Failed precondition: Python interpreter state is not initialized. The process may be terminated.
     [[{{node PyFunc}}]]

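This happens only during training with the tf.keras.Model.fit function, when my overridden test_step function runs inside the "evaluate" call.

This is the function where the error occurs.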

def modify_locs_util(localizations , anchors , img_shape = [416, 416] , ground_truth = False):
    # localizations.shape : [batch_size , grid_size , grid_size , 3 , 7] (for this dataset)
    #  where grid_size can be 13,26,52 (Yolov3 model).
    locs_shape = tf.shape(localizations)
    grid_shape = locs_shape[1:3]
    num_anchors = locs_shape[3]
    num_classes = locs_shape[4] - 5
    strides = [img_shape[0] // grid_shape[0], img_shape[1] // grid_shape[1]]
    cell_grid = comman_utils.gen_cell_grid(grid_shape[0] , grid_shape[1] , num_anchors)

Strangely, if I print the shape of localizations, this is what I get:

(None, 13, 13, 3, 7)
(None, 26, 26, 3, 7)
(None, 52, 52, 3, 7)
(None, None, None, None, None)

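As you can see, the shapes are fine for the first three calls, but I don't know why the function is called a fourth time (it should be called only three times), and on that call every dimension is None. It almost looks as if an architecture check runs first to work out the shapes, but even then the static dims should not be None.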

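Maybe try the following code:

num_classes = tf.shape(tensor)[4] - 5

tensor.shape is the static shape, which is fixed when the function is traced into a graph and reports any unknown dimension as None. tf.shape(tensor) instead returns a tensor holding the dynamic shape, so the value is computed when the graph actually runs, that is, when the fit(*) method is called.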
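To make the difference concrete, here is a minimal sketch, not the asker's actual code. It assumes a hypothetical function count_classes that is traced with every dimension unknown, which imitates the (None, None, None, None, None) case from the question. The list seen_static_dims is also an illustration-only name; it records what tensor.shape[4] looks like while the graph is being built.

```python
import tensorflow as tf

# Illustration-only container: filled while the function is being traced.
seen_static_dims = []

# The input signature leaves every dimension unknown, imitating the
# (None, None, None, None, None) shape reported in the question.
@tf.function(input_signature=[tf.TensorSpec([None] * 5, tf.float32)])
def count_classes(localizations):
    # Static shape: a Python value fixed at trace time; unknown dims are None,
    # so "localizations.shape[4] - 5" would raise the TypeError seen above.
    seen_static_dims.append(localizations.shape[4])
    # Dynamic shape: an int32 tensor evaluated when the graph runs.
    return tf.shape(localizations)[4] - 5

result = count_classes(tf.zeros([2, 13, 13, 3, 7]))
print(seen_static_dims[0])  # static dim seen during tracing
print(int(result))          # value computed from the dynamic shape
```

With the 7-channel input used here, the static dimension recorded during tracing is None, while the dynamic computation yields 2. Note that the subtraction uses the integer 5; a float literal such as 5. would not combine with the int32 tensor returned by tf.shape.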