Tensorflow Embedding layer followed by Dense gives shape error

I am trying to convert this old Tensorflow 1 code to TF2 using keras:

import math

import tensorflow as tf

def init_net(batch_size=256, num_feats=30, hidden_size=100):
  with tf.name_scope('network'):
    with tf.name_scope('inputs'):
        
        inputs = tf.placeholder(tf.int32, shape=[batch_size, ], name='inputs')
        labels = tf.placeholder(tf.int32, shape=[batch_size, ], name='labels')

        embeddings = tf.Variable(
            tf.random_uniform([len(NODE_MAP), num_feats]), name='embeddings'
        )

        embed = tf.nn.embedding_lookup(embeddings, inputs)
        onehot_labels = tf.one_hot(labels, len(NODE_MAP), dtype=tf.float32)

    with tf.name_scope('hidden'):
        weights = tf.Variable(
            tf.truncated_normal(
                [num_feats, hidden_size], stddev=1.0 / math.sqrt(num_feats)
            ),
            name='weights'
        )

        biases = tf.Variable(
            tf.zeros((hidden_size,)),
            name='biases'
        )

        hidden = tf.tanh(tf.matmul(embed, weights) + biases)

    with tf.name_scope('softmax'):
        weights = tf.Variable(
            tf.truncated_normal(
                [hidden_size, len(NODE_MAP)],
                stddev=1.0 / math.sqrt(hidden_size)
            ),
            name='weights'
        )
        biases = tf.Variable(
            tf.zeros((len(NODE_MAP),)), name='biases'
        )

        logits = tf.matmul(hidden, weights) + biases

    with tf.name_scope('error'):
        cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(
            labels=onehot_labels, logits=logits, name='cross_entropy'
        )

        loss = tf.reduce_mean(cross_entropy, name='cross_entropy_mean')

  return inputs, labels, embeddings, loss

NODE_MAP here is the vocabulary. The network is supposed to learn a programming language. My version is below:

network = tf.keras.models.Sequential()

embedding_layer = tf.keras.layers.Embedding(
    input_dim=len(NODE_MAP),
    output_dim=30,
    input_length=256,
    embeddings_initializer=tf.keras.initializers.RandomUniform()
)
network.add(embedding_layer)

hidden_layer = tf.keras.layers.Dense(100, activation='tanh')
network.add(hidden_layer)

softmax_layer = tf.keras.layers.Softmax()
network.add(softmax_layer)

network.compile(optimizer='SGD', loss='categorical_crossentropy')

But this code raises a "ValueError: Shapes (None, 256) and (None, 256, 100) are incompatible" error. If I add an extra Flatten layer between Embedding and Dense, the error becomes "ValueError: Shapes (None, 256) and (None, 100) are incompatible". If I then change the number of units in the Dense layer from 100 to 256, the network starts to run, but does not learn (training does not improve the accuracy). What am I missing?
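For reference, the rank-3 output behind the first error can be reproduced directly; `vocab_size` below is a stand-in for `len(NODE_MAP)`:

```python
import numpy as np
import tensorflow as tf

vocab_size = 50  # stand-in for len(NODE_MAP)

# Embedding turns (batch, sequence) integer indices into
# (batch, sequence, output_dim); a following Dense acts only on the
# last axis, so the sequence axis survives into the output.
demo = tf.keras.models.Sequential([
    tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=30),
    tf.keras.layers.Dense(100, activation='tanh'),
])

out = demo(np.zeros((4, 256), dtype='int32'))
print(tuple(out.shape))  # (4, 256, 100): rank 3, which the loss cannot
                         # match against rank-2 one-hot targets
```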

Change the loss function to sparse_categorical_crossentropy:

network.compile(optimizer='SGD', loss='sparse_categorical_crossentropy')
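Put together, a minimal sketch of a working setup. This assumes, as in the original TF1 graph, that each example is a single token id with a single integer label; `vocab_size` again stands in for `len(NODE_MAP)`, and a final Dense layer over the vocabulary replaces the bare Softmax layer so the output matches the labels:

```python
import numpy as np
import tensorflow as tf

vocab_size = 50  # stand-in for len(NODE_MAP); adjust to your vocabulary

# Mirrors the TF1 graph: embedding lookup -> tanh hidden -> softmax over vocab
network = tf.keras.models.Sequential([
    tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=30,
                              embeddings_initializer='uniform'),
    tf.keras.layers.Dense(100, activation='tanh'),
    tf.keras.layers.Dense(vocab_size, activation='softmax'),
])

# sparse_categorical_crossentropy consumes raw integer labels,
# so the one-hot step from the TF1 code is no longer needed
network.compile(optimizer='SGD', loss='sparse_categorical_crossentropy')

x = np.random.randint(0, vocab_size, size=(256,))  # one token id per example
y = np.random.randint(0, vocab_size, size=(256,))  # integer labels
network.fit(x, y, batch_size=256, epochs=1, verbose=0)
print(network.predict(x, verbose=0).shape)  # (256, 50): one distribution per token
```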