Custom keras layer similar to dot product

I want to make a custom Keras layer that behaves like a dot product, but not exactly. I have an input of shape

(None, 10, 18, 32)

and I want to get an output of shape

(None, 18, 32)

I want a single weight vector (one weight per row, so shape (10, 1) or (1, 10)). The input should be multiplied by this weight vector so that each row is scaled by its weight, and the rows are then summed. My objective with this layer is to learn one weight per row, so I can tell which rows are more important (those should end up with larger weights).
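In other words, what I'm after is a weighted sum over the 10 rows. A minimal NumPy sketch of the math (shapes and names are only for illustration):

import numpy as np

x = np.random.rand(4, 10, 18, 32)          # (batch, rows, 18, 32)
w = np.random.rand(10)                     # one weight per row

# scale every row by its weight, then sum over the row axis:
out = np.tensordot(w, x, axes=([0], [1]))  # -> (4, 18, 32)
# equivalently: (w[None, :, None, None] * x).sum(axis=1)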

The size-10 dimension is fixed (I have a specific matrix encoded that way), but the size-18 dimension depends on the topology of the network.

How can I write this with Keras? Also, can I put constraints on these weights? If possible, I would like the weights to be non-negative and smaller than one.
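For the constraint part, Keras weight constraints (applied after every optimizer update) look like one option. A minimal sketch assuming tf.keras, with a hypothetical ClipToUnitInterval class:

import tensorflow as tf

class ClipToUnitInterval(tf.keras.constraints.Constraint):
    # hypothetical constraint: project weights into [0, 1] after each update
    def __call__(self, w):
        return tf.clip_by_value(w, 0.0, 1.0)

# passed when the weight is created, e.g. inside a layer's build():
# self.w = self.add_weight(shape=(10, 1),
#                          initializer="random_normal",
#                          constraint=ClipToUnitInterval(),
#                          trainable=True)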

Edit, to share what I'm trying to do. This is the layer I created:

import tensorflow as tf
import keras
from keras import backend as K

class Linear(keras.layers.Layer):
    def __init__(self, units=1, input_dim=10):
        super(Linear, self).__init__()
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(
            initial_value=w_init(shape=(units, input_dim), dtype="float32"),
            trainable=True,
        )

    def call(self, inputs):
        # (None, 10, 18, 32) -> (None, 18, 10, 32)
        A = K.permute_dimensions(inputs, (0, 2, 1, 3))
        # contract the (1, 10) weight with the row axis
        A = K.dot(self.w, A)
        A = K.squeeze(A, 0)
        return A

But I get this summary (I don't understand how, with the code I wrote, the 10 stays in the output dimensions, but never mind):

linear_24 (Linear)           (None, 10, 18, 32)        10        

And of course, since I'm doing something wrong, I get the following error:

InvalidArgumentError:  Matrix size-incompatible: In[0]: [32,576], In[1]: [5760,100]
     [[node dense_9/MatMul (defined at /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:3009) ]] [Op:__inference_keras_scratch_graph_62482]

Function call stack:
keras_scratch_graph

My model:

from keras.models import Sequential
from keras.layers import Conv2D, Dropout, MaxPooling2D, Flatten, Dense, InputLayer, BatchNormalization
from keras.callbacks import LearningRateScheduler

n_outputs = 5
batch_size = 64


model = Sequential()
model.add(InputLayer(input_shape=(nbld, resolution, 1)))

model.add(Dropout(0.25))
model.add(Conv2D(filters=64, kernel_size=3, activation='relu', padding='same', input_shape=(nbld, resolution, 1)))
model.add(MaxPooling2D(pool_size=(1,2)))


model.add(Conv2D(filters=32, kernel_size=3, activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(1,2)))

model.add(Dropout(0.25))
model.add(Conv2D(filters=32, kernel_size=2, activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(1,2)))

model.add(BatchNormalization())

model.add(CustomLayer())
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(n_outputs, activation='softmax', name="visualized_layer"))
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])

model.summary()
change_lr = LearningRateScheduler(scheduler)
history = model.fit(x=X_train, y=y_train, epochs=10, validation_data=(X_test, y_test), callbacks=[change_lr])

Model summary:

Model: "sequential_29"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dropout_85 (Dropout)         (None, 10, 150, 1)        0         
_________________________________________________________________
conv2d_85 (Conv2D)           (None, 10, 150, 64)       640       
_________________________________________________________________
max_pooling2d_85 (MaxPooling (None, 10, 75, 64)        0         
_________________________________________________________________
conv2d_86 (Conv2D)           (None, 10, 75, 32)        18464     
_________________________________________________________________
max_pooling2d_86 (MaxPooling (None, 10, 37, 32)        0         
_________________________________________________________________
dropout_86 (Dropout)         (None, 10, 37, 32)        0         
_________________________________________________________________
conv2d_87 (Conv2D)           (None, 10, 37, 32)        4128      
_________________________________________________________________
max_pooling2d_87 (MaxPooling (None, 10, 18, 32)        0         
_________________________________________________________________
batch_normalization_29 (Batc (None, 10, 18, 32)        128       
_________________________________________________________________
custom_layer_2 (CustomLayer) (None, 10, 18, 32)        10        
_________________________________________________________________
flatten_11 (Flatten)         (None, 5760)              0         
_________________________________________________________________
dense_11 (Dense)             (None, 100)               576100    
_________________________________________________________________
visualized_layer (Dense)     (None, 5)                 505       
=================================================================
Total params: 599,975
Trainable params: 599,911
Non-trainable params: 64
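The summary above already hints at the problem: multi-backend Keras assumes a custom layer preserves its input shape unless compute_output_shape is overridden, so the graph is built as if the layer emitted (None, 10, 18, 32), while the tensor it actually produces is (None, 18, 32). A quick check of where the numbers in the error come from (assuming the shapes in the summary):

declared = 10 * 18 * 32  # 5760: what Flatten and the first Dense were built for
actual = 18 * 32         # 576:  what the layer really outputs at run time
# the Dense kernel is [5760, 100]; the runtime matmul receives [batch=32, 576]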

Here is one possibility with a simple custom layer:

import tensorflow as tf
from tensorflow.keras.layers import Layer

class CustomLayer(Layer):

    def __init__(self):
        super(CustomLayer, self).__init__()

    def build(self, input_shape):
        # one weight per row: shape (1, n_rows, 1, 1) broadcasts over the input
        self.W = self.add_weight(name="custom_weight",
                                 shape=(1, input_shape[1], 1, 1),
                                 initializer="normal")

    def call(self, x):
        # the softmax makes the row weights non-negative and sum to one
        # (so each is less than one); ignore or change it if you don't want that
        x = tf.nn.softmax(self.W, axis=1) * x
        x = tf.reduce_sum(x, axis=1)  # (batch, 10, 18, 32) -> (batch, 18, 32)
        return x

How the layer works:

import numpy as np

batch_dim = 32
X = np.random.uniform(0, 1, (batch_dim, 10, 18, 32)).astype(np.float32)

CustomLayer()(X).shape  # (batch_dim, 18, 32)

Example in a model:

import numpy as np
from tensorflow.keras.layers import Input, Conv2D, Flatten, Dense
from tensorflow.keras.models import Model

X = np.random.uniform(0, 1, (5, 10, 18, 3))
y = np.random.uniform(0, 1, 5)

inp = Input((10, 18, 3))
x = Conv2D(32, 3, padding='same', activation='relu')(inp)
x = CustomLayer()(x)
x = Flatten()(x)
out = Dense(1)(x)

m = Model(inp, out)
m.compile('adam', 'mse')
m.fit(X, y, epochs=3)

# get the learned row weights (softmax-normalized); [-3] is the CustomLayer's W
tf.nn.softmax(m.get_weights()[-3], axis=1)
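Equivalently, assuming the layer order above, you can grab the weight through the layer object instead of indexing into get_weights() positionally:

custom = m.layers[2]             # InputLayer, Conv2D, CustomLayer, ...
tf.nn.softmax(custom.W, axis=1)  # the same (1, 10, 1, 1) tensor of row weights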