What should the input types be for Earth Mover Loss when images are rated in decimals from 0 to 9 (Keras, TensorFlow)?

Question

I am trying to implement Google's NIMA research paper, in which they rate image quality. I am using the TID2013 dataset. I have 3000 images, each with a score from 0.00 to 9.00.

df.head()
>>
Image Name          Score
0   I01_01_1.bmp    5.51429
1   i01_01_2.bmp    5.56757
2   i01_01_3.bmp    4.94444
3   i01_01_4.bmp    4.37838
4   i01_01_5.bmp    3.86486

I found the loss function code given below:

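from tensorflow.keras import backend as K

def earth_mover_loss(y_true, y_pred):
    # Cumulative distributions over the score bins (last axis)
    cdf_true = K.cumsum(y_true, axis=-1)
    cdf_pred = K.cumsum(y_pred, axis=-1)
    # Root-mean-square distance between the two CDFs, one value per sample
    emd = K.sqrt(K.mean(K.square(cdf_true - cdf_pred), axis=-1))
    return K.mean(emd)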
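
As a rough illustration of what this function computes, here is a NumPy mirror of the same arithmetic. It is only a sketch for checking values by hand, not the training loss itself; the name `emd_numpy` and the one-hot example rows are illustrative assumptions. Both arguments are arrays of shape (batch, 10), with one probability per score bin.

```python
import numpy as np

def emd_numpy(y_true, y_pred):
    """NumPy mirror of earth_mover_loss, for checking values by hand."""
    cdf_true = np.cumsum(y_true, axis=-1)
    cdf_pred = np.cumsum(y_pred, axis=-1)
    # One value per sample: RMS distance between the two CDFs
    per_sample = np.sqrt(np.mean(np.square(cdf_true - cdf_pred), axis=-1))
    return per_sample.mean()

# Hypothetical batch of one image: all mass on bin 2 vs. all mass on bin 5.
y_true = np.eye(10)[[2]]   # shape (1, 10)
y_pred = np.eye(10)[[5]]   # shape (1, 10)
loss = emd_numpy(y_true, y_pred)
print(loss)
```

In this example the two CDFs differ in 3 of the 10 bins, so the value is sqrt(3/10), roughly 0.548.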

And I wrote the code for building the model as:

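from tensorflow.keras.applications import InceptionResNetV2
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# W and H are the input image width and height (not shown in the question)
base_model = InceptionResNetV2(input_shape=(W, H, 3), include_top=False, pooling='avg', weights='imagenet')
for layer in base_model.layers:
    layer.trainable = False  # freeze the pretrained backbone

x = Dropout(0.45)(base_model.output)
out = Dense(10, activation='softmax')(x)  # there are 10 classes

model = Model(base_model.input, out)
optimizer = Adam(learning_rate=0.001)  # `lr` is the older name for this argument
model.compile(optimizer, loss=earth_mover_loss)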

Problem: when I use ImageDataGenerator as:

from tensorflow.keras.applications.inception_resnet_v2 import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator

gen = ImageDataGenerator(validation_split=0.15, preprocessing_function=preprocess_input)

train = gen.flow_from_dataframe(df, TRAIN_PATH, x_col='Image Name', y_col='Score', subset='training', class_mode='sparse')

val = gen.flow_from_dataframe(df, TRAIN_PATH, x_col='Image Name', y_col='Score', subset='validation', class_mode='sparse')

It either raises an error during training or gives a loss value of nan.

I have tried a few things:

Creating the score as rounded = math.round(score) and using class_mode=sparse
Creating the score as str(rounded) and then using class_mode=categorical

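But I get an error every time.

Please help me load the images with ImageDataGenerator and understand how I should feed them into this model.

The model structure should not change.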

Answer 1

Based on what has been presented, I have a couple of ideas about the ...

I think your loss is nan because sqrt is being computed on a negative number, which is not allowed. So there are two possibilities:

Clip the values before applying sqrt. This way, every value <= 0 is replaced with a small epsilon.

def earth_mover_loss(y_true, y_pred):
    cdf_true = K.clip(K.cumsum(y_true, axis=-1), 0,1)
    cdf_pred = K.clip(K.cumsum(y_pred, axis=-1), 0,1)
    emd = K.sqrt(K.maximum(K.mean(K.square(cdf_true - cdf_pred), axis=-1), K.epsilon()))
    return K.mean(emd)

Drop the sqrt, so that the Earth Mover Loss becomes more like an MSE between the CDFs.

def earth_mover_loss(y_true, y_pred):
    cdf_true = K.clip(K.cumsum(y_true, axis=-1), 0,1)
    cdf_pred = K.clip(K.cumsum(y_pred, axis=-1), 0,1)
    emd = K.mean(K.square(cdf_true - cdf_pred), axis=-1)
    return K.mean(emd)
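
On the question of input types: both loss variants take a cumulative sum over the last axis of y_true, so the labels most likely need to be 10-bin distributions with the same shape as the softmax output, rather than a single number per image. Below is a minimal sketch of one possible encoding. It assumes a decimal score can be split linearly between its two neighbouring integer bins so that the expected value of the distribution equals the score. This encoding and the name `score_to_distribution` are assumptions for illustration, not something the question or the NIMA paper prescribes.

```python
import numpy as np

def score_to_distribution(score, n_bins=10):
    """Spread a decimal score in [0, n_bins - 1] over its two nearest bins.

    Assumed encoding: the weights are chosen so that
    sum(bin_index * weight) equals the (clipped) score.
    """
    score = float(np.clip(score, 0, n_bins - 1))
    lo = int(np.floor(score))
    frac = score - lo
    dist = np.zeros(n_bins, dtype=np.float32)
    dist[lo] = 1.0 - frac
    if lo + 1 < n_bins:
        dist[lo + 1] = frac
    return dist

# Scores taken from the df.head() output in the question
scores = np.array([5.51429, 5.56757, 4.94444, 4.37838, 3.86486])
targets = np.stack([score_to_distribution(s) for s in scores])
print(targets.shape)  # one 10-bin row per image
```

If the ten values are stored as ten DataFrame columns, `flow_from_dataframe` can return them unchanged by passing the list of column names as `y_col` together with `class_mode='raw'`.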


Tags: python, image-recognition, deep-learning, keras, tensorflow