Trouble implementing "concurrent" softmax function from paper (PyTorch)
I'm trying to implement the so-called 'concurrent' softmax function given in the paper "Large-Scale Object Detection in the Wild from Imbalanced Multi-Labels". The concurrent softmax is defined as:

    σ(z)_i = exp(z_i) / ( Σ_j (1 − y_j)(1 − r_ij) · exp(z_j) + exp(z_i) )

NOTE: I have left the (1 − r_ij) term out for the time being, because I don't think it applies to my problem given that my training dataset has a different type of labeling compared to the paper.
To keep things simple for myself, I first implemented it in a very inefficient but easy-to-understand way using for loops. However, the output I'm getting seems wrong to me. Here is the code I used:
import torch

# here is a one-hot encoded vector for the multi-label classification;
# the image thus has 2 correct labels out of a possible 3 classes
y = [0, 1, 1]

# these are some made-up logits that might come from the network
vec = torch.tensor([0.2, 0.9, 0.7])

def concurrent_softmax(vec, y):
    for i in range(len(vec)):
        zi = torch.exp(vec[i])
        sum_over_j = 0
        for j in range(len(y)):
            sum_over_j += (1 - y[j]) * torch.exp(vec[j])
        out = zi / (sum_over_j + zi)
        yield out

for result in concurrent_softmax(vec, y):
    print(result)
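Running this prints roughly tensor(0.5000), tensor(0.6682) and tensor(0.6225).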
From this implementation I realized that no matter what value I give the first logit in 'vec', I always get an output of 0.5 (because it essentially always computes zi / (zi + zi)). This seems like a major problem, because I would expect the values of the logits to have some influence on the resulting concurrent-softmax values. So is there a problem with my implementation, or is this behaviour of the function correct and is there something I don't understand about the theory?
This is the expected behaviour, given that y[i] = 1 for all other i.
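Concretely, with y = [0, 1, 1], the (1 − y[j]) factor zeroes out every term of the sum except the j = 0 one, so for i = 0:

    out_0 = exp(z_0) / (exp(z_0) + exp(z_0)) = 1/2

no matter what value z_0 takes. A class's logit only cancels out of its own score like this when every other class is labelled positive; for any y containing more than one zero, the outputs do depend on the logits.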
Note that you can simplify the summation using a dot product:
y = torch.tensor(y)

def concurrent_softmax(z, y):
    # the sum over j is identical for every i, so compute it once up front
    sum_over_j = torch.dot(torch.ones(len(y)) - y, torch.exp(z))
    for zi in z:
        numerator = torch.exp(zi)
        denominator = sum_over_j + numerator
        yield numerator / denominator
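Going one step further, the per-class loop can be dropped as well. Here is a fully vectorized sketch (my own variant, not from the paper, assuming z is a 1-D float tensor of logits and y is a 0/1 label tensor of the same length):

import torch

def concurrent_softmax_vec(z, y):
    # exponentiate every logit at once
    exp_z = torch.exp(z)
    # masked sum over j: a single scalar shared by every class i
    sum_over_j = torch.dot(1.0 - y.float(), exp_z)
    # each class's denominator differs only by its own exp(z_i) term
    return exp_z / (sum_over_j + exp_z)

print(concurrent_softmax_vec(torch.tensor([0.2, 0.9, 0.7]),
                             torch.tensor([0, 1, 1])))

This returns the whole output vector in one shot and matches the generator's values above.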