Define a loss function in Keras using itertools
I want to define a loss function that represents the distance between the points output by a hidden layer. First, I wrote it without Keras:
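import numpy as np
import itertools
pts = np.array([
    [10, 10, 10],
    [10, 11, 20],
    [20, 11, 30],
    [20, 10, 10],
    [10, 10, 20],
])
# All unordered pairs of points
diff = list(itertools.combinations(pts, 2))
# Euclidean distance between the two points of a pair
# (tuple parameters in a lambda are Python 2 only, so index the pair instead)
ptdiff = lambda pair: np.sqrt(np.sum((pair[0] - pair[1]) ** 2))
# map() is lazy in Python 3, so materialize it before averaging
diffs = list(map(ptdiff, diff))
np.mean(diffs)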
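That gave me a result. I then tried this loss function in Keras, where z is the output of the hidden layer, which is a matrix. The loss function is defined as: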
def vae_loss(z):
    # Fails on this line: itertools cannot iterate over a symbolic Tensor
    z_diff = list(itertools.combinations(z, 2))
    ptdiff = lambda pair: np.sqrt(np.sum((pair[0] - pair[1]) ** 2))
    z_diffs = list(map(ptdiff, z_diff))
    loss = K.mean(z_diffs)
    return loss
But it raises TypeError: 'Tensor' object is not iterable.
I would just like to know how to fix this.
Based on this very helpful question, you can make use of Keras' broadcasting properties. I'm assuming here that you run Keras on the TensorFlow backend. From the TF docs on broadcasting:
A special case arises, and is also supported, where each of the input
arrays has a degenerate dimension at a different index. In this case,
the result is an "outer operation".
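As a minimal sketch of what that "outer operation" means, here is the same idea in plain NumPy on a toy array. The data and the names a, row_view, col_view, outer and pairwise_mse are illustrative assumptions, not part of the original code.

```python
import numpy as np

# Hypothetical toy data: 3 points in a 2-D space
a = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [6.0, 8.0]])

row_view = a[np.newaxis, :, :]   # shape (1, 3, 2): degenerate axis at index 0
col_view = a[:, np.newaxis, :]   # shape (3, 1, 2): degenerate axis at index 1

# Broadcasting stretches the two degenerate axes against each other,
# so outer[i, j] holds a[j] - a[i] for every pair of rows.
outer = row_view - col_view      # shape (3, 3, 2)

# Mean squared difference for every pair of rows
pairwise_mse = np.mean(outer ** 2, axis=-1)   # shape (3, 3)
print(outer.shape, pairwise_mse.shape)
```

The resulting matrix is symmetric with zeros on its diagonal, because every row is also paired with itself and every pair appears in both orders.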
A reproducible example of your numpy code looks like this:
import numpy as np
import itertools
# Generate 1000 random points in a 5-D space
n_dim = 5
matrix = np.random.rand(1000, n_dim)
# List all possible combinations of two rows
combinations = list(itertools.combinations(matrix.tolist(), 2))
def mse(tup):
    """MSE between first and second element of a tuple of lists"""
    return np.mean((np.array(tup[0]) - np.array(tup[1]))**2)
avg_mse = np.mean([mse(c) for c in combinations])
print('Average mse: {:.3f}'.format(avg_mse))
In my case, this returns Average mse: 0.162
Following the question mentioned above, you can build your loss function as follows:
import keras.backend as K
# Wrap our random matrix into a tensor
tensor = K.constant(value=matrix)
def loss_function(x):
    # Add a degenerate axis at index 0 and at index 1 respectively
    x_ = K.expand_dims(x, axis=0)
    x__ = K.expand_dims(x, axis=1)
    # Compute mse for all combinations, making use of broadcasting
    z = K.mean(K.square(x_ - x__), axis=-1)
    # Return average mse
    return K.mean(z)
with K.get_session() as sess:
    print('Average mse: {:.3f}'.format(loss_function(tensor).eval()))
which, for me, also returns Average mse: 0.162.
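Note that this implementation does not exactly replicate the behavior of your numpy example. The difference is that all combinations of a row with itself are also considered (which is not the case with itertools.combinations), and that each combination is considered twice: mse((row1, row2)) and mse((row2, row1)) are both computed, which again is not the case in your itertools code. For matrices with a large number of rows this should not make much of a difference, as my example shows.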
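If an exact match with the itertools result matters, for example with only a handful of rows, one option is to keep only the entries above the diagonal of the pairwise matrix. Below is a minimal NumPy sketch under that assumption, using the Euclidean distance and the points from the question's first snippet. The function names are illustrative, not part of any library. Counting each pair twice does not change a mean by itself; the shift comes from the n zero-valued self-pairs, which scale the full-matrix mean by (n - 1) / n.

```python
import itertools
import numpy as np

pts = np.array([
    [10, 10, 10],
    [10, 11, 20],
    [20, 11, 30],
    [20, 10, 10],
    [10, 10, 20],
], dtype=float)

def mean_pairwise_distance_itertools(points):
    """Reference: mean Euclidean distance over all unordered pairs."""
    return np.mean([np.sqrt(np.sum((p1 - p2) ** 2))
                    for p1, p2 in itertools.combinations(points, 2)])

def mean_pairwise_distance_broadcast(points):
    """Same quantity, read off the full (n, n) distance matrix."""
    diff = points[np.newaxis, :, :] - points[:, np.newaxis, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))      # shape (n, n)
    # Strict upper triangle: each unordered pair once, no self-pairs
    i, j = np.triu_indices(len(points), k=1)
    return np.mean(dist[i, j])

n = len(pts)
reference = mean_pairwise_distance_itertools(pts)
exact = mean_pairwise_distance_broadcast(pts)

# Averaging the whole matrix instead includes the n zeros on the diagonal,
# which is the same as multiplying the exact mean by (n - 1) / n.
diff = pts[np.newaxis, :, :] - pts[:, np.newaxis, :]
full_mean = np.mean(np.sqrt(np.sum(diff ** 2, axis=-1)))

print(reference, exact, full_mean * n / (n - 1))
```

In a Keras loss, the same n / (n - 1) rescaling could be applied to the mean of the full matrix. If a square root is used inside a differentiable loss, adding a small constant such as K.epsilon() under the root is a common precaution, since the gradient of the square root is not finite at the zero entries on the diagonal.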