lightfm error: Not all estimated parameters are finite, your model may have diverged

I am running this very simple code:

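import scipy.sparse as sp
from lightfm import LightFM


def csr_values_analysis(values):
    # Count how many stored values are 0, 1, or anything else.
    num_zeros = 0
    num_ones = 0
    num_other = 0

    for v in values:
        if v == 0:
            num_zeros += 1
        elif v == 1:
            num_ones += 1
        else:
            num_other += 1

    return num_zeros, num_ones, num_other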


print("Reading user_features.npz")
with open("/path/to/user_features.npz", "rb") as in_file:
    user_features_csr = sp.load_npz(in_file)
    print("User features read, shape: {}".format(user_features_csr.shape))
    print("Data values analysis: zeros: %i, ones: %i, other: %i" % csr_values_analysis(user_features_csr.data))

print("Reading item_features.npz")
with open("/path/to/item_features.npz", "rb") as in_file:
    item_features_csr = sp.load_npz(in_file)
    print("Item features read, shape: {}".format(item_features_csr.shape))
    print("Data values analysis: zeros: %i, ones: %i, other: %i" % csr_values_analysis(item_features_csr.data))

print("Reading interactions.npz")
with open("/path/to/interactions.npz", "rb") as in_file:
    interactions_csr = sp.load_npz(in_file)
    print("Interactions read, shape: {}".format(interactions_csr.shape))
    print("Data values analysis: zeros: %i, ones: %i, other: %i" % csr_values_analysis(interactions_csr.data))
    interactions_coo = interactions_csr.tocoo()

# Run lightfm

print("Running lightfm...")
model = LightFM(loss='warp')
model.fit(interactions_coo, user_features=user_features_csr, item_features=item_features_csr, epochs=20, num_threads=2, verbose=True)

It produces the following output:

Reading user_features.npz
User features read, shape: (827568, 105)
Data values analysis: zeros: 0, ones: 3153032, other: 0
Reading item_features.npz
Item features read, shape: (67339359, 36)
Data values analysis: zeros: 0, ones: 25259081, other: 0
Reading interactions.npz
Interactions read, shape: (827568, 67339359)
Data values analysis: zeros: 0, ones: 172388, other: 0
Running lightfm...
Epoch 0
Traceback (most recent call last):
  File "training.py", line 92, in <module>
    model.fit(interactions_coo, user_features=user_features_csr, item_features=item_features_csr, epochs=20, num_threads=2, verbose=True)
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 479, in fit
    verbose=verbose)
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 578, in fit_partial
    self._check_finite()
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 413, in _check_finite
    raise ValueError("Not all estimated parameters are finite,"
ValueError: Not all estimated parameters are finite, your model may have diverged. Try decreasing the learning rate or normalising feature values and sample weights

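All my SciPy sparse matrices are normalised (i.e. the values are 0 or 1).

I tried changing the learning schedule and the learning rate, with no results.

I have checked that this only happens when I add item features to the equation. There is no error when running lightfm with interactions only, or with interactions + user features.

As far as I know, I have the latest version installed: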

$ pip freeze | grep lightfm
lightfm==1.15

Any ideas? Thanks!

Update 1

I wondered whether my sparse matrices were simply too sparse... However, I tried with tiny shapes and got the same error:

>>> import scipy.sparse as sp
>>> import numpy as np
>>> import lightfm
>>> uf_row = np.array([2,4,9])
>>> uf_col = np.array([4,9,3])
>>> uf_data = np.array([1,1,1])
>>> if_row = np.array([0,3])
>>> if_col = np.array([9,7])
>>> if_data = np.array([1,1])
>>> i_row = np.array([1])
>>> i_col = np.array([8])
>>> i_data = np.array([1])
>>> uf_csr = sp.csr_matrix((uf_data, (uf_row, uf_col)), shape=(10, 10))
>>> if_csr = sp.csr_matrix((if_data, (if_row, if_col)), shape=(10, 10))
>>> i_csr = sp.csr_matrix((i_data, (i_row, i_col)), shape=(10, 10))
>>> model = lightfm.LightFM(loss='warp')
>>> model.fit(i_csr.tocoo(), user_features=uf_csr, item_features=if_csr)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 479, in fit
    verbose=verbose)
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 578, in fit_partial
    self._check_finite()
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 413, in _check_finite
    raise ValueError("Not all estimated parameters are finite,"
ValueError: Not all estimated parameters are finite, your model may have diverged. Try decreasing the learning rate or normalising feature values and sample weights

Definitely, I am doing something wrong...

Update 2

I think I found the problem... I ran the following experiment:

>>> uf_csr = sp.csr_matrix((np.array([1]),(np.array([0]), np.array([0]))),shape=(20,20))
>>> if_csr = sp.csr_matrix((np.array([1]),(np.array([0]), np.array([0]))),shape=(20,20))
>>> i_csr = sp.csr_matrix((np.array([1]),(np.array([1]), np.array([1]))),shape=(20,20))
>>> model = lightfm.LightFM(loss='warp')
>>> model.fit(i_csr.tocoo(), user_features=uf_csr, item_features=if_csr, epochs=20)
Traceback (most recent call last):
  ...
ValueError: Not all estimated parameters are finite, your model may have diverged. Try decreasing the learning rate or normalising feature values and sample weights

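That is, I get the exception as usual. Now, if you look at the interaction matrix, its single interaction involves a user and an item (row 1 and column 1) that have all their features set to 0 in the user and item feature matrices respectively. So let's change that in the user features matrix, for instance: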

>>> uf_csr = sp.csr_matrix((np.array([1,1]),(np.array([0,1]), np.array([0,0]))),shape=(20,20))
>>> model.fit(i_csr.tocoo(), user_features=uf_csr, item_features=if_csr, epochs=20)
<lightfm.lightfm.LightFM object at 0x7f2d39ea3490>

Voilà!

We can do the same with the item features matrix:

>>> uf_csr = sp.csr_matrix((np.array([1]),(np.array([0]), np.array([0]))),shape=(20,20))
>>> if_csr = sp.csr_matrix((np.array([1,1]),(np.array([0,1]), np.array([0,0]))),shape=(20,20))
>>> model.fit(i_csr.tocoo(), user_features=uf_csr, item_features=if_csr, epochs=20)
<lightfm.lightfm.LightFM object at 0x7f2d39ea3490>

So, I will try to find a way to filter out the interactions related to all-zero user and item features, and I will post it ;)

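As I explained in my last update, the problem arises when there are users and items with all their features set to zero and, at the same time, there are interactions involving one of these users and one of these items.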
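To check whether a dataset is affected, something along these lines might help. It is only a sketch using NumPy and SciPy; the helper names `empty_feature_rows` and `risky_interactions` are made up for illustration and are not part of lightfm. It flags the interactions whose user and item both have no non-zero feature, which is the combination that triggered the error in the experiment above.

```python
import numpy as np
import scipy.sparse as sp


def empty_feature_rows(features):
    """Return a boolean mask of the rows that have no non-zero feature."""
    csr = sp.csr_matrix(features, copy=True)
    csr.eliminate_zeros()  # drop explicitly stored zeros before counting
    return csr.getnnz(axis=1) == 0


def risky_interactions(interactions, user_features, item_features):
    """Return one boolean per stored interaction: True when both the user
    and the item of that interaction have all-zero feature rows."""
    coo = sp.coo_matrix(interactions)
    no_user_features = empty_feature_rows(user_features)
    no_item_features = empty_feature_rows(item_features)
    return no_user_features[coo.row] & no_item_features[coo.col]


# Same toy matrices as in Update 2 (hypothetical variable names).
uf_csr = sp.csr_matrix((np.array([1]), (np.array([0]), np.array([0]))), shape=(20, 20))
if_csr = sp.csr_matrix((np.array([1]), (np.array([0]), np.array([0]))), shape=(20, 20))
i_csr = sp.csr_matrix((np.array([1]), (np.array([1]), np.array([1]))), shape=(20, 20))

mask = risky_interactions(i_csr, uf_csr, if_csr)
print("Interactions with no user and no item features:", int(mask.sum()))
```

The mask is aligned with the `row`, `col` and `data` arrays of the COO form of the interaction matrix, so it could also be used to drop the flagged entries if removing them is acceptable.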

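That said, my first idea was to remove the interactions related to these users and items, but this could affect the recommendations made to these users, or the recommendations of these items.

So a different solution could be to extend the user and item feature matrices with a diagonal (identity) matrix, so that every user and every item has at least one feature (the user or item itself) set to 1: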

0 0 0        0 0 0 1 0 0
0 1 0   -->  0 1 0 0 1 0
1 0 0        1 0 0 0 0 1
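A minimal sketch of that extension with SciPy, assuming the feature matrices are sparse with one row per user (or item). The helper name `add_identity_features` is hypothetical, not a lightfm function.

```python
import numpy as np
import scipy.sparse as sp


def add_identity_features(features):
    """Append one indicator column per row, so no row is left all-zero."""
    features = sp.csr_matrix(features)
    n_rows = features.shape[0]
    eye = sp.identity(n_rows, dtype=features.dtype, format="csr")
    # Original features first, then the identity block, as in the diagram.
    return sp.hstack([features, eye], format="csr")


# The 3x3 example from the diagram above.
example = sp.csr_matrix(np.array([[0, 0, 0],
                                  [0, 1, 0],
                                  [1, 0, 0]], dtype=np.float32))
extended = add_identity_features(example)
print(extended.toarray())
```

The extended matrices would then be passed as `user_features` and `item_features` to `model.fit`. Note that this adds one feature per user and per item, so with tens of millions of items the number of item feature embeddings grows accordingly, which may be costly in memory.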