lightfm error: Not all estimated parameters are finite, your model may have diverged

Question

I am running this very simple code:

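import scipy.sparse as sp
from lightfm import LightFM


def csr_values_analysis(values):
    """Count how many stored values are 0, 1, or anything else."""
    num_zeros = 0
    num_ones = 0
    num_other = 0

    for v in values:
        if v == 0:
            num_zeros += 1
        elif v == 1:
            num_ones += 1
        else:
            num_other += 1

    return num_zeros, num_ones, num_other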


print("Reading user_features.npz")
with open("/path/to/user_features.npz", "rb") as in_file:
    user_features_csr = sp.load_npz(in_file)
    print("User features read, shape: {}".format(user_features_csr.shape))
    print("Data values analysis: zeros: %i, ones: %i, other: %i" % csr_values_analysis(user_features_csr.data))

print("Reading item_features.npz")
with open("/path/to/item_features.npz", "rb") as in_file:
    item_features_csr = sp.load_npz(in_file)
    print("Item features read, shape: {}".format(item_features_csr.shape))
    print("Data values analysis: zeros: %i, ones: %i, other: %i" % csr_values_analysis(item_features_csr.data))

print("Reading interactions.npz")
with open("/path/to/interactions.npz", "rb") as in_file:
    interactions_csr = sp.load_npz(in_file)
    print("Interactions read, shape: {}".format(interactions_csr.shape))
    print("Data values analysis: zeros: %i, ones: %i, other: %i" % csr_values_analysis(interactions_csr.data))
    interactions_coo = interactions_csr.tocoo()

# Run lightfm

print("Running lightfm...")
model = LightFM(loss='warp')
model.fit(interactions_coo, user_features=user_features_csr, item_features=item_features_csr, epochs=20, num_threads=2, verbose=True)

It produces the following output:

Reading user_features.npz
User features read, shape: (827568, 105)
Data values analysis: zeros: 0, ones: 3153032, other: 0
Reading item_features.npz
Item features read, shape: (67339359, 36)
Data values analysis: zeros: 0, ones: 25259081, other: 0
Reading interactions.npz
Interactions read, shape: (827568, 67339359)
Data values analysis: zeros: 0, ones: 172388, other: 0
Running lightfm...
Epoch 0
Traceback (most recent call last):
  File "training.py", line 92, in <module>
    model.fit(interactions_coo, user_features=user_features_csr, item_features=item_features_csr, epochs=20, num_threads=2, verbose=True)
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 479, in fit
    verbose=verbose)
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 578, in fit_partial
    self._check_finite()
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 413, in _check_finite
    raise ValueError("Not all estimated parameters are finite,"
ValueError: Not all estimated parameters are finite, your model may have diverged. Try decreasing the learning rate or normalising feature values and sample weights

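All my SciPy sparse matrices are normalized (i.e. every value is 0 or 1).

I tried changing the learning schedule and the learning rate, with no effect.

I have verified that this only happens when I add the item features to the model. Running lightfm with interactions only, or with interactions + user features, raises no error.

As far as I can tell, I have the latest version installed: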

$ pip freeze | grep lightfm
lightfm==1.15

Any ideas? Thanks!

Update 1

I wondered whether my sparse matrices were simply too sparse... However, I tried with very small shapes and got the same error:

>>> import scipy.sparse as sp
>>> import numpy as np
>>> import lightfm
>>> uf_row = np.array([2,4,9])
>>> uf_col = np.array([4,9,3])
>>> uf_data = np.array([1,1,1])
>>> if_row = np.array([0,3])
>>> if_col = np.array([9,7])
>>> if_data = np.array([1,1])
>>> i_row = np.array([1])
>>> i_col = np.array([8])
>>> i_data = np.array([1])
>>> uf_csr = sp.csr_matrix((uf_data, (uf_row, uf_col)), shape=(10, 10))
>>> if_csr = sp.csr_matrix((if_data, (if_row, if_col)), shape=(10, 10))
>>> i_csr = sp.csr_matrix((i_data, (i_row, i_col)), shape=(10, 10))
>>> model = lightfm.LightFM(loss='warp')
>>> model.fit(i_csr.tocoo(), user_features=uf_csr, item_features=if_csr)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 479, in fit
    verbose=verbose)
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 578, in fit_partial
    self._check_finite()
  File "/usr/lib64/python3.6/site-packages/lightfm/lightfm.py", line 413, in _check_finite
    raise ValueError("Not all estimated parameters are finite,"
ValueError: Not all estimated parameters are finite, your model may have diverged. Try decreasing the learning rate or normalising feature values and sample weights

Clearly, I am doing something wrong...

Update 2

I think I found the problem... I ran the following experiment:

>>> uf_csr = sp.csr_matrix((np.array([1]),(np.array([0]), np.array([0]))),shape=(20,20))
>>> if_csr = sp.csr_matrix((np.array([1]),(np.array([0]), np.array([0]))),shape=(20,20))
>>> i_csr = sp.csr_matrix((np.array([1]),(np.array([1]), np.array([1]))),shape=(20,20))
>>> model = lightfm.LightFM(loss='warp')
>>> model.fit(i_csr.tocoo(), user_features=uf_csr, item_features=if_csr, epochs=20)
Traceback (most recent call last):
  ...
ValueError: Not all estimated parameters are finite, your model may have diverged. Try decreasing the learning rate or normalising feature values and sample weights

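That is, I get the exception as usual. Now, if you look at the interaction matrix, it contains a single interaction between a user and an item (user 1 and item 1) whose rows in the user and item feature matrices are all set to 0. So let's change that in the user feature matrix, for example: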

>>> uf_csr = sp.csr_matrix((np.array([1,1]),(np.array([0,1]), np.array([0,0]))),shape=(20,20))
>>> model.fit(i_csr.tocoo(), user_features=uf_csr, item_features=if_csr, epochs=20)
<lightfm.lightfm.LightFM object at 0x7f2d39ea3490>

Voilà!

We can do the same with the item feature matrix:

>>> uf_csr = sp.csr_matrix((np.array([1]),(np.array([0]), np.array([0]))),shape=(20,20))
>>> if_csr = sp.csr_matrix((np.array([1,1]),(np.array([0,1]), np.array([0,0]))),shape=(20,20))
>>> model.fit(i_csr.tocoo(), user_features=uf_csr, item_features=if_csr, epochs=20)
<lightfm.lightfm.LightFM object at 0x7f2d39ea3490>

So I will try to find a way to filter out the interactions that involve users and items whose features are all zero, and I will post it ;)

Answer 1

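As I explained in my last update, the problem is having users and items whose features are all set to zero while, at the same time, there is an interaction involving one of those users and one of those items.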
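To check whether a data set is affected, one option is to count the interactions whose user row and item row are both empty in the feature matrices. The following is only a sketch under that assumption; the helper names `empty_rows` and `zero_feature_interactions` are made up for illustration and are not part of lightfm.

```python
import numpy as np
import scipy.sparse as sp


def empty_rows(features):
    """Boolean array: True where a row of `features` has no non-zero value."""
    m = sp.csr_matrix(features, copy=True)
    m.eliminate_zeros()  # ignore explicitly stored zeros
    return m.getnnz(axis=1) == 0


def zero_feature_interactions(interactions, user_features, item_features):
    """Mask over the stored interactions: True where both the user row and
    the item row of the feature matrices are entirely zero."""
    coo = sp.coo_matrix(interactions)
    return empty_rows(user_features)[coo.row] & empty_rows(item_features)[coo.col]


# The matrices from Update 2: a single interaction between user 1 and item 1.
uf_csr = sp.csr_matrix((np.array([1]), (np.array([0]), np.array([0]))), shape=(20, 20))
if_csr = sp.csr_matrix((np.array([1]), (np.array([0]), np.array([0]))), shape=(20, 20))
i_csr = sp.csr_matrix((np.array([1]), (np.array([1]), np.array([1]))), shape=(20, 20))

bad = zero_feature_interactions(i_csr, uf_csr, if_csr)
print("offending interactions:", int(bad.sum()))
```

A non-zero count here matches the situation that raised the ValueError in the experiments above.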

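That said, my first idea was to remove the interactions involving those users and items, but this could hurt the recommendations made to those users, or the recommendations of those items.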
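For completeness, that first idea could look roughly like the sketch below, which keeps an interaction when the user or the item has at least one non-zero feature. The function names and the toy matrices are hypothetical, and the drawback mentioned above still applies: the dropped interactions no longer contribute to training.

```python
import numpy as np
import scipy.sparse as sp


def has_features(features):
    """Boolean array: True where a row has at least one non-zero value."""
    row_sums = abs(sp.csr_matrix(features)).sum(axis=1)
    return np.asarray(row_sums).ravel() > 0


def drop_zero_feature_interactions(interactions, user_features, item_features):
    """Return a COO matrix without the interactions whose user and item
    both have an all-zero feature row."""
    coo = sp.coo_matrix(interactions)
    keep = has_features(user_features)[coo.row] | has_features(item_features)[coo.col]
    return sp.coo_matrix(
        (coo.data[keep], (coo.row[keep], coo.col[keep])), shape=coo.shape
    )


# Toy data (made up): only user 0 and item 0 have any feature set.
user_features = sp.csr_matrix(np.array([[1, 0], [0, 0], [0, 0]]))
item_features = sp.csr_matrix(np.array([[0, 1], [0, 0], [0, 0]]))
interactions = sp.coo_matrix(np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]]))

filtered = drop_zero_feature_interactions(interactions, user_features, item_features)
print(filtered.toarray())
```

In this toy case only the interaction between user 1 and item 1 is removed, because neither of them has any feature.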

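So a different solution could be to extend the user and item feature matrices with a diagonal (identity) matrix, so that at least one feature (the user him/herself) is set to 1.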

0 0 0        0 0 0 1 0 0
0 1 0   -->  0 1 0 0 1 0
1 0 0        1 0 0 0 0 1
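The extension shown above can be sketched with scipy.sparse by stacking an identity matrix to the right of the feature matrix. The function name `add_identity_features` is hypothetical; the sketch assumes the features arrive as a SciPy sparse matrix with one row per user (or item).

```python
import numpy as np
import scipy.sparse as sp


def add_identity_features(features):
    """Append one indicator column per row, so no row is entirely zero."""
    features = sp.csr_matrix(features)
    identity = sp.identity(features.shape[0], dtype=features.dtype, format="csr")
    return sp.hstack([features, identity], format="csr")


# The 3x3 example from the diagram above.
features = sp.csr_matrix(np.array([[0, 0, 0], [0, 1, 0], [1, 0, 0]]))
extended = add_identity_features(features)
print(extended.toarray())
```

Note that this adds as many feature columns as there are rows, so the number of embeddings to learn grows with the number of users or items; with the item matrix from the question that would mean 67339359 extra columns.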
