Deal with errors in parametrised sigmoid function in python

Question

I am trying to convert a set of numbers into sigmoids:

actualarray = {
    'open_cost_1':{
        'cost_matrix': [
            {'a': 24,'b': 56,'c': 78},
            {'a': 3,'b': 98,'c':1711},
            {'a': 121,'b': 12121,'c': 12989121},
        ]
    },
    'open_cost_2':{
        'cost_matrix': [
            {'a': 123,'b': 1312,'c': 1231},
            {'a': 1011,'b': 1911,'c':911},
            {'a': 1433,'b': 19829,'c': 1132},
        ]
    }
}

where each number in each list of dicts in each cost_matrix is normalised by a different sigmoid function:

import numpy as np

def apply_normalizations(costs):

    def sigmoid(b,m,v):
        return ((np.exp(b+m*v) / (1 + np.exp(b+m*v)))*2)-1 #Taken from http://web.stanford.edu/class/psych252/tutorials/Tutorial_LogisticRegression.html

    def normalize_dicts_local_sigmoid(bias, slope,lst):
        return [{key: sigmoid(bias, slope,val) for key,val in dic.items()} for dic in lst]


    for name, value in costs.items():
        if int((name.split("_")[-1]))>1:
            value['normalised_matrix_sigmoid'] = normalize_dicts_local_sigmoid(0,1,value['cost_matrix'])


apply_normalizations(actualarray)

However, when I run this, I get:

 RuntimeWarning: overflow encountered in exp
  return ((np.exp(b+m*v) / (1 + np.exp(b+m*v)))*2)-1
 RuntimeWarning: invalid value encountered in double_scalars
  return ((np.exp(b+m*v) / (1 + np.exp(b+m*v)))*2)-1

And the array becomes:

{
    'open_cost_2': {
        'cost_matrix': [
            {
                'a': 123,
                'c': 1231,
                'b': 1312
            },
            {
                'a': 1011,
                'c': 911,
                'b': 1911
            },
            {
                'a': 1433,
                'c': 1132,
                'b': 19829
            }
        ],
        'normalised_matrix_sigmoid': [
            {
                'a': 1.0,
                'c': nan,
                'b': nan
            },
            {
                'a': nan,
                'c': nan,
                'b': nan
            },
            {
                'a': nan,
                'c': nan,
                'b': nan
            }
        ]
    },
    'open_cost_1': {
        'cost_matrix': [
            {
                'a': 24,
                'c': 78,
                'b': 56
            },
            {
                'a': 3,
                'c': 1711,
                'b': 98
            },
            {
                'a': 121,
                'c': 12989121,
                'b': 12121
            }
        ]
    }
}

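Note that every cost is always greater than 0, which is why I multiply by 2 and subtract 1 in my sigmoid function.

How can I adapt this so that it does not produce this error?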

Answer 1

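As the warning states, the exponential in your implementation of the sigmoid function is overflowing. When that happens, the function returns nan: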

In [3]: sigmoid(1000, 1, 1)
/Users/warren/miniconda3/bin/ipython:2: RuntimeWarning: overflow encountered in exp
  if __name__ == '__main__':
/Users/warren/miniconda3/bin/ipython:2: RuntimeWarning: invalid value encountered in double_scalars
  if __name__ == '__main__':
Out[3]: nan
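
The nan comes about in two steps: np.exp overflows to inf once its argument exceeds roughly 709 (the limit for 64-bit floats), and the quotient inf / inf is then undefined. A minimal sketch that reproduces both steps, with the warnings silenced only to keep the output readable:

```python
import numpy as np

# Silence the overflow/invalid warnings so only the values are shown.
with np.errstate(over="ignore", invalid="ignore"):
    big = np.exp(np.float64(1001.0))   # too large for a float64, becomes inf
    ratio = big / (1 + big)            # inf / inf has no defined value, becomes nan

print(big, ratio)
```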

Instead of writing your sigmoid function in terms of exp, you can use scipy.special.expit. It handles very large arguments correctly.

In [5]: from scipy.special import expit

In [6]: def mysigmoid(b, m, v):
   ...:     return expit(b + m*v)*2 - 1
   ...: 

In [7]: mysigmoid(1000, 1, 1)
Out[7]: 1.0

Check that it returns the same values as your sigmoid function in cases where the latter does not overflow:

In [8]: sigmoid(1, 2, 3)
Out[8]: 0.99817789761119879

In [9]: mysigmoid(1, 2, 3)
Out[9]: 0.99817789761119879
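
To show how this might slot into the code from the question, here is a sketch that swaps the expit-based function into normalize_dicts_local_sigmoid and runs it on one cost_matrix. It assumes Python 3 (hence dic.items()) and leaves the rest of the structure unchanged:

```python
from scipy.special import expit

def sigmoid(b, m, v):
    # Same curve as the original, rescaled to (-1, 1); expit does not overflow.
    return expit(b + m * v) * 2 - 1

def normalize_dicts_local_sigmoid(bias, slope, lst):
    # Apply the sigmoid to every value of every dict in the list.
    return [{key: sigmoid(bias, slope, val) for key, val in dic.items()}
            for dic in lst]

# The 'open_cost_2' cost_matrix from the question.
cost_matrix = [
    {'a': 123, 'b': 1312, 'c': 1231},
    {'a': 1011, 'b': 1911, 'c': 911},
    {'a': 1433, 'b': 19829, 'c': 1132},
]

normalised = normalize_dicts_local_sigmoid(0, 1, cost_matrix)
print(normalised)
```

With bias 0 and slope 1, every cost here lies far into the saturated part of the curve, so each entry comes out as 1.0 instead of nan. Spreading the values over (0, 1) would likely need a much smaller slope, chosen to match the scale of the costs.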

See also "Numpy Pure Functions for performance, caching" for my answer to another question about the sigmoid function.
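
If SciPy is not available, one alternative (not part of the answer above) relies on the identity 2/(1 + exp(-x)) - 1 = tanh(x/2). np.tanh saturates to 1 or -1 for large arguments rather than overflowing. A sketch, where tanh_sigmoid is a hypothetical name chosen for illustration:

```python
import numpy as np

def tanh_sigmoid(b, m, v):
    # 2 / (1 + exp(-x)) - 1 equals tanh(x / 2), which saturates cleanly.
    return np.tanh((b + m * v) / 2.0)

small = tanh_sigmoid(1, 2, 3)      # agrees with sigmoid(1, 2, 3) above
large = tanh_sigmoid(1000, 1, 1)   # saturates instead of returning nan
print(small, large)
```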

python

numpy

logistic-regression

sigmoid