Preventing scipy eigenvectors differing from computer to computer

Question

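Following up on my question about how to find the Markov steady states, I'm now running into the problem that it works perfectly on my lab computer but doesn't work on any other computer. Specifically, it always finds the correct number of near-one eigenvalues, and therefore which nodes are attractor nodes, but it doesn't consistently find all of them and they aren't grouped properly. For example, using the 64x64 transition matrix below, the computers on which it doesn't work always produce one of three different wrong collections of attractors, seemingly at random. On the smaller matrix M1 below, all computers tested get the same, correct result for the attractor groups and the stationary distribution.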

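All machines tested are running Win7x64 and WinPython-64bit-2.7.9.4. One computer always gets it right; three others always get it wrong in the same ways. Based on a few posts I found, it sounds like this might be caused by differences in the floating-point accuracy of the calculations. Unfortunately, I don't know how to fix this. That is, I don't know how to alter the code that pulls the left eigenvalues from the matrix so that it forces a specific level of accuracy that all computers can handle (and I don't think it has to be very accurate for this purpose).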

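That is just my current best guess for how the results could differ. If you have a better idea of why this is happening and how to stop it from happening, that would be great too.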

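If there is a way to make scipy consistent from run to run and computer to computer, I don't think it would depend on the details of my method, but because it was requested, here it is. Both matrices have 3 attractors. In M1, the first one, [1,2], is an orbit of two states; the other two, [7] and [8], are equilibria. M2 is a 64x64 transition matrix with equilibria at [2] and [26] as well as an orbit using [7,8].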

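But instead of finding that set of attractors, it sometimes reports [[26],[2],[26]], sometimes [[2,7,8,26],[2],[26]], and sometimes something else again. It does not get the same answer on each run, and it never gets [[2],[7,8],[26]] (in any order).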

import numpy as np
import scipy.linalg

M1 = np.array([[0.2, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [0.6, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.2, 0.0, 0.1, 0.1, 0.3, 0.3],
              [0.0, 0.0, 0.2, 0.2, 0.2, 0.2, 0.1, 0.1],
              [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5],
              [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]])

M2 = np.genfromtxt('transitionMatrix1.csv', delimiter=',')

# For easy switching
M = M2
# Confirm the matrix is a valid Markov transition matrix
#print np.sum(M,axis=1)

The rest of the code is the same as in the previous question; it is included here for convenience.

#create a list of the left eigenvalues and a separate array of the left eigenvectors
theEigenvalues, leftEigenvectors = scipy.linalg.eig(M, right=False, left=True)  
# for stationary distribution the eigenvalues and vectors are always real, and this speeds it up a bit
theEigenvalues = theEigenvalues.real                 
#print theEigenvalues 
leftEigenvectors = leftEigenvectors.real
#print leftEigenvectors 
# set how close to one an eigenvalue must be to count as one...1e-15 was too low to find one of the actual eigenvalues
tolerance = 1e-10
# create a filter to collect the eigenvalues that are near enough to one
mask = abs(theEigenvalues - 1) < tolerance
# apply that filter
theEigenvalues = theEigenvalues[mask]
# keep only the eigenvectors whose eigenvalue is (near) one
leftEigenvectors = leftEigenvectors[:, mask]
# convert all the tiny and negative values to zero to isolate the actual stationary distributions
leftEigenvectors[leftEigenvectors < tolerance] = 0   
# normalize each distribution by the sum of the eigenvector columns
attractorDistributions = leftEigenvectors / leftEigenvectors.sum(axis=0, keepdims=True)   
# this checks that the vectors are actually the left eigenvectors
attractorDistributions = np.dot(attractorDistributions.T, M).T      
# convert the column vectors into row vectors (lists) for each attractor     
attractorDistributions = attractorDistributions.T                        
print(attractorDistributions)
# a list of the states in any attractor with the stationary distribution within THAT attractor
#theSteadyStates = np.sum(attractorDistributions, axis=1)                
#print theSteadyStates
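
One property that may be relevant when a column comes back mixed, such as [2,7,8,26]: eigenvalue 1 is repeated once per attractor, and any linear combination of left eigenvectors that share an eigenvalue is itself a left eigenvector for that eigenvalue. Whether this explains the machine-to-machine differences described above is an assumption, not something established in this post. The sketch below only illustrates the property on M1; the hand-computed vectors and the helper name `is_left_eigenvector_for_one` are introduced here for illustration.

```python
import numpy as np

M1 = np.array([[0.2, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
               [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
               [0.6, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
               [0.0, 0.0, 0.2, 0.0, 0.1, 0.1, 0.3, 0.3],
               [0.0, 0.0, 0.2, 0.2, 0.2, 0.2, 0.1, 0.1],
               [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5],
               [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]])

# Stationary distribution of each attractor in M1, worked out by hand
# (0-based indices: the orbit is rows 0 and 1, the equilibria are rows 6 and 7).
orbit = np.array([5 / 9.0, 4 / 9.0, 0, 0, 0, 0, 0, 0])
eq_a = np.array([0, 0, 0, 0, 0, 0, 1.0, 0])
eq_b = np.array([0, 0, 0, 0, 0, 0, 0, 1.0])


def is_left_eigenvector_for_one(v, M):
    """Return True when v M == v, i.e. v is a left eigenvector with eigenvalue 1."""
    return bool(np.allclose(np.dot(v, M), v))


# A mixture of two attractors still satisfies v M == v ...
mixed = 0.3 * orbit + 0.7 * eq_a
# ... and so does a combination that is not a probability distribution at all.
not_a_distribution = orbit - 2.0 * eq_b

for name, v in [("orbit", orbit), ("mixed", mixed),
                ("not_a_distribution", not_a_distribution)]:
    print("%s: %s" % (name, is_left_eigenvector_for_one(v, M1)))
```

All three vectors pass the check, so an eigen-solver is free to return any basis of that three-dimensional space; it is not obliged to return one vector per attractor.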

Answer 1

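The unfortunate answer is that there is no way to fix the seed for scipy, and therefore no way to force it to output consistent values. This also means that it cannot reliably produce the correct answer, because only one answer is correct. My attempts to get a definitive answer or a fix from the scipy people were completely dismissed, but somebody facing this issue may find some wisdom in their words.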

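As a concrete example of the problem, when you run the code above you might sometimes get the set of eigenvectors printed at the end of this answer, which supposedly represent the steady states of each of the attractors in the system. My home computer always produces this result (which is different from my laptop and my lab computer). As mentioned in the question, the correct attractors are [[2],[7,8],[26]]. The equilibria at [2] and [26] are properly identified, but instead of the distribution for [7,8], it returns a non-valid probability distribution over [2,26]. The correct answer is [0.19835, 0.80164] over [7,8] respectively. My lab computer correctly finds that solution, but so far six other computers have failed to do so.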

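What this means is that (unless there is some other unidentified bug in my code) scipy.linalg is worthless for finding the steady states of Markov models. Even though it works some of the time, it cannot be relied upon to provide the correct answer, and therefore should be avoided altogether, at least for Markov model steady states and probably for everything to do with eigenvectors. It just doesn't work.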

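If anybody asks, I will post code showing how to reliably generate the stationary distribution of a Markov model without using scipy. It runs a bit slower, but it is always the same and always correct.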
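
The code mentioned above is not included in this answer, so the following is only a sketch of one scipy-free approach and not necessarily the method meant here. It assumes the attractors are the closed communicating classes of the chain: it finds them by graph reachability, then solves a small linear system for the stationary distribution within each class. The function names `closed_classes` and `stationary_distribution` and the `tol` parameter are hypothetical, and the example uses M1 from the question with 0-based indices.

```python
import numpy as np

M1 = np.array([[0.2, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
               [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
               [0.6, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
               [0.0, 0.0, 0.2, 0.0, 0.1, 0.1, 0.3, 0.3],
               [0.0, 0.0, 0.2, 0.2, 0.2, 0.2, 0.1, 0.1],
               [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5],
               [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]])


def closed_classes(M, tol=1e-12):
    """Return the closed communicating classes of M as lists of state indices."""
    n = M.shape[0]
    # reach[i, j] is True when state j can be reached from state i
    reach = (M > tol) | np.eye(n, dtype=bool)
    for k in range(n):
        # Warshall step: allow paths that pass through state k
        reach |= reach[:, [k]] & reach[[k], :]
    classes, seen = [], set()
    for i in range(n):
        if i in seen:
            continue
        # states that i can reach and that can reach i back
        members = np.flatnonzero(reach[i] & reach[:, i]).tolist()
        seen.update(members)
        # the class is closed when nothing outside it is reachable from i
        if np.flatnonzero(reach[i]).tolist() == members:
            classes.append(members)
    return classes


def stationary_distribution(M, members):
    """Solve pi * P = pi with sum(pi) = 1 on one closed class."""
    sub = M[np.ix_(members, members)]
    A = sub.T - np.eye(len(members))
    # replace one redundant balance equation with the normalisation constraint
    A[-1, :] = 1.0
    b = np.zeros(len(members))
    b[-1] = 1.0
    return np.linalg.solve(A, b)


for members in closed_classes(M1):
    print("%s %s" % (members, stationary_distribution(M1, members)))
```

Because the grouping comes from the zero pattern of the matrix rather than from an eigenvector basis, the classes do not depend on which basis an eigen-solver happens to return. The threshold `tol` for treating an entry as zero is still a choice the caller has to make.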

[[ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.25707958  1.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.06867772  0.          1.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
 [ 0.          0.          0.        ]
...
 [ 0.          0.          0.        ]]

python

numpy

scipy

eigenvalue

eigenvector