In-place sorting of csc_matrix columns

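I would like to be able to sort the columns of a scipy sparse matrix. The scipy documentation is rather terse, and I can't see much about modifying matrices. I found this post on SO, but the answers given there return lists.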

The code I'd like to write is

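from operator import attrgetter
from scipy.sparse import rand

s = rand(4, 4, density=0.25, format='csc')

# Intended logic only; this does not run as written.
_, colSize = s.get_shape()
for j in range(colSize):
    s.setcol(j, sorted(s.getcol(j), key=attrgetter('data'), reverse=True))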

except that there is no setcol, and sorted does not return the same type as getcol.

As an example of what I want to obtain, if I have the input

<class 'scipy.sparse.csc.csc_matrix'>
[[ 0.          0.33201655  0.          0.        ]
 [ 0.          0.          0.          0.        ]
 [ 0.          0.81332962  0.          0.50794041]
 [ 0.          0.41478979  0.          0.        ]]

then the output I would like is

[[ 0.          0.81332962  0.          0.50794041]
 [ 0.          0.41478979  0.          0.        ]
 [ 0.          0.33201655  0.          0.        ]
 [ 0.          0.          0.          0.        ]]

(It doesn't have to be a csc matrix; I just assume that format is better suited to column operations.)

Here is a short function that sorts the columns in descending order, in-place:

import numpy as np


def sort_csc_cols(m):
    """
    Sort the columns of m in descending order.

    m must be a csc_matrix whose nonzero values are all positive.
    m is modified in-place.
    """
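    seq = np.arange(m.shape[0])
    for k in range(m.indptr.size - 1):
        # Stored values of column k live in m.data[start:end].
        start, end = m.indptr[k:k + 2]
        # Sorting the reversed view ascending leaves the slice descending.
        m.data[start:end][::-1].sort()
        # Place the sorted values in rows 0, 1, ..., end - start - 1.
        m.indices[start:end] = seq[:end - start]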
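
The function works directly on the three arrays behind a csc_matrix, so it may help to see them on a tiny matrix. The following is only an illustrative sketch; the 4x3 array `dense` is made up for this example and is not from the question.

```python
import numpy as np
from scipy.sparse import csc_matrix

# Hypothetical 4x3 example matrix (not from the question).
dense = np.array([[0, 3, 0],
                  [5, 0, 0],
                  [0, 9, 0],
                  [2, 7, 0]])
m = csc_matrix(dense)

# The stored values of column k are m.data[m.indptr[k]:m.indptr[k + 1]],
# and the matching slice of m.indices holds their row numbers.
print(m.indptr)   # column boundaries into data/indices
print(m.indices)  # row index of each stored value
print(m.data)     # stored values, column by column

# Column 1 occupies positions indptr[1]..indptr[2] of data and indices.
start, end = m.indptr[1:3]
col1_rows = m.indices[start:end]
col1_vals = m.data[start:end]
```

Because each column is a contiguous slice of `data`, sorting that slice and then rewriting the matching slice of `indices` as 0, 1, 2, ... is enough to push the sorted values to the top of the column. Nothing else in the matrix has to move.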

For example, s is a csc_matrix:

In [47]: s
Out[47]: 
<8x12 sparse matrix of type '<class 'numpy.int64'>'
    with 19 stored elements in Compressed Sparse Column format>

In [48]: s.A
Out[48]: 
array([[ 0,  2,  0,  0,  7,  0,  0, 48,  0,  0,  0,  0],
       [ 0,  0, 82,  0,  0, 38, 67, 17,  9,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0, 47,  0],
       [ 0,  0,  0,  0,  0,  0, 99,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0, 83,  0,  0,  0,  9],
       [ 0,  0,  0,  0,  0,  0, 85, 94,  0, 55, 68,  0],
       [ 0,  0,  0,  0,  0,  0, 22,  0,  0,  0, 71,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0]])

In [49]: sort_csc_cols(s)

In [50]: s.A
Out[50]: 
array([[ 0,  2, 82,  0,  7, 38, 99, 94,  9, 55, 71,  9],
       [ 0,  0,  0,  0,  0,  0, 85, 83,  0,  0, 68,  0],
       [ 0,  0,  0,  0,  0,  0, 67, 48,  0,  0, 47,  0],
       [ 0,  0,  0,  0,  0,  0, 22, 17,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0]])
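
As a sanity check, the same function can be run on the 4x4 example from the question and compared with a dense sort. This is a sketch under the function's stated assumption that all stored values are positive; the names `a`, `result` and `expected` are introduced here only for illustration, and `toarray()` is used in place of the `.A` shorthand, which may not be available in recent SciPy releases.

```python
import numpy as np
from scipy.sparse import csc_matrix


def sort_csc_cols(m):
    """Sort the columns of csc_matrix m in descending order, in-place."""
    seq = np.arange(m.shape[0])
    for k in range(m.indptr.size - 1):
        start, end = m.indptr[k:k + 2]
        m.data[start:end][::-1].sort()
        m.indices[start:end] = seq[:end - start]


# The input matrix shown in the question.
a = np.array([[0., 0.33201655, 0., 0.],
              [0., 0., 0., 0.],
              [0., 0.81332962, 0., 0.50794041],
              [0., 0.41478979, 0., 0.]])

s = csc_matrix(a)
sort_csc_cols(s)
result = s.toarray()

# Dense reference: sort each column ascending, then flip the row order.
expected = np.sort(a, axis=0)[::-1]
print(np.array_equal(result, expected))
```

The dense comparison is only practical for small matrices. If a column contained negative values, the two results would differ, because the function always places stored values in the top rows.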