How to do <= and >= on sparse matrices?

Question

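Is it possible to do <= or >= on SciPy sparse matrices such that the expression returns True if the operation holds for all corresponding elements? For example, A <= B would mean that a <= b for every pair of corresponding elements (a, b) in the matrices (A, B). Here is an example to consider: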

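import numpy as np
from scipy.sparse import csr_matrix

# First random 0/1 matrix: roughly 30% of the entries are 1
np.random.seed(0)
mat = csr_matrix(np.random.rand(10, 12) > 0.7, dtype=int)
print(mat.toarray())
print()

# Second random 0/1 matrix, built the same way with a different seed
np.random.seed(1)
matb = csr_matrix(np.random.rand(10, 12) > 0.7, dtype=int)
print(matb.toarray())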

Running this gives a warning: SparseEfficiencyWarning: Comparing sparse matrices using >= and <= is inefficient, using <, >, or !=, instead and then an error: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().

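I would like to be able to take two sparse matrices A and B and determine whether a <= b for every pair of corresponding elements (a, b) in (A, B). Is this possible? What would the performance of such an operation be?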

Answer 1

In [402]: np.random.seed = 0
     ...: mat = sparse.csr_matrix(np.random.rand(10, 12)>0.7, dtype=int)
In [403]: mat
Out[403]: 
<10x12 sparse matrix of type '<class 'numpy.int64'>'
    with 40 stored elements in Compressed Sparse Row format>
In [404]: mat.A
Out[404]: 
array([[1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0],
       [1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0],
       ...
       [0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1],
       [0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]], dtype=int64)
In [405]: np.random.seed = 1
     ...: matb = sparse.csr_matrix(np.random.rand(10, 12)>0.7, dtype=int)

In [407]: mat<matb
Out[407]: 
<10x12 sparse matrix of type '<class 'numpy.bool_'>'
    with 27 stored elements in Compressed Sparse Row format>
In [408]: mat>=matb
/home/paul/.local/lib/python3.6/site-packages/scipy/sparse/compressed.py:295: SparseEfficiencyWarning: Comparing sparse matrices using >= and <= is inefficient, using <, >, or !=, instead.
  "using <, >, or !=, instead.", SparseEfficiencyWarning)
Out[408]: 
<10x12 sparse matrix of type '<class 'numpy.float64'>'
    with 93 stored elements in Compressed Sparse Row format>

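In your case neither mat nor matb is especially sparse, with 40 and 36 nonzero values out of a possible 120. Even so, mat<matb produces 27 nonzero (True) values, while the >= test results in 93. Wherever both matrices are 0, the >= result is True. (Note that the session above assigns np.random.seed = 0 instead of calling np.random.seed(0), so the generator is never actually seeded and these exact counts are not reproducible.)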

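The warning is telling us that using sparse matrices will not save us space or time (compared with dense arrays) if we do this kind of test. It won't kill us, it just isn't efficient.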
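To make the density point concrete, here is a minimal sketch on two small hand-built matrices (the values are made up for illustration and are not the matrices from the session above). It counts the stored True values for < and for >=, silencing the efficiency warning only so the output stays readable:

```python
import warnings

import numpy as np
from scipy.sparse import csr_matrix, SparseEfficiencyWarning

# Illustrative 3x3 matrices (assumed values, mostly zero)
A = csr_matrix(np.array([[0, 1, 0],
                         [0, 0, 0],
                         [2, 0, 0]]))
B = csr_matrix(np.array([[0, 2, 0],
                         [0, 0, 0],
                         [1, 0, 3]]))

# '<' is False wherever both entries are 0, so the result stays sparse
lt = A < B

# '>=' is True wherever both entries are 0, so the result is mostly True
with warnings.catch_warnings():
    warnings.simplefilter("ignore", SparseEfficiencyWarning)
    ge = A >= B

print(lt.count_nonzero())  # 2
print(ge.count_nonzero())  # 7
```

The two counts add up to the 9 entries of the matrix, just as 27 and 93 add up to 120 above.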

Answer 2

(Gathering some of the comments into this answer):

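To simply do an element-wise <= on two sparse matrices A and B, you can write (A <= B). However, as @hpaulj points out, this is inefficient, because any pair of corresponding 0 elements (for example, when position (1,1) is 0 in both A and B) is turned into 1 by this operation. Assuming A and B are both sparse (mostly 0), you destroy their sparsity by making the result mostly 1.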

To work around this, consider the following:

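from scipy.sparse import csr_matrix

# A has a single nonzero entry
A = csr_matrix((3, 3))
A[1, 1] = 1
print(A.toarray())
print()

# B is at least as large as A at every position
B = csr_matrix((3, 3))
B[0, 0] = 1
B[1, 1] = 2
print(B.toarray())

# True when no element of A exceeds the corresponding element of B
print(not (A > B).count_nonzero())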

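To explain the last line: A > B is the opposite of A <= B, so corresponding 0s stay 0, and every position where a > b becomes 1. Therefore, if the resulting matrix has any nonzero elements, there is some pair (a, b) in (A, B) with a > b, which means it is not the case that A <= B (element-wise).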
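The same idea can be wrapped in a pair of small helpers. This is only a sketch: the names all_le and all_ge are made up here and are not part of SciPy, and both inputs are assumed to be sparse matrices of the same shape.

```python
import numpy as np
from scipy.sparse import csr_matrix


def all_le(A, B):
    """Return True if a <= b for every pair of corresponding elements."""
    # A > B only stores positions where a > b, so the result stays sparse
    return (A > B).count_nonzero() == 0


def all_ge(A, B):
    """Return True if a >= b for every pair of corresponding elements."""
    # Symmetric case: look for any position where a < b
    return (A < B).count_nonzero() == 0


A = csr_matrix(np.array([[0, 0, 0],
                         [0, 1, 0],
                         [0, 0, 0]]))
B = csr_matrix(np.array([[1, 0, 0],
                         [0, 2, 0],
                         [0, 0, 0]]))

print(all_le(A, B))  # True
print(all_ge(A, B))  # False
```

Because only the strict comparisons < and > are used, no SparseEfficiencyWarning is raised and the intermediate result is never denser than the two inputs combined.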

