Python: Edge List to an Adjacency Matrix using SciPy/Pandas shows IndexError: column index (3) out of bounds

Question

I have a text file containing an edge list (edge.txt):

1 1 0.00000000000000000000
1 2 0.25790529076045041
1 3 0.77510411846367422
2 1 0.34610027855153203
2 2 0.00000000000000000000
2 3 0.43889275766016713
3 1 0.75335810231494713
3 2 0.22234924264075450
3 3 0.00000000000000000000

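As you can see, the edge weights are floating-point values and the separator is a space, and I have to keep the text file that way. I want to convert this edge list into a matrix like the one below and store it in a CSV file: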

    1         2         3
1   0.000000  0.257905  0.775104
2   0.346100  0.000000  0.438893
3   0.753358  0.222349  0.000000

I have the following code (txttocsv2.py), which I thought would work, but unfortunately it does not:

import numpy as np
import scipy.sparse as sps
import csv
import pandas as pd

with open('connectivity.txt', 'r') as fil:

    A = np.genfromtxt(fil)

    i, j, weight = A[:,0], A[:,1], A[:,2]

    dim =  max(len(set(i)), len(set(j)))

    B = sps.lil_matrix((dim, dim))
    for i,j,w in zip(i,j,weight):
        B[i,j] = w

    for row in B: #I want to print the output as well to see if it works
        print(row)

    with open("connect.csv", "wb") as f:
        for row in B:
            writer = csv.writer(f)
            writer.writerow(B)

The error is:

Traceback (most recent call last):
  File "txttocsv2.py", line 16, in <module>
    B[i,j] = w
  File "/home/osboxes/pymote_env/local/lib/python2.7/site-packages/scipy/sparse/lil.py", line 379, in __setitem__
    i, j, x)
  File "scipy/sparse/_csparsetools.pyx", line 231, in scipy.sparse._csparsetools.lil_fancy_set (scipy/sparse/_csparsetools.c:5041)
  File "scipy/sparse/_csparsetools.pyx", line 376, in scipy.sparse._csparsetools._lil_fancy_set_int32_float64 (scipy/sparse/_csparsetools.c:7021)
  File "scipy/sparse/_csparsetools.pyx", line 87, in scipy.sparse._csparsetools.lil_insert (scipy/sparse/_csparsetools.c:3216)
IndexError: column index (3) out of bounds

Can anyone point out where the code fails and help me?
Thanks in advance :)
I am using an Ubuntu 14.04 32-bit virtual machine and Python 2.7.

Answer 1

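Your code tries to access position i,j in the matrix B. The problem is that i and j are one-based, while the matrix is zero-based, so the label 3 points one past the last column of a 3x3 matrix. You should switch to B[i-1,j-1] = w. Also, you probably need to change the line writer.writerow(B) to writer.writerow(row).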
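For reference, here is a minimal sketch of the question's script with those fixes applied, written for Python 3. The function name edge_list_to_csv and its two path parameters are made up for illustration. The integer cast is an extra assumption on my part: genfromtxt returns the node labels as floats, which newer NumPy/SciPy versions may reject as indices. The matrix is also sized by the largest label rather than by the number of distinct labels, so gaps in the numbering still fit.

```python
import csv

import numpy as np
import scipy.sparse as sps


def edge_list_to_csv(edge_path, csv_path):
    """Convert a one-based 'i j weight' edge list into a dense CSV matrix.

    Hypothetical helper, not part of the original script.
    """
    data = np.genfromtxt(edge_path)

    # Node labels are read as floats: cast to int and shift to zero-based.
    rows = data[:, 0].astype(int) - 1
    cols = data[:, 1].astype(int) - 1
    weights = data[:, 2]

    # Size the matrix by the largest label (assumption: labels start at 1).
    dim = int(max(rows.max(), cols.max())) + 1

    B = sps.lil_matrix((dim, dim))
    for r, c, w in zip(rows, cols, weights):
        B[r, c] = w

    dense = B.toarray()

    # On Python 3 the csv module expects text mode with newline="".
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerows(dense.tolist())

    return dense
```

Calling edge_list_to_csv('edge.txt', 'connect.csv') would then write one CSV row per matrix row, without row or column labels.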

Or, as John Galt said, use pandas pivot:

import pandas as pd

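# Keyword arguments are used because pandas 2.0+ no longer accepts positional arguments to pivot.
pd.read_csv('edge.txt', delimiter=' ', header=None).pivot(index=0, columns=1, values=2).to_csv('connect.csv', header=False, index=False)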
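Note that header=False and index=False drop the node labels, whereas the matrix in the question shows them. Below is a small sketch of the same pivot that keeps the labels. It reads from an in-memory string instead of edge.txt so it runs on its own, and the column names source, target and weight are my own choice, not something from the question.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for edge.txt (weights shortened).
EDGE_TEXT = """\
1 1 0.0
1 2 0.257905
1 3 0.775104
2 1 0.346100
2 2 0.0
2 3 0.438893
3 1 0.753358
3 2 0.222349
3 3 0.0
"""

edges = pd.read_csv(
    io.StringIO(EDGE_TEXT),
    sep=" ",
    header=None,
    names=["source", "target", "weight"],  # illustrative column names
)

# One row per source node, one column per target node.
matrix = edges.pivot(index="source", columns="target", values="weight")

# Pairs missing from the edge list come out as NaN; treat them as weight 0.
matrix = matrix.fillna(0.0)

# Leaving header/index at their defaults keeps the node labels in the CSV.
csv_text = matrix.to_csv()
print(matrix)
```

Passing a file name to to_csv instead of calling it without arguments would write the labelled matrix to disk.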
