Creating a graph using NetworkX

I am trying to build a graph from this data using the following code:

import networkx as nx
import csv
import matplotlib.pyplot as plt

graph = nx.Graph()
filename = "tubedata.csv"

with open(filename) as tube_data:
    starting_station = [row[0] for row in csv.reader(tube_data, delimiter=',')]

with open(filename) as tube_data:
    ending_station = [row[1] for row in csv.reader(tube_data, delimiter=',')]
    
with open(filename) as tube_data:
    average_time_taken = [row[3] for row in csv.reader(tube_data, delimiter=',')]
    
with open(filename) as tube_data:
    for line in tube_data:
        graph.add_edge(starting_station, ending_station, weight=average_time_taken)

However, I keep getting the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_101/53822893.py in <module>
     17 with open(filename) as tube_data:
     18     for line in tube_data:
---> 19         graph.add_edge(starting_station, ending_station, weight=average_time_taken)

/opt/conda/lib/python3.9/site-packages/networkx/classes/graph.py in add_edge(self, u_of_edge, v_of_edge, **attr)
    870         u, v = u_of_edge, v_of_edge
    871         # add nodes
--> 872         if u not in self._node:
    873             self._adj[u] = self.adjlist_inner_dict_factory()
    874             self._node[u] = self.node_attr_dict_factory()

TypeError: unhashable type: 'list'

I searched for the error and learned that I need to pass an immutable (hashable) data structure. I changed the code to the following:

with open(filename) as tube_data:
   starting_station = (row[0] for row in csv.reader(tube_data, delimiter=','))

with open(filename) as tube_data:
   ending_station = (row[1] for row in csv.reader(tube_data, delimiter=','))
   
with open(filename) as tube_data:
   average_time_taken = (row[3] for row in csv.reader(tube_data, delimiter=','))
   
with open(filename) as tube_data:
   for line in tube_data:
       graph.add_edge(starting_station, ending_station, weight=average_time_taken)

This resolved the error above, but the resulting graph has only two nodes and one edge. How can I capture the full data as a graph?

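I would create the graph with the following steps:

  1. Read the data into a DataFrame object using the pandas library
  2. Create an edge list [(source, target, weight)] from the DataFrame rows
  3. Create an empty directed graph in NetworkX
  4. Add edges to the directed graph object by passing in the edge list
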
import networkx as nx
import pandas as pd

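# header=None treats the first line as data, so columns are labelled 0, 1, 2, ...
data = pd.read_csv('tubedata.csv', header=None)

# Build one (source, target, weight) tuple per row from columns 0, 1 and 3
edgelist = data.apply(lambda x: (x[0], x[1], x[3]), axis=1).to_list()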

# edgelist
# [('Harrow & Wealdstone', 'Kenton', 3),
#  ('Kenton', 'South Kenton', 2),
#  ('South Kenton', 'North Wembley', 2),
#  ('North Wembley', 'Wembley Central', 2),...

G = nx.DiGraph()
G.add_weighted_edges_from(edgelist)

list(G.edges(data=True))[:5]
# >>>[('Harrow & Wealdstone', 'Kenton', {'weight': 3}),
#     ('Kenton', 'South Kenton', {'weight': 2}),
#     ('South Kenton', 'North Wembley', {'weight': 2}),
#     ('North Wembley', 'Wembley Central', {'weight': 2}),
#     ('Wembley Central', 'Stonebridge Park', {'weight': 3})]
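
Whether a directed graph is the right choice depends on the data; the question itself used nx.Graph. The sketch below assumes each row describes a connection that can be travelled in both directions (an assumption about the dataset, not something the CSV states) and compares the two graph types on the five edges shown above. It also shows a quick sanity check on node and edge counts.

```python
import networkx as nx

# First five edges, copied from the output shown above
edgelist = [
    ("Harrow & Wealdstone", "Kenton", 3),
    ("Kenton", "South Kenton", 2),
    ("South Kenton", "North Wembley", 2),
    ("North Wembley", "Wembley Central", 2),
    ("Wembley Central", "Stonebridge Park", 3),
]

directed = nx.DiGraph()
directed.add_weighted_edges_from(edgelist)

# Assumption: connections work in both directions, so an undirected graph fits
undirected = nx.Graph()
undirected.add_weighted_edges_from(edgelist)

# Sanity check: one node per distinct station, one edge per row
print(directed.number_of_nodes(), directed.number_of_edges())  # 6 5

# Only the undirected graph contains the reverse connection
print(directed.has_edge("Kenton", "Harrow & Wealdstone"))    # False
print(undirected.has_edge("Kenton", "Harrow & Wealdstone"))  # True
```

If the node count is far below the number of distinct stations in the file, the edges were probably added with whole columns rather than individual values.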

After renaming the pandas DataFrame columns, you can also get the same result directly with from_pandas_edgelist (see the documentation):

data = data.rename(columns={0:'source',1:'target',3:'average_time_taken'})

G2 = nx.from_pandas_edgelist(
    data,
    source='source',
    target='target',
    edge_attr='average_time_taken',
    create_using=nx.DiGraph,
)

list(G2.edges(data=True))[:5]
# [('Harrow & Wealdstone', 'Kenton', {'average_time_taken': 3}),
# ('Kenton', 'South Kenton', {'average_time_taken': 2}),
# ('South Kenton', 'North Wembley', {'average_time_taken': 2}),
# ('North Wembley', 'Wembley Central', {'average_time_taken': 2}),
# ('Wembley Central', 'Stonebridge Park', {'average_time_taken': 3})]
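
As for why the original attempts failed: in the first version, starting_station and ending_station are whole columns (lists), and a list cannot be used as a node because it is not hashable. In the second version they are generator objects, which are hashable, so every loop iteration adds the same two generator objects as nodes. That gives exactly two nodes and one edge. If you prefer to stay with the csv module, a minimal sketch is to read each row once and add one edge per row. The in-memory sample below stands in for tubedata.csv, and the third column value is a placeholder, since only columns 0, 1 and 3 are used here.

```python
import csv
import io

import networkx as nx

# Stand-in for open("tubedata.csv"); "LineName" is a placeholder value
sample = io.StringIO(
    '"Harrow & Wealdstone","Kenton","LineName",3\n'
    '"Kenton","South Kenton","LineName",2\n'
    '"South Kenton","North Wembley","LineName",2\n'
)

graph = nx.Graph()
for row in csv.reader(sample, delimiter=","):
    # Each row yields a single start, end and time, not a whole column
    start, end, minutes = row[0], row[1], float(row[3])
    graph.add_edge(start, end, weight=minutes)

print(graph.number_of_nodes(), graph.number_of_edges())  # 4 3
```

With the real file, replace the StringIO object with `open(filename, newline="")` inside a with block. The csv module returns strings, so the time is converted with float before it is stored as the weight.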