Creating a graph using networkX
I am trying to build a graph from this data using the following code:
import networkx as nx
import csv
import matplotlib.pyplot as plt
graph = nx.Graph()
filename = "tubedata.csv"
with open(filename) as tube_data:
    starting_station = [row[0] for row in csv.reader(tube_data, delimiter=',')]
with open(filename) as tube_data:
    ending_station = [row[1] for row in csv.reader(tube_data, delimiter=',')]
with open(filename) as tube_data:
    average_time_taken = [row[3] for row in csv.reader(tube_data, delimiter=',')]
with open(filename) as tube_data:
    for line in tube_data:
        graph.add_edge(starting_station, ending_station, weight=average_time_taken)
However, I keep getting the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/tmp/ipykernel_101/53822893.py in <module>
17 with open(filename) as tube_data:
18 for line in tube_data:
---> 19 graph.add_edge(starting_station, ending_station, weight=average_time_taken)
/opt/conda/lib/python3.9/site-packages/networkx/classes/graph.py in add_edge(self, u_of_edge, v_of_edge, **attr)
870 u, v = u_of_edge, v_of_edge
871 # add nodes
--> 872 if u not in self._node:
873 self._adj[u] = self.adjlist_inner_dict_factory()
874 self._node[u] = self.node_attr_dict_factory()
TypeError: unhashable type: 'list'
I searched for the error and learned that I need to pass an immutable data structure, so I changed the code to the following:
with open(filename) as tube_data:
    starting_station = (row[0] for row in csv.reader(tube_data, delimiter=','))
with open(filename) as tube_data:
    ending_station = (row[1] for row in csv.reader(tube_data, delimiter=','))
with open(filename) as tube_data:
    average_time_taken = (row[3] for row in csv.reader(tube_data, delimiter=','))
with open(filename) as tube_data:
    for line in tube_data:
        graph.add_edge(starting_station, ending_station, weight=average_time_taken)
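This gets rid of the error above, but the resulting graph has only two nodes and one edge. How can I capture the complete data as a graph?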
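A note on why the second version runs but produces so little: a generator expression is an object that happens to be hashable (by identity), so add_edge accepts it as a node. Each pass of the loop then adds the same two generator objects as the endpoints of the same edge, rather than the station names they would yield. The sketch below illustrates the hashing behaviour in plain Python; the variable names and sample values are illustrative only, and a dictionary stands in for the graph's node storage.

```python
# Lists are unhashable, which is what triggered the original TypeError.
stations = ["Kenton", "South Kenton"]
try:
    hash(stations)
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

# A generator object is hashable (by identity), so it can be used as a
# dictionary key, and therefore as a graph node, without raising an error.
station_gen = (name for name in stations)
try:
    hash(station_gen)
    generator_is_hashable = True
except TypeError:
    generator_is_hashable = False

# Re-using the same object as a key never creates more than one entry,
# which mirrors adding the same edge between the same two "nodes" in a loop.
nodes = {}
for _ in range(5):
    nodes[station_gen] = {}

print(list_is_hashable)       # False
print(generator_is_hashable)  # True
print(len(nodes))             # 1
```

In other words, the fix is not to find a hashable container but to pass one station name per call.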
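I would create the graph using the following steps:
- Read the data into a DataFrame object using the pandas library
- Create an edge list [(source, target, weight)] from the DataFrame rows
- Create an empty directed graph in networkX
- Add edges to the directed graph object by passing in the edge list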
import networkx as nx
import pandas as pd
data = pd.read_csv('tubedata.csv',header=None)
edgelist = data.apply(lambda x: (x[0],x[1],x[3]),axis=1).to_list()
# edgelist
# [('Harrow & Wealdstone', 'Kenton', 3),
# ('Kenton', 'South Kenton', 2),
# ('South Kenton', 'North Wembley', 2),
# ('North Wembley', 'Wembley Central', 2),...
G = nx.DiGraph()
G.add_weighted_edges_from(edgelist)
list(G.edges(data=True))[:5]
# >>>[('Harrow & Wealdstone', 'Kenton', {'weight': 3}),
# ('Kenton', 'South Kenton', {'weight': 2}),
# ('South Kenton', 'North Wembley', {'weight': 2}),
# ('North Wembley', 'Wembley Central', {'weight': 2}),
# ('Wembley Central', 'Stonebridge Park', {'weight': 3})]
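After renaming the pandas DataFrame columns, you can also build an equivalent graph directly with from_pandas_edgelist (see documentation). The only difference is that the edge attribute is then named average_time_taken rather than weight: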
data = data.rename(columns={0:'source',1:'target',3:'average_time_taken'})
G2 = nx.from_pandas_edgelist(
    data,
    source='source',
    target='target',
    edge_attr='average_time_taken',
    create_using=nx.DiGraph,
)
list(G2.edges(data=True))[:5]
# [('Harrow & Wealdstone', 'Kenton', {'average_time_taken': 3}),
# ('Kenton', 'South Kenton', {'average_time_taken': 2}),
# ('South Kenton', 'North Wembley', {'average_time_taken': 2}),
# ('North Wembley', 'Wembley Central', {'average_time_taken': 2}),
# ('Wembley Central', 'Stonebridge Park', {'average_time_taken': 3})]
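If you would rather stay with the csv module, as in the question, the loop can be fixed by reading the file once and adding one edge per row. This is a minimal sketch, assuming the column layout implied by the question (start station in column 0, end station in column 1, average time in column 3) and no header row. read_edges and build_graph are hypothetical helper names, and the sample rows, including the placeholder third column, are made up for illustration.

```python
import csv
import io


def read_edges(csv_file):
    """Yield one (start, end, minutes) tuple per CSV row."""
    for row in csv.reader(csv_file, delimiter=","):
        if len(row) < 4:
            continue  # skip blank or short rows
        # Assumed layout: row[0] = start, row[1] = end, row[3] = average time.
        yield row[0].strip(), row[1].strip(), float(row[3])


def build_graph(edges):
    """Build an undirected weighted graph from (u, v, weight) tuples."""
    import networkx as nx  # imported here so read_edges works without networkx

    graph = nx.Graph()
    graph.add_weighted_edges_from(edges)
    return graph


# Illustrative stand-in for open("tubedata.csv", newline="").
sample = io.StringIO(
    "Harrow & Wealdstone,Kenton,line-name,3\n"
    "Kenton,South Kenton,line-name,2\n"
)
edges = list(read_edges(sample))
print(edges)
```

With the real file, something like `with open("tubedata.csv", newline="") as f: graph = build_graph(read_edges(f))` should give one edge per row. Converting the time to float is a choice made here so that weights are numeric; the question's code kept them as strings.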