How to draw a customized network in Python
I have a pandas DataFrame containing the following information:
Year NodeName NodeSize
1990 A 50
1990 B 10
1990 C 100
1995 A 90
1995 B 70
1995 C 60
2000 A 150
2000 B 90
2000 C 100
2005 A 55
2005 B 90
2005 C 130
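I would like to place the nodes in columns, so that each year is a column and each row is a node name, with the node size reflecting the number indicated.
I then have the edges in a second DataFrame, as follows: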
FromYear ToYear FromNode ToNode EdgeWidth
1990 1995 A B 60
1990 1995 A C 20
1990 1995 B A 10
1990 1995 C B 10
1995 2000 A B 60
1995 2000 B A 30
1995 2000 C A 10
1995 2000 C B 10
1995 2000 B C 70
2000 2005 A B 10
2000 2005 A C 60
2000 2005 B A 60
2000 2005 B C 25
2000 2005 C B 44
2000 2005 C A 10
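The second DataFrame holds the edge information. The first row, for example, is an arrow from node A in the 1990 column to node B in the 1995 column, and the width of the edge is linearly related to the number in the EdgeWidth column.
There seem to be many tutorials on networkx, so any pointers would be appreciated.
Here is a rough sketch of how I would like it to look. If possible, each row of nodes should also have a different color. I want it to be some kind of infographic, rather than a typical network, showing the flows between nodes over the years.
Here is the code to generate the two DataFrames: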
import pandas as pd
nodes = pd.DataFrame(
[(1990,'A',50),
(1990,'B',10),
(1990,'C',100),
(1995,'A',90),
(1995,'B',70),
(1995,'C',60),
(2000,'A',150),
(2000,'B',90),
(2000,'C',100),
(2005,'A',55),
(2005,'B',90),
(2005,'C',130)],
columns=['Year','NodeName','NodeSize'])
edges = pd.DataFrame(
[(1990,1995,'A','B',60),
(1990,1995,'A','C',20),
(1990,1995,'B','A',10),
(1990,1995,'C','B',10),
(1995,2000,'A','B',60),
(1995,2000,'B','A',30),
(1995,2000,'C','A',10),
(1995,2000,'C','B',10),
(1995,2000,'B','C',70),
(2000,2005,'A','B',10),
(2000,2005,'A','C',60),
(2000,2005,'B','A',60),
(2000,2005,'B','C',25),
(2000,2005,'C','B',44),
(2000,2005,'C','A',10)],
columns = ['FromYear','ToYear','FromNode','ToNode','EdgeWidth'])
It's quite simple, really. Convert the NodeNames to y-coordinates and the Years to x-coordinates, then draw a bunch of Circle and FancyArrow patches.
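The coordinate conversion can be looked at in isolation first. The snippet below is only a minimal sketch on a made-up array of labels (the variable names are illustrative, not part of the question's data); it shows how `np.unique` with `return_inverse=True` turns labels into consecutive integers, and how flipping them puts the first label on top.

```python
import numpy as np

# illustrative labels only; any sortable values (names, years) work the same way
names = np.array(['A', 'B', 'C', 'A', 'B', 'C'])

# unique_names: sorted distinct labels; codes: index of each entry in unique_names
unique_names, codes = np.unique(names, return_inverse=True)

# flip the codes so that the first label ('A') gets the largest y and ends up on top
y = codes.max() - codes

print(unique_names)  # ['A' 'B' 'C']
print(codes)         # [0 1 2 0 1 2]
print(y)             # [2 1 0 2 1 0]
```

The same call applied to the Year column gives the x-coordinates, without the flip.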
#!/usr/bin/env python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.patches import Circle, FancyArrow
nodes = pd.DataFrame(
[(1990,'A',50),
(1990,'B',10),
(1990,'C',100),
(1995,'A',90),
(1995,'B',70),
(1995,'C',60),
(2000,'A',150),
(2000,'B',90),
(2000,'C',100),
(2005,'A',55),
(2005,'B',90),
(2005,'C',130)],
columns=['Year','NodeName','NodeSize'])
edges = pd.DataFrame(
[(1990,1995,'A','B',60),
(1990,1995,'A','C',20),
(1990,1995,'B','A',10),
(1990,1995,'C','B',10),
(1995,2000,'A','B',60),
(1995,2000,'B','A',30),
(1995,2000,'C','A',10),
(1995,2000,'C','B',10),
(1995,2000,'B','C',70),
(2000,2005,'A','B',10),
(2000,2005,'A','C',60),
(2000,2005,'B','A',60),
(2000,2005,'B','C',25),
(2000,2005,'C','B',44),
(2000,2005,'C','A',10)],
columns = ['FromYear','ToYear','FromNode','ToNode','EdgeWidth'])
# compute node coordinates: year -> x, letter -> y;
# np.unique(z, return_inverse=True) maps the unique and alphanumerically
# ordered elements in z to consecutive integers,
# and returns the result as a second output argument
nodes['x'] = np.unique(nodes['Year'], return_inverse=True)[1]
nodes['y'] = np.unique(nodes['NodeName'], return_inverse=True)[1]
# A should be on top, C on bottom
nodes['y'] = np.max(nodes['y']) - nodes['y']
# Year NodeName NodeSize x y
# 0 1990 A 50 0 2
# 1 1990 B 10 0 1
# 2 1990 C 100 0 0
# 3 1995 A 90 1 2
# 4 1995 B 70 1 1
# 5 1995 C 60 1 0
# 6 2000 A 150 2 2
# 7 2000 B 90 2 1
# 8 2000 C 100 2 0
# 9 2005 A 55 3 2
# 10 2005 B 90 3 1
# 11 2005 C 130 3 0
# compute edge paths
edges = pd.merge(edges, nodes, how='inner', left_on=['FromYear', 'FromNode'], right_on=['Year', 'NodeName'])
edges = pd.merge(edges, nodes, how='inner', left_on=['ToYear', 'ToNode'], right_on=['Year', 'NodeName'], suffixes=['_start', '_stop'])
# FromYear ToYear FromNode ToNode EdgeWidth Year_start NodeName_start NodeSize_start x_start y_start Year_stop NodeName_stop NodeSize_stop x_stop y_stop
# 0 1990 1995 A B 60 1990 A 50 0 2 1995 B 70 1 1
# 1 1990 1995 C B 10 1990 C 100 0 0 1995 B 70 1 1
# 2 1990 1995 A C 20 1990 A 50 0 2 1995 C 60 1 0
# 3 1990 1995 B A 10 1990 B 10 0 1 1995 A 90 1 2
# 4 1995 2000 A B 60 1995 A 90 1 2 2000 B 90 2 1
# 5 1995 2000 C B 10 1995 C 60 1 0 2000 B 90 2 1
# 6 1995 2000 B A 30 1995 B 70 1 1 2000 A 150 2 2
# 7 1995 2000 C A 10 1995 C 60 1 0 2000 A 150 2 2
# 8 1995 2000 B C 70 1995 B 70 1 1 2000 C 100 2 0
# 9 2000 2005 A B 10 2000 A 150 2 2 2005 B 90 3 1
# 10 2000 2005 C B 44 2000 C 100 2 0 2005 B 90 3 1
# 11 2000 2005 A C 60 2000 A 150 2 2 2005 C 130 3 0
# 12 2000 2005 B C 25 2000 B 90 2 1 2005 C 130 3 0
# 13 2000 2005 B A 60 2000 B 90 2 1 2005 A 55 3 2
# 14 2000 2005 C A 10 2000 C 100 2 0 2005 A 55 3 2
fig, ax = plt.subplots()
rescale_by = 1./600 # trial and error
# draw edges first
for _, edge in edges.iterrows():
    x, y = edge[['x_start', 'y_start']]
    dx, dy = edge[['x_stop', 'y_stop']].values - edge[['x_start', 'y_start']].values
    ax.add_patch(FancyArrow(x, y, dx, dy, width=rescale_by*edge['EdgeWidth'], length_includes_head=True, color='orange'))
# draw nodes second such that they are plotted on top of edges
for _, node in nodes.iterrows():
    # pass the center as a plain (x, y) tuple rather than a label-indexed Series
    ax.add_patch(Circle((node['x'], node['y']), rescale_by*node['NodeSize'], facecolor='w', edgecolor='k'))
    ax.text(node['x'], node['y'], node['NodeSize'], ha='center', va='center')
# annotate nodes
for _, node in nodes[['NodeName', 'y']].drop_duplicates().iterrows():
    ax.text(-0.5, node['y'], node['NodeName'], fontsize=15, fontweight='bold', ha='center', va='center')
for _, node in nodes[['Year', 'x']].drop_duplicates().iterrows():
    ax.text(node['x'], -0.5, node['Year'], fontsize=15, fontweight='bold', ha='center', va='center')
# adjust axis limits to include labels
ax.autoscale_view()
_, xmax = ax.get_xlim()
ax.set_xlim(-1, xmax)
# style axis
ax.set_aspect('equal')
ax.axis('off')
plt.show()
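The question also asks for a different color per row of nodes, which the script above does not do (all circles are white). One possible way, sketched here under assumptions, is to map each NodeName to a color and pass it as `facecolor`. The `row_color` dictionary and the choice of matplotlib's `'C0'`, `'C1'`, ... cycle colors are illustrative, and only a subset of the node data is used to keep the example short.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove to open a window
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.patches import Circle

# subset of the question's node data, for brevity
nodes = pd.DataFrame(
    [(1990, 'A', 50), (1990, 'B', 10), (1990, 'C', 100),
     (1995, 'A', 90), (1995, 'B', 70), (1995, 'C', 60)],
    columns=['Year', 'NodeName', 'NodeSize'])

# same convention as the script above: year -> x, name -> y (A on top)
years = sorted(nodes['Year'].unique())
names = sorted(nodes['NodeName'].unique())
nodes['x'] = nodes['Year'].map({year: i for i, year in enumerate(years)})
nodes['y'] = nodes['NodeName'].map(
    {name: len(names) - 1 - i for i, name in enumerate(names)})

# assumed palette: one default-cycle color ('C0'..'C9') per node name
row_color = {name: f'C{i % 10}' for i, name in enumerate(names)}

rescale_by = 1. / 600  # same trial-and-error scale as above
fig, ax = plt.subplots()
for node in nodes.itertuples(index=False):
    # every circle in a row shares the color assigned to its node name
    ax.add_patch(Circle((node.x, node.y), rescale_by * node.NodeSize,
                        facecolor=row_color[node.NodeName], edgecolor='k'))
ax.autoscale_view()
ax.set_aspect('equal')
ax.axis('off')
```

In the full script, the same `row_color` lookup could replace `facecolor='w'` in the node loop. With colored circles the size labels may need a contrasting text color.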