Assigning multiple attributes to nodes
I want to assign an attribute to my nodes. Currently I am creating the network from the following data sample:
Attribute Source Target Weight Label
87.5 Heisenberg Pauli 66.3 1
12.5 Beckham Messi 38.1 0
12.5 Beckham Maradona 12 0
43.5 water melon 33.6 1
Label should give the colour of the nodes (1 = yellow, 0 = blue).
Code for the network:
G = nx.from_pandas_edgelist(df, source='Source', target='Target', edge_attr='Weight')
collist = df.drop('Weight', axis=1).melt('Label').dropna() # I need this for the below lines of code because I want to draw nodes - their size - based on their degree
degrees=[]
for x in collist['value']:
    deg = G.degree[x]
    degrees.append(100 * deg)
pos=nx.spring_layout(G)
nx.draw_networkx_labels(G, pos, font_size=10)
nx.draw_networkx_nodes(G, pos, nodelist=collist['value'], node_size = degrees, node_color=collist['Label'])
nx.draw_networkx_edges(G, pos)
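What this code is supposed to do: the size of each node should reflect its degree (this explains degrees and collist in my code). The thickness of each edge should equal Weight. Attribute should be assigned (and updated) as in a linked question. At the moment my code does not include the assignment from that link, where the values are added and updated as follows: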
G = nx.Graph()
G.add_node(0, weight=8)
G.add_node(1, weight=5)
G.add_node(2, weight=3)
G.add_node(3, weight=2)
nx.add_path(G, [2,5])
nx.add_path(G, [2,3])
labels = {
    n: str(n) + '\nweight=' + str(G.nodes[n]['weight']) if 'weight' in G.nodes[n] else str(n)
    for n in G.nodes
}
newWeights = [
    sum(  # sum for averaging
        [G.nodes[neighbor]['weight'] for neighbor in G.neighbors(node)]  # weight of every neighbour
        + [G.nodes[i]['weight']]  # adds the node itself to the average
    ) / (len(list(G.neighbors(node))) + 1)  # average over number of neighbours + 1
    if len(list(G.neighbors(node))) > 0  # only if the node has neighbours
    else G.nodes[i]['weight']  # weight stays the same if no neighbours
    for i, node in enumerate(G.nodes)  # do the above for every node
]
print(newWeights)
for i, node in enumerate(G.nodes):
    G.nodes[i]['weight'] = newWeights[i]  # writes the new weights after all of them are calculated
Please note that I have more than 100 nodes, so I cannot do this manually.
I tried to include the attribute in my code as follows:
G = nx.from_pandas_edgelist(df_net, source='Source', target='Target', edge_attr=['Weight'])
nx.set_node_attributes(G, pd.Series(nodes.Attribute, index=nodes.node).to_dict(), 'Attribute')
However, I got this error:
----> 1 network(df)
<ipython-input-72-f68985d20046> in network(dataset)
24 degrees=[]
25 for x in collist['value']:
---> 26 deg=G.degree[x]
27 degrees.append(100*deg)
28
~/opt/anaconda3/lib/python3.8/site-packages/networkx/classes/reportviews.py in __getitem__(self, n)
445 def __getitem__(self, n):
446 weight = self._weight
--> 447 nbrs = self._succ[n]
448 if weight is None:
449 return len(nbrs) + (n in nbrs)
KeyError: 87.5
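The expected output is a network whose nodes come from the Source column and whose neighbours come from the Target column. Edges have a thickness based on Weight. Label gives the colour of the source nodes, while the Attribute value should be added as a node label and updated as in the linked question/answer.
Please see below a visual example of the type of network I am trying to build. The attribute values in the figure are the values before the update (newWeights), which explains why some nodes have missing values. Attribute relates only to Source, and the source nodes are coloured according to Label. Edge thickness is given by Weight.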
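The KeyError: 87.5 in the traceback above can be reproduced in a few lines. The sketch below is only an illustration built on the sample data from the question; the variable names melted, not_nodes, node_rows and all_nodes are introduced here and are not part of the original code. It suggests that melt('Label') also stacks the Attribute column into value, so numbers such as 87.5 get looked up as if they were node names.

```python
import networkx as nx
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "Attribute": [87.5, 12.5, 12.5, 43.5],
        "Source": ["Heisenberg", "Beckham", "Beckham", "water"],
        "Target": ["Pauli", "Messi", "Maradona", "melon"],
        "Weight": [66.3, 38.1, 12, 33.6],
        "Label": [1, 0, 0, 1],
    }
)
G = nx.from_pandas_edgelist(df, source="Source", target="Target", edge_attr="Weight")

# melt stacks every non-id column, so the Attribute numbers land in 'value'
# next to the node names.
melted = df.drop("Weight", axis=1).melt("Label").dropna()
not_nodes = [v for v in melted["value"] if v not in G]
print(not_nodes)  # these entries are what G.degree[x] fails on

# Restricting the melt to the two node columns keeps only real node names.
node_rows = df.melt(id_vars="Label", value_vars=["Source", "Target"])
all_nodes = all(v in G for v in node_rows["value"])
print(all_nodes)
```

Passing value_vars explicitly is one possible fix; dropping Attribute together with Weight before melting would have the same effect.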
The edge_attr parameter of from_pandas_edgelist() can accept a list:
G = nx.from_pandas_edgelist(df, source='Source', target='Target', edge_attr=['Weight', 'Attribute'])
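As a small check of what this call stores, the following sketch (using the sample data from the question; edge_data and node_data are names chosen here for illustration) shows that both columns end up on the edges. Note that this makes Attribute an edge attribute, so the nodes themselves still carry no data.

```python
import networkx as nx
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "Attribute": [87.5, 12.5, 12.5, 43.5],
        "Source": ["Heisenberg", "Beckham", "Beckham", "water"],
        "Target": ["Pauli", "Messi", "Maradona", "melon"],
        "Weight": [66.3, 38.1, 12, 33.6],
        "Label": [1, 0, 0, 1],
    }
)

G = nx.from_pandas_edgelist(
    df, source="Source", target="Target", edge_attr=["Weight", "Attribute"]
)

# Both columns are attached to each edge ...
edge_data = dict(G.edges["Heisenberg", "Pauli"])
print(edge_data)

# ... while the node attribute dictionaries stay empty.
node_data = dict(G.nodes["Heisenberg"])
print(node_data)
```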
What is the purpose of your melt? If you want to see the label of each node, you can use
df['Node'] = df['Source'].str.cat(df['Target'], sep=' ').str.split(' ')
df = df.explode('Node')
print(df)
Attribute Source Target Weight Label Node
0 87.5 Heisenberg Pauli 66.3 1 Heisenberg
0 87.5 Heisenberg Pauli 66.3 1 Pauli
1 12.5 Beckham Messi 38.1 1 Beckham
1 12.5 Beckham Messi 38.1 1 Messi
2 23.5 Beckham Maradona 12.0 0 Beckham
2 23.5 Beckham Maradona 12.0 0 Maradona
3 43.5 water melon 33.6 1 water
3 43.5 water melon 33.6 1 melon
But there are duplicate nodes with different labels, so you need to choose which one to keep.
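One possible way to make that choice is sketched below. It assumes the Label values shown in the printed output above and simply keeps the first label seen for each node; that rule, like the names nodes, conflicting and label_first, is an assumption for illustration, and any other rule (for example the maximum) would work the same way.

```python
import pandas as pd

# Source, Target and Label as they appear in the printed output above.
df = pd.DataFrame(
    {
        "Source": ["Heisenberg", "Beckham", "Beckham", "water"],
        "Target": ["Pauli", "Messi", "Maradona", "melon"],
        "Label": [1, 1, 0, 1],
    }
)

# One row per (node, label) occurrence, built from the two node columns only.
nodes = df.melt(id_vars="Label", value_vars=["Source", "Target"], value_name="Node")

# Nodes that occur with more than one distinct label.
label_counts = nodes.groupby("Node")["Label"].nunique()
conflicting = label_counts[label_counts > 1].index.tolist()
print(conflicting)

# Assumed rule: keep the first label encountered for every node.
label_first = (
    nodes.drop_duplicates("Node", keep="first").set_index("Node")["Label"].to_dict()
)
print(label_first)
```

The resulting dictionary can be passed to nx.set_node_attributes(G, label_first, "Label") to store one label per node.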
import networkx as nx
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({"Attribute": [87.5, 12.5, 12.5, 43.5], "Source": ["Heisenberg", "Beckham", "Messi", "water"], "Target" : ["Pauli", "Messi", "Maradona", "melon"], "Weight" : [66.3, 38.1, 12, 33.6], "Label" : [1, 0, 0,1]})
G = nx.from_pandas_edgelist(df, source='Source', target='Target', edge_attr='Weight')
source_attrs = {df.Source[i]: {"Attribute": df.Attribute[i]} for i in range(len(df.Attribute))}
target_attrs = {df.Target[i]: {"Attribute": df.Attribute[i]} for i in range(len(df.Attribute))}
nx.set_node_attributes(G, source_attrs)
nx.set_node_attributes(G, target_attrs)
degrees=[100*G.degree[i] for i in G.nodes()]
weights = [G[u][v]['Weight']/10 for u,v in G.edges()]
colors = []
for node in G.nodes():
    if node in source_attrs.keys():
        colors.append('yellow')
    else:
        colors.append('blue')
pos=nx.spring_layout(G)
pos_attrs = {}
for node, coords in pos.items():
    pos_attrs[node] = (coords[0], coords[1] + 0.08)
labels = nx.get_node_attributes(G, "Attribute")
custom_node_attrs = {}
for node, attr in labels.items():
    custom_node_attrs[node] = str(node) + str(attr)
nx.draw_networkx_labels(G, pos_attrs, labels=custom_node_attrs, font_size=10)
nx.draw_networkx_nodes(G, pos, nodelist=G.nodes(), node_size = degrees, node_color=colors)
nx.draw_networkx_edges(G,pos, width=weights)
plt.show()
Based on your example dataframe and the desired output image, I created the following solution:
import pandas as pd
import networkx as nx
import matplotlib.pylab as pl
df = pd.DataFrame(
    data=[[87.5, "Heisenberg", "Pauli", 66.3, 1, ],
          [12.5, "Beckham", "Messi", 38.1, 0, ],
          [12.5, "Beckham", "Maradona", 12, 0, ],
          [43.5, "water", "melon", 33.6, 1, ]],
    columns=["Attribute", "Source", "Target", "Weight", "Label"]
)
# 1 Creating the graph
G = nx.from_pandas_edgelist(df, source='Source', target='Target', edge_attr='Weight')
# 2 Adding the node attributes for the source nodes
nx.set_node_attributes(G, {node: df.Attribute[i] for i, node in enumerate(df.Source)}, 'Attribute')
nx.set_node_attributes(G, {node: df.Label[i] for i, node in enumerate(df.Source)}, 'Label')
# (optional) checking the created data
print(G.nodes(data=True))
# [('Heisenberg', {'Attribute': 87.5, 'Label': 1}), ('Pauli', {}), ('Beckham', {'Attribute': 12.5, 'Label': 0}), ('Messi', {}), ('Maradona', {}), ('water', {'Attribute': 43.5, 'Label': 1}), ('melon', {})]
print(G.edges(data=True))
# [('Heisenberg', 'Pauli', {'Weight': 66.3}), ('Beckham', 'Messi', {'Weight': 38.1}), ('Beckham', 'Maradona', {'Weight': 12.0}), ('water', 'melon', {'Weight': 33.6})]
# 3 fine tuning the visualisation
degrees = [100 * G.degree[i] for i in G.nodes()]
# not sure what should be the color if no label is available
color_dict = {0: "blue", 1: "yellow", "default": "yellow"}
node_colors = []
labels = {}
for node in G:
    label = node + "\n Attribute="
    if "Attribute" in G.nodes[node]:
        label += str(G.nodes[node]["Attribute"])
    labels[node] = label
    if "Label" in G.nodes[node]:
        node_colors.append(color_dict[G.nodes[node]["Label"]])
    else:
        node_colors.append(color_dict["default"])
# you can use any other layout e.g spring_layout
pos = nx.circular_layout(G)
nx.draw_networkx(G,
                 pos,
                 node_color=node_colors,
                 node_size=degrees,
                 width=[edge_info[2]/10 for edge_info in G.edges(data="Weight")],
                 labels=labels,
                 )
# 4 Adjustments for node labels partially cut
axis = pl.gca()
# zoom out
# maybe smaller factors work as well, but 1.3 works fine for this minimal example
axis.set_xlim([1.3*x for x in axis.get_xlim()])
axis.set_ylim([1.3*y for y in axis.get_ylim()])
# turn off frame
pl.axis("off")
pl.show()
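The result looks as follows: [figure: the resulting network plot]
Explanation
The main steps to create the network are as follows:
- The easiest part is creating the basic network with nx.from_pandas_edgelist, which already adds the edge weights.
- After that, the node attributes are added with nx.set_node_attributes.
- At this point the graph is fully created, and all following code only operates on the graph, e.g. G.nodes(). Hence the fine-tuning of the visualisation only loops over G.
- Finally, I fine-tuned the created matplotlib figure to avoid labels being cut off.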
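The steps above do not cover the update of Attribute that the question asks about. A minimal sketch of one way to do it follows, keyed by node name instead of by position. The helper averaged_attribute is hypothetical, and it rests on an assumption that differs from the snippet quoted in the question: nodes without a value are left out of the average, so a node with no value of its own takes the mean of whichever neighbours have one.

```python
import networkx as nx


def averaged_attribute(G, name="Attribute"):
    """Return {node: mean of its own and its neighbours' values}.

    Assumption for this sketch: nodes that lack the attribute are skipped
    when averaging, and a node gets no entry if neither it nor any of its
    neighbours has a value.
    """
    current = nx.get_node_attributes(G, name)
    updated = {}
    for node in G:
        values = [current[n] for n in (node, *G.neighbors(node)) if n in current]
        if values:
            updated[node] = sum(values) / len(values)
    return updated


# Small graph using node names and Attribute values from the question.
G = nx.Graph()
G.add_edge("Heisenberg", "Pauli")
G.add_edge("Beckham", "Messi")
G.add_edge("Beckham", "Maradona")
nx.set_node_attributes(G, {"Heisenberg": 87.5, "Beckham": 12.5}, "Attribute")

# Compute all new values first, then write them back in one step,
# so that no node sees an already updated neighbour.
new_values = averaged_attribute(G)
nx.set_node_attributes(G, new_values, "Attribute")
print(G.nodes(data="Attribute"))
```

Because the function works on whatever is stored in the graph, it scales to more than 100 nodes without any manual add_node calls.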