Assigning multiple attributes to nodes

I would like to assign an attribute to my nodes. Currently I am creating a network from the following sample of data:

Attribute   Source       Target     Weight  Label
    87.5    Heisenberg   Pauli      66.3    1
    12.5    Beckham      Messi      38.1    0
    12.5    Beckham      Maradona   12      0
    43.5    water        melon      33.6    1

The Label should give the colour of the nodes (1 = yellow, 0 = blue).

Code for the network:

    G = nx.from_pandas_edgelist(df, source='Source', target='Target', edge_attr='Weight')

    collist = df.drop('Weight', axis=1).melt('Label').dropna() # I need this for the below lines of code because I want to draw nodes - their size - based on their degree

    degrees=[]
    for x in collist['value']:
        deg=G.degree[x]  
        degrees.append(100*deg)

    
    pos=nx.spring_layout(G)

    nx.draw_networkx_labels(G, pos, font_size=10)
    nx.draw_networkx_nodes(G, pos, nodelist=collist['value'], node_size = degrees, node_color=collist['Label'])
    nx.draw_networkx_edges(G, pos)

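What this code is supposed to do is the following: the nodes should have a size based on their degree (this explains degrees and collist in my code). Edges should have a thickness equal to Weight. Attribute should be assigned (and updated) as in a linked question. Currently, my code does not include the assignment from that linked question, where the attribute is added and updated as follows: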

G = nx.Graph()
G.add_node(0, weight=8)
G.add_node(1, weight=5)
G.add_node(2, weight=3)
G.add_node(3, weight=2)

nx.add_path(G, [2,5])
nx.add_path(G, [2,3])


labels = {
    n: str(n) + '\nweight=' + str(G.nodes[n]['weight']) if 'weight' in G.nodes[n] else str(n)
    for n in G.nodes
}

newWeights = \
    [
        sum( # sum for averaging
            [G.nodes[neighbor]['weight'] for neighbor in G.neighbors(node)] # weight of every neighbor
            + [G.nodes[i]['weight']] # adds the node itself to the average
        ) / (len(list(G.neighbors(node)))+1) # average over number of neighbours+1
        if len(list(G.neighbors(node))) > 0 # if there are no neighbours
        else G.nodes[i]['weight'] # weight stays the same if no neighbours
    for i,node in enumerate(G.nodes) # do the above for every node
    ]
print(newWeights) 
for i, node in enumerate(G.nodes):
    G.nodes[i]['weight'] = newWeights[i] # writes new weights after it calculated them all.

Please note that I have more than 100 nodes, so I cannot do this manually. I tried to include the Attribute in my code as follows:

G = nx.from_pandas_edgelist(df_net, source='Source', target='Target', edge_attr=['Weight'])
nx.set_node_attributes(G, pd.Series(nodes.Attribute, index=nodes.node).to_dict(), 'Attribute')

However, I got this error:

----> 1 network(df)

<ipython-input-72-f68985d20046> in network(dataset)
     24     degrees=[]
     25     for x in collist['value']:
---> 26         deg=G.degree[x]
     27         degrees.append(100*deg)
     28 

~/opt/anaconda3/lib/python3.8/site-packages/networkx/classes/reportviews.py in __getitem__(self, n)
    445     def __getitem__(self, n):
    446         weight = self._weight
--> 447         nbrs = self._succ[n]
    448         if weight is None:
    449             return len(nbrs) + (n in nbrs)

KeyError: 87.5

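The expected output is a network where the nodes are taken from the Source column and their neighbours from the Target column. Edges have a thickness based on Weight. Label gives the colour of the source, while the Attribute value should be added as a label and updated as in the linked question/answer.

Please see below a visual example of the type of network I am trying to build. The attribute values in the figure are the ones before the update (newWeights), which explains why some nodes have missing values. Attribute relates to Source only, and Source nodes are coloured according to Label. The thickness of an edge is given by Weight.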

The edge_attr parameter of from_pandas_edgelist() can accept a list:

G = nx.from_pandas_edgelist(df, source='Source', target='Target', edge_attr=['Weight', 'Attribute']) 
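
As a quick check of what this produces, here is a minimal sketch using a two-row frame that mirrors the question's columns (the variable name `edge_data` is only for illustration). Note that this stores Attribute on the edge rather than on either node:

```python
import networkx as nx
import pandas as pd

# Two rows mirroring the question's columns
df = pd.DataFrame({
    "Attribute": [87.5, 12.5],
    "Source": ["Heisenberg", "Beckham"],
    "Target": ["Pauli", "Messi"],
    "Weight": [66.3, 38.1],
})

# Passing a list copies every listed column into the edge data dictionary
G = nx.from_pandas_edgelist(df, source="Source", target="Target",
                            edge_attr=["Weight", "Attribute"])

edge_data = G.edges["Heisenberg", "Pauli"]
print(edge_data)
```

If the value is meant to live on the nodes, it still has to be attached separately, for example with nx.set_node_attributes.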

What is the purpose of melt? If you want to see the label of each node, you can use

df['Node'] = df['Source'].str.cat(df['Target'], sep=' ').str.split(' ')
df = df.explode('Node')
print(df)

   Attribute      Source    Target  Weight  Label        Node
0       87.5  Heisenberg     Pauli    66.3      1  Heisenberg
0       87.5  Heisenberg     Pauli    66.3      1       Pauli
1       12.5     Beckham     Messi    38.1      1     Beckham
1       12.5     Beckham     Messi    38.1      1       Messi
2       23.5     Beckham  Maradona    12.0      0     Beckham
2       23.5     Beckham  Maradona    12.0      0    Maradona
3       43.5       water     melon    33.6      1       water
3       43.5       water     melon    33.6      1       melon

However, some nodes are duplicated with different labels, so you need to choose which one to keep.
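
One possible way to make that choice is sketched below, under the assumption that the first label seen for a node should win (the names `node_list` and `node_labels` are mine, not from the question). Melting only Source and Target also keeps the Attribute values out of the node list, which is what leads to the `KeyError: 87.5` in the question:

```python
import pandas as pd

df = pd.DataFrame({
    "Attribute": [87.5, 12.5, 12.5, 43.5],
    "Source": ["Heisenberg", "Beckham", "Beckham", "water"],
    "Target": ["Pauli", "Messi", "Maradona", "melon"],
    "Weight": [66.3, 38.1, 12.0, 33.6],
    "Label": [1, 0, 0, 1],
})

# Melt only the two node columns, so numeric Attribute values
# never end up in the list of node names
node_list = df.melt(id_vars="Label", value_vars=["Source", "Target"],
                    value_name="Node")

# Assumption: keep the first label encountered for each node
node_labels = (
    node_list.drop_duplicates("Node", keep="first")
             .set_index("Node")["Label"]
             .to_dict()
)
print(node_labels)
```

The resulting dictionary can be passed to nx.set_node_attributes(G, node_labels, "Label"), or used to build the colour list directly.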

import networkx as nx
import pandas as pd
import matplotlib.pyplot as plt

# Sample data from the question
df = pd.DataFrame({
    "Attribute": [87.5, 12.5, 12.5, 43.5],
    "Source": ["Heisenberg", "Beckham", "Beckham", "water"],
    "Target": ["Pauli", "Messi", "Maradona", "melon"],
    "Weight": [66.3, 38.1, 12, 33.6],
    "Label": [1, 0, 0, 1],
})

G = nx.from_pandas_edgelist(df, source='Source', target='Target', edge_attr='Weight')

source_attrs = {df.Source[i]: {"Attribute": df.Attribute[i]} for i in range(len(df.Attribute))}

target_attrs = {df.Target[i]: {"Attribute": df.Attribute[i]} for i in range(len(df.Attribute))}



nx.set_node_attributes(G, source_attrs)
nx.set_node_attributes(G, target_attrs)

degrees=[100*G.degree[i] for i in G.nodes()]
weights = [G[u][v]['Weight']/10 for u,v in G.edges()]
colors = []
for node in G.nodes():
    if node in source_attrs.keys():
        colors.append('yellow')
    else:
        colors.append('blue')

pos=nx.spring_layout(G)

pos_attrs = {}
for node, coords in pos.items():
    pos_attrs[node] = (coords[0], coords[1] + 0.08)

labels = nx.get_node_attributes(G, "Attribute")

custom_node_attrs = {}
for node, attr in labels.items():
    custom_node_attrs[node] = str(node) + str(attr)

nx.draw_networkx_labels(G, pos_attrs, labels=custom_node_attrs, font_size=10)

nx.draw_networkx_nodes(G, pos, nodelist=G.nodes(), node_size = degrees, node_color=colors)

nx.draw_networkx_edges(G,pos, width=weights)

plt.show()


Based on your example data frame and the desired output image, I created the following solution:


import pandas as pd
import networkx as nx
import matplotlib.pylab as pl

df = pd.DataFrame(
    data=[[87.5, "Heisenberg", "Pauli", 66.3, 1, ],
          [12.5, "Beckham", "Messi", 38.1, 0, ],
          [12.5, "Beckham", "Maradona", 12, 0, ],
          [43.5, "water", "melon", 33.6, 1, ]],
    columns=["Attribute", "Source", "Target", "Weight", "Label"]
)

# 1 Creating the graph
G = nx.from_pandas_edgelist(df, source='Source', target='Target', edge_attr='Weight')

# 2 Adding the node attributes for the source nodes
nx.set_node_attributes(G, {node: df.Attribute[i] for i, node in enumerate(df.Source)}, 'Attribute')
nx.set_node_attributes(G, {node: df.Label[i] for i, node in enumerate(df.Source)}, 'Label')

# (optional) checking the created data 
print(G.nodes(data=True))
# [('Heisenberg', {'Attribute': 87.5, 'Label': 1}), ('Pauli', {}), ('Beckham', {'Attribute': 12.5, 'Label': 0}), ('Messi', {}), ('Maradona', {}), ('water', {'Attribute': 43.5, 'Label': 1}), ('melon', {})]
print(G.edges(data=True))
# [('Heisenberg', 'Pauli', {'Weight': 66.3}), ('Beckham', 'Messi', {'Weight': 38.1}), ('Beckham', 'Maradona', {'Weight': 12.0}), ('water', 'melon', {'Weight': 33.6})]

# 3 fine tuning the visualisation 
degrees = [100 * G.degree[i] for i in G.nodes()]
# not sure what should be the color if no label is available
color_dict = {0: "blue", 1: "yellow", "default": "yellow"}
node_colors = []
labels = {}
for node in G:
    label = node + "\n Attribute="
    if "Attribute" in G.nodes[node]:
        label += str(G.nodes[node]["Attribute"])
    labels[node] = label

    if "Label" in G.nodes[node]:
        node_colors.append(color_dict[G.nodes[node]["Label"]])
    else:
        node_colors.append(color_dict["default"])

# you can use any other layout e.g spring_layout
pos = nx.circular_layout(G)
nx.draw_networkx(G,
                 pos,
                 node_color=node_colors,
                 node_size=degrees,
                 width=[edge_info[2]/10 for edge_info in G.edges(data="Weight")],
                 labels=labels,
                 )

# 4 Adjustments for node labels partially cut
axis = pl.gca()
# zoom out
# maybe smaller factors work as well, but 1.3 works fine for this minimal example
axis.set_xlim([1.3*x for x in axis.get_xlim()])
axis.set_ylim([1.3*y for y in axis.get_ylim()])
# turn off frame
pl.axis("off")
pl.show()

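The result looks as follows

Explanation

The main steps to create the network are:

  1. The simplest part is creating the basic network with nx.from_pandas_edgelist, which already adds the edge weights.
  2. After that, the node attributes are added with nx.set_node_attributes.
  3. At this point the graph is complete, and all subsequent code works only on the graph, e.g. G.nodes(). So the fine-tuning of the visualisation simply loops over G.
  4. Finally, I adjusted the resulting matplotlib figure to keep the labels from being cut off.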
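
For the update step mentioned in the question (replacing each node's value with the average over itself and its neighbours), here is a minimal sketch. It assumes that neighbours without an Attribute are simply ignored and that nodes without one are left untouched. The helper name `averaged_attributes` and the toy graph are hypothetical:

```python
import networkx as nx


def averaged_attributes(G, key="Attribute"):
    """Return {node: mean of the node's own value and its neighbours' values}.

    Assumption: nodes lacking `key` are skipped, both as targets of the
    update and as contributors to a neighbour's average.
    """
    new_values = {}
    for node in G.nodes:
        if key not in G.nodes[node]:
            continue
        values = [G.nodes[node][key]]
        values += [G.nodes[nbr][key] for nbr in G.neighbors(node)
                   if key in G.nodes[nbr]]
        new_values[node] = sum(values) / len(values)
    return new_values


# Toy graph: a - b - c, where only a and b carry an Attribute
G = nx.Graph()
G.add_edge("a", "b")
G.add_edge("b", "c")
nx.set_node_attributes(G, {"a": 8.0, "b": 4.0}, "Attribute")

# Compute all new values first, then write them back in one go,
# so that earlier updates do not influence later averages
updated = averaged_attributes(G)
nx.set_node_attributes(G, updated, "Attribute")
print(G.nodes(data=True))
```

Looking nodes up by name rather than by position means this works with string node names such as the ones in the question's data.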