NetworkX largest connected component sharing attributes
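I know that NetworkX has functions for computing the sizes of a graph's connected components, and that you can add attributes to nodes.
In Axelrod's model of the dissemination of culture, an interesting measure is the size of the largest connected component whose nodes share several attributes. Is there a way to compute this in NetworkX?
For example, suppose we have a population represented as a network. Each node has hair color and skin color attributes. How can I get the size of the largest component of nodes such that every node in that subgraph has the same hair and skin color?
Thanks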
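For general data analysis, it is best to use pandas. Use a graph library such as networkx or graph-tool to determine the connected components, then load that information into a DataFrame that you can analyze. In this case, the pandas groupby and nunique (number of unique elements) features will be useful.
Here is a self-contained example using graph-tool (with the network linked in the code comment). You can also compute the connected components with networkx.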
import numpy as np
import pandas as pd
import graph_tool.all as gt
# Download an example graph
# https://networks.skewed.de/net/baseball
g = gt.collection.ns["baseball", 'user-provider']
# Extract the player names
names = g.vertex_properties['name'].get_2d_array([0])[0]
# Extract connected component ID for each node
cc, cc_sizes = gt.label_components(g)
# Load into a DataFrame
players = pd.DataFrame({
    'id': np.arange(g.num_vertices()),
    'name': names,
    'cc': cc.a
})
# Create some random attributes
players['hair'] = np.random.choice(['purple', 'pink'], size=len(players))
players['skin'] = np.random.choice(['green', 'blue'], size=len(players))
# For the sake of this example, manipulate the data so
# that some groups are homogenous with respect to some attributes.
players.loc[players['cc'] == 2, 'hair'] = 'purple'
players.loc[players['cc'] == 2, 'skin'] = 'blue'
players.loc[players['cc'] == 4, 'hair'] = 'pink'
players.loc[players['cc'] == 4, 'skin'] = 'green'
# Now determine how many unique hair and skin colors we have in each group.
group_stats = players.groupby('cc').agg({
    'hair': 'nunique',
    'skin': ['nunique', 'size']
})
# Simplify the column names
group_stats.columns = ['hair_colors', 'skin_colors', 'player_count']
# Select homogenous groups, i.e. groups for which only 1 unique
# hair color is present and 1 unique skin color is present
homogenous = group_stats.query('hair_colors == 1 and skin_colors == 1')
# Sort from large groups to small groups
homogenous = homogenous.sort_values('player_count', ascending=False)
print(homogenous)
This prints the following:
    hair_colors  skin_colors  player_count
cc
4             1            1             4
2             1            1             3
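Since the question asks about NetworkX specifically, here is a minimal sketch of the same idea using only networkx. It is not part of the original answer; the helper names `homogeneous_components` and `largest_homogeneous_component` and the toy graph are hypothetical, and an undirected graph is assumed. The first helper mirrors the approach above (whole components in which every node agrees on the attributes). The second covers the other reading of the question, where a connected group of like nodes may sit inside a larger mixed component: it keeps only edges whose endpoints agree on every attribute and then takes the largest component of what remains.

```python
import networkx as nx


def homogeneous_components(G, attrs):
    """Connected components of G in which all nodes agree on every attribute."""
    result = []
    for comp in nx.connected_components(G):
        # Collect the distinct attribute combinations present in this component.
        combos = {tuple(G.nodes[n][a] for a in attrs) for n in comp}
        if len(combos) == 1:
            result.append(comp)
    # Largest groups first.
    return sorted(result, key=len, reverse=True)


def largest_homogeneous_component(G, attrs):
    """Largest connected set of nodes that agree on every attribute in attrs."""
    # Keep every node, but only edges whose endpoints share all attributes.
    H = nx.Graph()
    H.add_nodes_from(G.nodes(data=True))
    H.add_edges_from(
        (u, v)
        for u, v in G.edges()
        if all(G.nodes[u][a] == G.nodes[v][a] for a in attrs)
    )
    # Components of H are maximal connected groups with identical attributes.
    return max(nx.connected_components(H), key=len)


# Toy population (made-up data): a path 0-1-2-3-4 plus a separate edge 5-6.
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (2, 3), (3, 4), (5, 6)])
hair = {0: "purple", 1: "purple", 2: "purple", 3: "pink", 4: "pink", 5: "pink", 6: "pink"}
skin = {0: "blue", 1: "blue", 2: "blue", 3: "green", 4: "blue", 5: "green", 6: "green"}
nx.set_node_attributes(G, hair, "hair")
nx.set_node_attributes(G, skin, "skin")

attrs = ["hair", "skin"]
print(homogeneous_components(G, attrs))
largest = largest_homogeneous_component(G, attrs)
print(sorted(largest), len(largest))
```

On this toy graph the only fully homogeneous component is {5, 6}, while the largest connected group of identical nodes is {0, 1, 2}, which sits inside the mixed component {0, 1, 2, 3, 4}. Which of the two measures is appropriate depends on how the "largest component" in the model is defined.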