Why did I get the same graph?
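I am using networkx in IPython to analyze my graph (network). When I generate the maximum spanning tree and the minimum spanning tree, I get a very strange result: the two graphs are the same!
Here is my code: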
a=nx.maximum_spanning_tree(pearson_net)
b=nx.minimum_spanning_tree(pearson_net)
pearson_net is my original network (graph). I want to get the edges of these two graphs, but the edges are exactly the same!
a.edges()
These are the edges of graph a:
EdgeView([('600000.SH', '600015.SH'), ('600000.SH', '600016.SH'),
('600000.SH', '600030.SH'), ('600000.SH', '600036.SH'),
('600000.SH','600109.SH'), ('600000.SH', '600816.SH'),
('600000.SH','600837.SH'), ('600000.SH', '600999.SH'),
('600000.SH', '601009.SH'), ('600000.SH', '601099.SH'),
('600000.SH', '601166.SH'), ('600000.SH', '601288.SH'),
('600000.SH', '601318.SH'), ('600000.SH', '601328.SH'),
('600000.SH', '601336.SH'), ('600000.SH', '601377.SH'),
('600000.SH', '601398.SH'), ('600000.SH', '601555.SH'),
('600000.SH', '601601.SH'), ('600000.SH', '601628.SH'),
('600000.SH', '601688.SH'), ('600000.SH', '601788.SH'),
('600000.SH', '601818.SH'), ('600000.SH', '601939.SH'),
('600000.SH', '601988.SH'), ('600000.SH', '601998.SH'),
('600000.SH', '000001.SZ'), ('600000.SH', '000686.SZ'),
('600000.SH', '000728.SZ'), ('600000.SH', '000750.SZ'),
('600000.SH', '000776.SZ'), ('600000.SH', '000783.SZ'),
('600000.SH', '002142.SZ'), ('600000.SH', '002500.SZ'),
('600000.SH', '002673.SZ')])
Then:
b.edges()
These are the edges of graph b:
EdgeView([('600000.SH', '600015.SH'), ('600000.SH', '600016.SH'),
('600000.SH', '600030.SH'), ('600000.SH', '600036.SH'),
('600000.SH','600109.SH'), ('600000.SH', '600816.SH'),
('600000.SH','600837.SH'), ('600000.SH', '600999.SH'),
('600000.SH', '601009.SH'), ('600000.SH', '601099.SH'),
('600000.SH', '601166.SH'), ('600000.SH', '601288.SH'),
('600000.SH', '601318.SH'), ('600000.SH', '601328.SH'),
('600000.SH', '601336.SH'), ('600000.SH', '601377.SH'),
('600000.SH', '601398.SH'), ('600000.SH', '601555.SH'),
('600000.SH', '601601.SH'), ('600000.SH', '601628.SH'),
('600000.SH', '601688.SH'), ('600000.SH', '601788.SH'),
('600000.SH', '601818.SH'), ('600000.SH', '601939.SH'),
('600000.SH', '601988.SH'), ('600000.SH', '601998.SH'),
('600000.SH', '000001.SZ'), ('600000.SH', '000686.SZ'),
('600000.SH', '000728.SZ'), ('600000.SH', '000750.SZ'),
('600000.SH', '000776.SZ'), ('600000.SH', '000783.SZ'),
('600000.SH', '002142.SZ'), ('600000.SH', '002500.SZ'),
('600000.SH', '002673.SZ')])
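I can't understand this result. Why is the maximum_spanning_tree the same as the minimum_spanning_tree?
Here is the plot of pearson_net:
It is a complete graph: every node is connected to every other node.
Here is part of the dataset behind pearson_net:
The columns and the index are the nodes of the graph, and the numbers (Pearson correlation coefficients) are the edge weights.
Here is my full code: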
pearson_net=nx.Graph()
for i in range(pearson):
    for j in range(i+1,pearson):
        pearson_net.add_edge(pearson.index[i],pearson.columns[j],......
                             weights=pearson.iloc[i][j])
tree1=nx.minimum_spanning_tree(pearson_net)
tree2=nx.maximum_spanning_tree(pearson_net)
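"pearson" is the matrix of correlation coefficients, i.e. the dataset shown above.
Testing the minimum and maximum spanning trees
We need a minimal example to check the results of the minimum_spanning_tree() and maximum_spanning_tree() functions: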
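import matplotlib.pyplot as plt
import networkx as nx
import numpy as np

# Symmetric 4x4 matrix of pairwise weights (same scale as correlation values)
a_mat = [
    [1., 0.661435, 0.667419, 0.547633],
    [0.661435, 1., 0.676438, 0.542115],
    [0.667419, 0.676438, 1., 0.500370],
    [0.547633, 0.542115, 0.500370, 1.]
]
# from_numpy_matrix was removed in networkx 3.0; from_numpy_array replaces it
G = nx.from_numpy_array(np.array(a_mat))
pos = nx.spring_layout(G)
# Draw the nodes, the edges and the edge attributes (the weights)
nx.draw_networkx_nodes(G, pos=pos)
nx.draw_networkx_edges(G, pos=pos)
nx.draw_networkx_edge_labels(G, pos=pos)
plt.axis('off')
plt.show()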
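From this example, we can easily find the minimum spanning tree by hand: the three lowest edge weights (0.50037, 0.542115, 0.547633) all touch node 3 and connect every node without forming a cycle.
Indeed: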
mi = nx.minimum_spanning_tree(G)
mi.edges(data=True)
[Out]:
EdgeDataView([(0, 3, {'weight': 0.547633}), (1, 3, {'weight': 0.542115}), (2, 3, {'weight': 0.50037})])
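The "add the lowest edge weights" reasoning above is essentially Kruskal's algorithm. Here is a minimal pure-Python sketch of it on the same six edges, for illustration only. The function name kruskal and its maximum flag are made up for this example and are not part of networkx; the sketch also assumes the graph is connected and the nodes are numbered 0..n-1.

```python
def kruskal(n, edges, maximum=False):
    """Return the spanning tree edges (u, v, w) of a connected graph on nodes 0..n-1.

    `edges` is a list of (weight, u, v) tuples. Hypothetical helper, not a networkx API.
    """
    parent = list(range(n))  # union-find forest: every node starts as its own root

    def find(x):
        # Walk up to the root, shortening the path along the way (path halving)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    # Visit edges from lightest to heaviest (or heaviest to lightest for a maximum tree)
    for w, u, v in sorted(edges, reverse=maximum):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so it cannot close a cycle
            parent[root_u] = root_v
            tree.append((u, v, w))
    return tree


# Off-diagonal entries of a_mat, written as (weight, u, v)
edges = [
    (0.661435, 0, 1), (0.667419, 0, 2), (0.547633, 0, 3),
    (0.676438, 1, 2), (0.542115, 1, 3), (0.500370, 2, 3),
]

min_tree = kruskal(4, edges)
max_tree = kruskal(4, edges, maximum=True)
print(min_tree)
print(max_tree)
```

On this input the two trees differ, which is what we expect whenever the edge weights are not all equal.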
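For the maximum spanning tree, we can likewise predict the result from the graph: take the highest edge weights that do not close a cycle (0.676438, 0.667419, 0.547633). The edge 0.661435 is skipped because it would close the cycle 0-1-2: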
ma = nx.maximum_spanning_tree(G)
ma.edges(data=True)
[Out]:
EdgeDataView([(0, 2, {'weight': 0.667419}), (0, 3, {'weight': 0.547633}), (1, 2, {'weight': 0.676438})])
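As a quick sanity check, the two trees can also be compared numerically instead of by eye. The sketch below rebuilds the same G and compares both the edge sets and the total weights. The variable names min_total, max_total and same_edges are only illustrative.

```python
import networkx as nx
import numpy as np

a_mat = [
    [1., 0.661435, 0.667419, 0.547633],
    [0.661435, 1., 0.676438, 0.542115],
    [0.667419, 0.676438, 1., 0.500370],
    [0.547633, 0.542115, 0.500370, 1.],
]
G = nx.from_numpy_array(np.array(a_mat))

mi = nx.minimum_spanning_tree(G)
ma = nx.maximum_spanning_tree(G)

# Graph.size(weight=...) sums the given edge attribute over all edges
min_total = mi.size(weight="weight")
max_total = ma.size(weight="weight")

# Compare the edges as unordered pairs, so (0, 3) and (3, 0) count as the same edge
same_edges = {frozenset(e) for e in mi.edges()} == {frozenset(e) for e in ma.edges()}

print(min_total, max_total)  # roughly 1.590 and 1.891
print(same_edges)            # False
```

If same_edges comes out True on your own graph while the weights are clearly not all equal, the weights are probably not being read at all.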
From this simple example, we can see that both functions behave as expected.
If you show us your code, we may be able to find the error for you.
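[Edit] Building the graph from a DataFrame
From your update, it looks like your Pearson matrix is a pandas DataFrame. Here is the same process starting from a DataFrame. You can use the dedicated networkx function nx.from_pandas_adjacency().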
import pandas as pd
df = pd.DataFrame(a_mat)
Creating the graph
pearson_net = nx.from_pandas_adjacency(df)
pos = nx.spring_layout(pearson_net)
nx.draw_networkx_nodes(pearson_net,pos=pos)
nx.draw_networkx_edges(pearson_net,pos=pos)
nx.draw_networkx_edge_labels(pearson_net, pos=pos)
plt.axis('off')
plt.show()
[Out]:
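In the question the rows and columns are labelled with ticker names rather than 0..3. As far as I know, nx.from_pandas_adjacency() uses the DataFrame labels as node names, provided the index and the columns contain the same labels. Here is a small sketch of that. The labels "A" to "D" are hypothetical stand-ins for the real tickers, and removing the self-loops that come from the diagonal is optional tidying, not something the spanning tree functions require.

```python
import networkx as nx
import pandas as pd

labels = ["A", "B", "C", "D"]  # hypothetical node labels, standing in for ticker names
a_mat = [
    [1., 0.661435, 0.667419, 0.547633],
    [0.661435, 1., 0.676438, 0.542115],
    [0.667419, 0.676438, 1., 0.500370],
    [0.547633, 0.542115, 0.500370, 1.],
]
df = pd.DataFrame(a_mat, index=labels, columns=labels)

# Each nonzero cell becomes an edge whose "weight" attribute is the cell value
net = nx.from_pandas_adjacency(df)

# The diagonal of a correlation matrix is 1.0, which creates self-loops; drop them
net.remove_edges_from(list(nx.selfloop_edges(net)))

labelled_max_tree = nx.maximum_spanning_tree(net)
print(sorted(net.nodes()))
print(labelled_max_tree.edges(data=True))
```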
Calling the spanning tree methods
tree1=nx.minimum_spanning_tree(pearson_net)
tree2=nx.maximum_spanning_tree(pearson_net)
tree1.edges(data=True)
[Out]:
EdgeDataView([(0, 3, {'weight': 0.547633}), (1, 3, {'weight': 0.542115}), (2, 3, {'weight': 0.50037})])
nx.draw_networkx_nodes(tree1,pos=pos)
nx.draw_networkx_edges(tree1,pos=pos)
nx.draw_networkx_edge_labels(tree1, pos=pos)
plt.axis('off')
plt.show()
[Out]:
tree2.edges(data=True)
[Out]:
EdgeDataView([(0, 2, {'weight': 0.667419}), (0, 3, {'weight': 0.547633}), (1, 2, {'weight': 0.676438})])
nx.draw_networkx_nodes(tree2,pos=pos)
nx.draw_networkx_edges(tree2,pos=pos)
nx.draw_networkx_edge_labels(tree2, pos=pos)
plt.axis('off')
plt.show()
[Out]:
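One likely explanation for the identical trees in the question, offered as a reading of the posted code rather than a confirmed diagnosis: add_edge() is called with weights=..., while minimum_spanning_tree() and maximum_spanning_tree() read the edge attribute named by their weight parameter, which defaults to 'weight'. add_edge() accepts any keyword as an edge attribute, so weights= is stored without complaint, and edges that lack a 'weight' attribute are treated as having weight 1. Every edge then ties, and both functions can return the same tree. The sketch below shows the effect on a hypothetical three-node graph; the nodes A, B, C and their values are made up for illustration.

```python
import networkx as nx

# Hypothetical correlation-like values for a triangle
values = {("A", "B"): 0.9, ("A", "C"): 0.2, ("B", "C"): 0.5}

# Attribute stored as "weights": the spanning tree functions look for "weight",
# do not find it, and fall back to a weight of 1 for every edge
wrong = nx.Graph()
for (u, v), w in values.items():
    wrong.add_edge(u, v, weights=w)

# Attribute stored as "weight": the values are actually used
right = nx.Graph()
for (u, v), w in values.items():
    right.add_edge(u, v, weight=w)


def edge_set(tree):
    """Edges as unordered pairs, so that (u, v) and (v, u) compare equal."""
    return {frozenset(e) for e in tree.edges()}


wrong_same = edge_set(nx.minimum_spanning_tree(wrong)) == edge_set(nx.maximum_spanning_tree(wrong))
right_same = edge_set(nx.minimum_spanning_tree(right)) == edge_set(nx.maximum_spanning_tree(right))

# Alternative fix: keep the "weights" attribute and name it explicitly
fixed_min = nx.minimum_spanning_tree(wrong, weight="weights")
fixed_max = nx.maximum_spanning_tree(wrong, weight="weights")

print(wrong_same, right_same)
print(edge_set(fixed_min) == edge_set(fixed_max))
```

If this is indeed the cause, either store the values with weight= when adding the edges, or pass weight='weights' to both spanning tree functions.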