Find nodes with common connections

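Currently, I have created a bipartite network graph that maps disorders to symptoms, so a disorder can be linked to one or more symptoms. I also have some basic statistics, such as the symptoms associated with at least one disease, etc.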

import networkx as nx

csv_dictionary = {"Da": ["A", "C"], "Db": ["B"], "Dc": ["A", "C", "F"], "Dd": ["D"], "De": ["E", "B"], "Df":["F"], "Dg":["F"], "Dh":["F"]}

G = nx.Graph()

all_symptoms = set()
for disorder, symptoms in csv_dictionary.items():
    for symptom in symptoms:
        G.add_edge(disorder, symptom)
        all_symptoms.add(symptom)

symptoms_with_multiple_diseases = [symptom for symptom in all_symptoms if G.degree(symptom) > 1]

sorted_symptoms = sorted(symptoms_with_multiple_diseases, key=lambda symptom: G.degree(symptom))

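What I need is to find the disorders that have at least two symptoms in common, that is, pairs of disorders sharing two or more symptoms. I did some research and I think I should add weights to my edges based on how the nodes are connected, but I can't wrap my head around it.

So, in the example above, Da and Dc share two symptoms (A and C).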

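You could iterate over the length-2 combinations of the disorder nodes that have a degree of at least 2, find the nx.common_neighbors of each combination, and keep only the pairs that have at least 2 common neighbors.

So you could start by also keeping track of all disorders: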

all_symptoms = set()
all_disorders = set()

for disorder, symptoms in csv_dictionary.items():
    for symptom in symptoms:
        G.add_edge(disorder, symptom)
        all_symptoms.add(symptom)
    all_disorders.add(disorder)

Check which ones have a degree of at least 2:

disorders_with_multiple_symptoms = [disorder for disorder in all_disorders
                                    if G.degree(disorder) > 1]

Then iterate over all 2-combinations of all_disorders:

from itertools import combinations

common_symptoms = dict()
# Sort so that the pairs and their symptom lists come out in a stable order
for nodes in combinations(sorted(all_disorders), r=2):
    cn = sorted(nx.common_neighbors(G, *nodes))
    if len(cn) > 1:
        common_symptoms[nodes] = cn

print(common_symptoms)
# {('Da', 'Dc'): ['A', 'C']}
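
As for the edge-weight idea mentioned in the question, one possible way to get there is a weighted projection of the bipartite graph onto the disorder nodes. The sketch below is not part of the approach above; it assumes the same csv_dictionary and uses NetworkX's bipartite.weighted_projected_graph, where each edge weight is the number of symptoms two disorders share. The names P, min_shared and pairs are only illustrative.

```python
import networkx as nx
from networkx.algorithms import bipartite

csv_dictionary = {"Da": ["A", "C"], "Db": ["B"], "Dc": ["A", "C", "F"],
                  "Dd": ["D"], "De": ["E", "B"], "Df": ["F"],
                  "Dg": ["F"], "Dh": ["F"]}

# Rebuild the bipartite disorder-symptom graph
G = nx.Graph()
for disorder, symptoms in csv_dictionary.items():
    for symptom in symptoms:
        G.add_edge(disorder, symptom)

# Project onto the disorder nodes; the "weight" attribute of each edge
# counts the symptoms shared by the two disorders it connects
P = bipartite.weighted_projected_graph(G, list(csv_dictionary))

# Keep only the pairs sharing at least `min_shared` symptoms (illustrative threshold)
min_shared = 2
pairs = {
    tuple(sorted((u, v))): data["weight"]
    for u, v, data in P.edges(data=True)
    if data["weight"] >= min_shared
}

print(pairs)
# {('Da', 'Dc'): 2}
```

This gives the counts rather than the shared symptoms themselves, so it may be more convenient if you later want to rank or threshold pairs by how strongly they overlap.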