Finding ultimate parent
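I'm struggling to find the ultimate parent with pandas. This task has one peculiarity: a graph doesn't really seem to fit, or I simply don't know how to use one properly.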
Input:
Child | Parent | Class |
---|---|---|
1001 | 8888 | A |
1001 | 1002 | D |
1001 | 1002 | C |
1001 | 1003 | C |
1003 | 6666 | G |
1002 | 9999 | H |
Expected output:
Child | Ultimate_Parent | Class | Connection |
---|---|---|---|
1001 | 8888 | A | Direct |
1001 | 9999 | D | Indirect |
1001 | 9999 | C | Indirect |
1001 | 6666 | C | Indirect |
1003 | 6666 | G | Direct |
1002 | 9999 | H | Direct |
What I tried:

    import pandas as pd
    import networkx as nx

    df = pd.DataFrame({'Child': ['1001', '1001', '1001', '1001', '1003', '1002'],
                       'Parent': ['8888', '1002', '1002', '1003', '6666', '9999'],
                       'Class': ['A', 'D', 'C', 'C', 'G', 'H']})

    def get_hierarchy(df):
        # Directed graph with one edge per Child -> Parent pair
        DiG = nx.from_pandas_edgelist(df, 'Child', 'Parent', create_using=nx.DiGraph())
        # Pair every node with each node that has a path to it
        return pd.DataFrame.from_records(
            [(n1, n2) for n1 in DiG.nodes() for n2 in nx.ancestors(DiG, n1)],
            columns=['Child', 'Ultimate_parent'])

    # df = df.toPandas()  # only applies when df starts out as a Spark DataFrame
    df = get_hierarchy(df)
I don't know how to use the Class attribute here so that 1001 shows up twice, once with class D and once with class C.
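A small sketch of why the Class values get lost in a plain graph, assuming the input table above: a `DiGraph` stores at most one edge per (Parent, Child) pair, so the two 1002/1001 rows collapse into one edge. A `MultiDiGraph` with `edge_attr='Class'` keeps both. The variable names `D` and `M` are only illustrative.

```python
import networkx as nx
import pandas as pd

df = pd.DataFrame({
    'Child':  ['1001', '1001', '1001', '1001', '1003', '1002'],
    'Parent': ['8888', '1002', '1002', '1003', '6666', '9999'],
    'Class':  ['A', 'D', 'C', 'C', 'G', 'H'],
})

# Simple directed graph: parallel edges are merged, so only one
# Class value survives for the repeated 1002 -> 1001 pair.
D = nx.from_pandas_edgelist(df, source='Parent', target='Child',
                            edge_attr='Class', create_using=nx.DiGraph)

# Multigraph: every row becomes its own edge and keeps its own Class.
M = nx.from_pandas_edgelist(df, source='Parent', target='Child',
                            edge_attr='Class', create_using=nx.MultiDiGraph)

print(D.number_of_edges(), M.number_of_edges())
# Classes stored on the parallel 1002 -> 1001 edges of the multigraph
classes_1002_1001 = sorted(d['Class'] for d in M.get_edge_data('1002', '1001').values())
print(classes_1002_1001)
```

An alternative that avoids multigraphs is to use the graph only for looking up parents and to write the result back onto the original rows, which already carry Class.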
Use G.predecessors to detect whether the current Parent is the root of a tree. If it is, the connection is Direct; otherwise the connection is Indirect.

    import networkx as nx
    import numpy as np

    # Edges point from Parent to Child, so roots have no incoming edges
    G = nx.from_pandas_edgelist(df, source='Parent', target='Child',
                                create_using=nx.DiGraph)
    roots = [node for node, degree in G.in_degree() if degree == 0]
    # A root is its own ultimate parent; otherwise take the node one level up
    ultimate_parent = [node if node in roots else list(G.predecessors(node))[0]
                       for node in df['Parent']]
    df['Ultimate_Parent'] = ultimate_parent
    df['Connection'] = np.where(df['Parent'] == df['Ultimate_Parent'],
                                'Direct', 'Indirect')

Output:

    >>> df
      Child Parent Class Ultimate_Parent Connection
    0  1001   8888     A            8888     Direct
    1  1001   1002     D            9999   Indirect
    2  1001   1002     C            9999   Indirect
    3  1001   1003     C            6666   Indirect
    4  1003   6666     G            6666     Direct
    5  1002   9999     H            9999     Direct

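
The snippet above looks only one level above each Parent, which is enough for the sample data. For deeper chains, a possible variation is to keep following predecessors until a root is reached. This is a minimal sketch, assuming every Parent has at most one parent of its own and the hierarchy has no cycles; `find_root` is a hypothetical helper, not a networkx function.

```python
import networkx as nx
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Child':  ['1001', '1001', '1001', '1001', '1003', '1002'],
    'Parent': ['8888', '1002', '1002', '1003', '6666', '9999'],
    'Class':  ['A', 'D', 'C', 'C', 'G', 'H'],
})

G = nx.from_pandas_edgelist(df, source='Parent', target='Child',
                            create_using=nx.DiGraph)

def find_root(graph, node):
    """Follow the first predecessor upward until a node without predecessors."""
    seen = {node}
    while True:
        preds = list(graph.predecessors(node))
        if not preds:
            return node          # no incoming edges: this is a root
        node = preds[0]          # assumption: a single parent per Parent node
        if node in seen:
            raise ValueError('cycle detected in hierarchy')
        seen.add(node)

# Resolve each row's Parent to its root and label the connection
df['Ultimate_Parent'] = [find_root(G, parent) for parent in df['Parent']]
df['Connection'] = np.where(df['Parent'] == df['Ultimate_Parent'],
                            'Direct', 'Indirect')

# A deeper chain r -> a -> b -> c resolves all the way up to 'r'
chain = nx.DiGraph([('r', 'a'), ('a', 'b'), ('b', 'c')])
chain_root = find_root(chain, 'b')
```

On the sample data this gives the same Ultimate_Parent and Connection columns as the one-level version; the two only differ once a Parent sits more than one step below its root.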