Pandas: finding transitive relations from tuples A and B (two columns)

Question

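Hello, what I want is to show a hierarchy of likes. A person from column 1 can like a person from column 2. Ideally there would be 4 columns A, B, C, D that show, for each person, who they like, who that person likes in turn, and so on. In other words, going from tuples (a, b) to chains built from (a, b), (b, c), (c, d). I only know that it has to be recursive, but I don't know how to check different columns in Pandas and walk through them recursively. Many people can like the same person, but not everyone has to like someone. When a chain does occur, it can only occur among 3 or more people.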

So, I have a dataframe like this:

import pandas as pd

d = {'col1': ['Ben', 'Mike', 'Carla', 'Maggy', 'Josh', 'Kai', 'Maria', 'Sophie'], 'col2': ['Carla', 'Carla', 'Josh', 'Ben', 'Lena', 'Maggy', 'Mike', 'Chad']}
df = pd.DataFrame(data=d)
df

I want output like this:

d = {'A': ['Ben', 'Mike', 'Carla', 'Maggy', 'Josh', 'Kai', 'Maria', 'Sophie'], 'B': ['Carla', 'Carla', 'Josh', 'Ben', 'Lena', 'Maggy', 'Mike', 'Chad'], 'C': ['Josh', 'Josh', 'Lena', 'NA', 'NA', 'Ben', 'Carla', 'NA'], 'D': ['Lena', 'Lena', 'NA', 'NA', 'NA', 'NA', 'Josh', 'NA']}
df = pd.DataFrame(data=d)
df

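I think the rules are as follows:

A person in column B can be liked by someone from column A without liking anyone themselves (Chad, for example, does not like anyone).
A person can be liked by just one person, and the chain ends there (A -> B -> NA -> NA).
Someone can like a person who in turn likes someone else (A -> B -> C -> NA).
Someone can like a person who likes another person, who in turn also likes someone (A -> B -> C -> D -> NA).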

How can I do this? Thanks.

Answer 1

What you need is a couple of left join (merge) operations.

Here is the code, split into steps for clarity:

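# Step 1: join df with itself to find who each liked person (col2) likes in turn.
step1 = pd.merge(df, df, left_on="col2", right_on="col1", how="left")
# Keep the original pair plus the newly found third person (col1_y duplicates col2_x).
step1 = step1[["col1_x", "col2_x", "col2_y"]]
step1.columns = ["first", "second", "third"]

# Step 2: join once more to find who the third person likes.
step2 = pd.merge(step1, df, left_on="third", right_on="col1", how="left")
# Drop the redundant join key and name the new column.
res = step2.drop("col1", axis=1).rename(columns={"col2": "fourth"})
print(res)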

The result is:

    first second  third fourth
0     Ben  Carla   Josh   Lena
1    Mike  Carla   Josh   Lena
2   Carla   Josh   Lena    NaN
3   Maggy    Ben  Carla   Josh
4    Josh   Lena    NaN    NaN
5     Kai  Maggy    Ben  Carla
6   Maria   Mike  Carla   Josh
7  Sophie   Chad    NaN    NaN
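
If more than four levels are ever needed, the same idea can be written as a loop instead of one merge per level. The following is only a sketch, not part of the approach above: it assumes every name appears at most once in col1 (each person likes at most one other person), and the function name like_chain, the n_cols parameter, and the column labels A, B, C, ... are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["Ben", "Mike", "Carla", "Maggy", "Josh", "Kai", "Maria", "Sophie"],
    "col2": ["Carla", "Carla", "Josh", "Ben", "Lena", "Maggy", "Mike", "Chad"],
})

def like_chain(pairs, n_cols=4):
    """Follow col1 -> col2 repeatedly until n_cols columns exist (hypothetical helper)."""
    # Assumption: each name occurs at most once in col1, so the pairs form a lookup table.
    likes = dict(zip(pairs["col1"], pairs["col2"]))
    # Hypothetical column labels: A, B, C, ...
    names = [chr(ord("A") + i) for i in range(n_cols)]
    out = pd.DataFrame({names[0]: pairs["col1"], names[1]: pairs["col2"]})
    # Each new column looks up who the person in the previous column likes;
    # names without an entry (and missing values) become NaN.
    for prev, cur in zip(names[1:], names[2:]):
        out[cur] = out[prev].map(likes)
    return out

res = like_chain(df)
print(res)
```

Because each step is a dictionary lookup rather than a join, rows are never duplicated. If a person could like several people, the merge-based version above would be the safer choice, since a dictionary keeps only one entry per name.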
