Is there a way to build a co-occurrence (frequency) matrix for participant-organizer in Python?
Suppose we have a DataFrame like the following:
import pandas as pd

df = pd.DataFrame({'participant_id': [1608, 1608, 2089, 213, 1608, 1887, 2089, 4544, 6866, 2020, 2020],
                   'organizer_id': [1772, 1772, 1772, 1790, 1790, 1790, 1791, 1791, 1772, 1799, 1799]})
Printing it gives:
print(df)
    participant_id  organizer_id
0             1608          1772
1             1608          1772
2             2089          1772
3              213          1790
4             1608          1790
5             1887          1790
6             2089          1791
7             4544          1791
8             6866          1772
9             2020          1799
10            2020          1799
It would be valuable to know how many times each participant took part in each organizer's tasks, as a co-occurrence matrix like this:
        1772  1790  1791  1799
1608      2.    1.    0.     0
2089      1.    0.    1.     0
213       0.    1.    0.     0
1887      0.    1.    0.     0
4544      0.    0.    1.     0
6866      1.    0.    0.     0
2020      0.    0.    0.     2
How can I build such a matrix from the DataFrame df in Python?
df.groupby(by=["participant_id", "organizer_id"]).size().unstack('organizer_id').fillna(0)
organizer_id    1772  1790  1791  1799
participant_id
213              0.0   1.0   0.0   0.0
1608             2.0   1.0   0.0   0.0
1887             0.0   1.0   0.0   0.0
2020             0.0   0.0   0.0   2.0
2089             1.0   0.0   1.0   0.0
4544             0.0   0.0   1.0   0.0
6866             1.0   0.0   0.0   0.0
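The counts above come back as floats because unstack leaves NaN for participant-organizer pairs that never occur, and fillna(0) only fills them afterwards. Here is a minimal sketch of an integer variant, assuming the same df as in the question. It passes fill_value=0 to unstack so that no NaN is produced in the first place (the variable name matrix is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "participant_id": [1608, 1608, 2089, 213, 1608, 1887, 2089, 4544, 6866, 2020, 2020],
    "organizer_id": [1772, 1772, 1772, 1790, 1790, 1790, 1791, 1791, 1772, 1799, 1799],
})

# Count rows per (participant, organizer) pair, then pivot organizers into columns.
# fill_value=0 fills the missing pairs directly, so the counts keep an integer dtype.
matrix = (
    df.groupby(["participant_id", "organizer_id"])
      .size()
      .unstack("organizer_id", fill_value=0)
)
print(matrix)
```

The rows are still sorted by participant_id, as in the float output above; only the dtype changes.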
This is a duplicate of How to create co-occurrence matrix from pandas two column?
Use pd.crosstab(df['participant_id'], df['organizer_id']) to get the output matrix.
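As a rough sketch of the crosstab suggestion, assuming the same df as in the question: pd.crosstab returns integer counts with both axes sorted, so the optional reindex step below restores the order of first appearance that the question's desired matrix uses (the names counts and ordered are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({
    "participant_id": [1608, 1608, 2089, 213, 1608, 1887, 2089, 4544, 6866, 2020, 2020],
    "organizer_id": [1772, 1772, 1772, 1790, 1790, 1790, 1791, 1791, 1772, 1799, 1799],
})

# Rows are participants, columns are organizers, cells are occurrence counts.
counts = pd.crosstab(df["participant_id"], df["organizer_id"])

# Optional: reorder both axes by first appearance in df instead of sorted order.
ordered = counts.reindex(
    index=df["participant_id"].unique(),
    columns=df["organizer_id"].unique(),
)
print(ordered)
```

If the sorted order is acceptable, counts can be used directly and the reindex step skipped.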