In Pandas, how to group by column name and condition met, while joining the cells that meet the condition into a single cell

I'm not even sure how to phrase this question, but this is what I'm trying to accomplish:

I have a pandas DataFrame with thousands of rows that looks like this:

df = pd.read_excel("data.xlsx")
id  text                        value1  value2
1   These are the               True    False
2   Values of "value1"          True    False
3   While these others          False   True
4   are the Values of "value2"  False   True

How can I group all the cells that meet the condition by column name, while joining those cells into a single cell, to get a table like the following?

values  merge_text
value1  These are the Values of "value1"
value2  While these others are the Values of "value2"

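My thinking is that, to solve this, I first need to split the table into several tables, each containing the rows that meet the condition for a single column, and then merge all of those tables together.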

v1 = df[['id', 'text', 'value1']]
v1 = v1[v1["value1"]==True]
id  text                value1
1   These are the       True
2   Values of "value1"  True

v2 = df[['id', 'text', 'value2']]
v2 = v2[v2["value2"]==True]
id  text                        value2
3   While these others          True
4   are the Values of "value2"  True

What I don't know, and haven't been able to find online, is how to merge the cells like this:

values  merge_text
value1  These are the Values of "value1"

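You can set_index with "id" and "text", then stack the df. This gives a boolean Series whose innermost index level holds the original column names ("value1", "value2"). Then (i) filter the Series by itself, which keeps only the True entries; (ii) groupby that innermost level and join the "text" values: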

s = df.set_index(['id','text']).stack()
out = s[s].reset_index(level=1).groupby(level=1)['text'].apply(' '.join).reset_index()

Output:

    index                                           text
0  value1               These are the Values of "value1"
1  value2  While these others are the Values of "value2"
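
The same reshape, filter, and join idea can also be written with melt instead of set_index/stack, which lets the result carry the "values" and "merge_text" column names from the question directly. This is only a minimal sketch: the inline DataFrame is a hypothetical stand-in for data.xlsx, and it assumes the flag columns hold real booleans rather than the strings "True"/"False".

```python
import pandas as pd

# Hypothetical stand-in for the contents of data.xlsx
# (assumption: value1/value2 are boolean columns).
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "text": [
        "These are the",
        'Values of "value1"',
        "While these others",
        'are the Values of "value2"',
    ],
    "value1": [True, True, False, False],
    "value2": [False, False, True, True],
})

# Reshape to long format: one row per (id, text, flag column).
# "flag" is an illustrative name for the column holding the boolean.
long_df = df.melt(id_vars=["id", "text"], var_name="values", value_name="flag")

out = (
    long_df[long_df["flag"]]          # keep only rows where the flag is True
    .groupby("values")["text"]        # one group per original column name
    .agg(" ".join)                    # join the texts in their original row order
    .reset_index(name="merge_text")   # back to a two-column DataFrame
)

print(out)
```

If the flags come out of read_excel as strings, they would need converting to booleans first (for example by comparing against "True") before the filter step behaves as intended.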