In Pandas, how to group by column name and condition met, while joining the cells that met the condition in a single cell

Question

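I'm not even sure how to phrase this question, but this is what I'm trying to accomplish:

I have a pandas DataFrame with thousands of rows that looks like this: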

import pandas as pd
df = pd.read_excel("data.xlsx")

id	text	value1	value2
1	These are the	True	False
2	Values of "value1"	True	False
3	While these others	False	True
4	are the Values of "value2"	False	True

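How can I group all the cells that meet the condition by column name, while joining the text of those cells into a single cell, to get a table like the one below?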

values	merge_text
value1	These are the Values of "value1"
value2	While these others are the Values of "value2"

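My idea was to first split the table into several tables, each containing the rows that satisfy the condition for a single column, and then merge all of those tables together.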

v1 = df[['id', 'text', 'value1']]
v1 = v1[v1["value1"]==True]

id	text	value1
1	These are the	True
2	Values of "value1"	True

v2 = df[['id', 'text', 'value2']]
v2 = v2[v2["value2"]==True]

id	text	value2
3	While these others	True
4	are the Values of "value2"	True

What I don't know, and couldn't find an answer to online, is how to merge the cells like this:

values	merge_text
value1	These are the Values of "value1"
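
The per-column approach described above can be completed by joining the filtered "text" values with `str.join`. This is only a minimal sketch: the sample DataFrame is rebuilt inline rather than read from data.xlsx, and the names `flag_cols`, `rows` and `out` are illustrative, not part of the original code.

```python
import pandas as pd

# Sample data mirroring the table in the question (built inline as an
# assumption, instead of reading data.xlsx).
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "text": ["These are the", 'Values of "value1"',
             "While these others", 'are the Values of "value2"'],
    "value1": [True, True, False, False],
    "value2": [False, False, True, True],
})

flag_cols = ["value1", "value2"]  # hypothetical list of the boolean columns

# For each flag column, keep the rows where it is True and join their text.
rows = [
    {"values": col, "merge_text": " ".join(df.loc[df[col], "text"])}
    for col in flag_cols
]
out = pd.DataFrame(rows)
print(out)
```

This loops over the columns in Python, which should be fine for a handful of flag columns; a reshape-based approach avoids the explicit loop.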

Answer 1

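You could use set_index with "id" and "text", then stack the DataFrame. After that, (i) filter the resulting boolean Series by itself so that only the True entries remain, and (ii) group by the index level that holds the original column names and join the "text" values: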

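# Move "id" and "text" into the index, then stack the flag columns into one boolean Series
s = df.set_index(['id','text']).stack()
# Keep only the True entries, then join "text" per original column name (index level 1)
out = s[s].reset_index(level=1).groupby(level=1)['text'].apply(' '.join).reset_index()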

Output:

    index                                           text
0  value1               These are the Values of "value1"
1  value2  While these others are the Values of "value2"
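
If you want the column names from the question ("values" and "merge_text"), a similar result can probably be reached with `melt` instead of `stack`. The following is a self-contained sketch under the assumption that the flag columns are genuinely boolean; the sample data is rebuilt inline, and the names `long`, `flag` and `result` are made up for illustration.

```python
import pandas as pd

# Sample data mirroring the table in the question (built inline as an
# assumption, instead of reading data.xlsx).
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "text": ["These are the", 'Values of "value1"',
             "While these others", 'are the Values of "value2"'],
    "value1": [True, True, False, False],
    "value2": [False, False, True, True],
})

# Reshape to long format: one row per (id, text, flag column) combination.
long = df.melt(id_vars=["id", "text"], var_name="values", value_name="flag")

# Keep rows whose flag is True, then join the text within each flag column.
result = (
    long[long["flag"]]
    .groupby("values", sort=False)["text"]
    .agg(" ".join)
    .reset_index(name="merge_text")
)
print(result)
```

Here `sort=False` keeps the groups in the order the flag columns appear in the DataFrame, and `reset_index(name="merge_text")` names the joined column directly.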

python

dataframe

pandas

pandas-groupby