How to efficiently rearrange pandas data as follows?

Question

I need some help with a concise and, above all, efficient way to express the following operation in pandas:

Given a data frame of the format

id    a   b    c   d
1     0   -1   1   1
42    0    1   0   0
128   1   -1   0   1

construct a data frame of the format:

id     one_entries
1      "c d"
42     "b"
128    "a d"

That is, the column "one_entries" contains the concatenated names of the columns whose entry in the original frame is 1.

Answer 1

Here is one way, using a boolean mask and applying a lambda function to each row.

In [58]: df
Out[58]:
    id  a  b  c  d
0    1  0 -1  1  1
1   42  0  1  0  0
2  128  1 -1  0  1

In [59]: cols = list('abcd')

In [60]: (df[cols] > 0).apply(lambda x: ' '.join(x[x].index), axis=1)
Out[60]:
0    c d
1      b
2    a d
dtype: object

You can assign the result to df['one_entries'].

Details of the apply function

Take the first row.
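For reference, here is a minimal, self-contained sketch of the whole approach, including the assignment. The frame is rebuilt from the sample data in the question; the variable name result is only illustrative. Note that the answer tests > 0, while the question asks for entries equal to 1. For this sample data both give the same result; use == 1 if other positive values can occur.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    "id": [1, 42, 128],
    "a": [0, 0, 1],
    "b": [-1, 1, -1],
    "c": [1, 0, 0],
    "d": [1, 0, 1],
})

cols = list("abcd")

# Row-wise: build a boolean mask, keep the labels where it is True, join them
df["one_entries"] = (df[cols] > 0).apply(lambda x: " ".join(x[x].index), axis=1)

# Keep only the two columns requested in the question
result = df[["id", "one_entries"]]
print(result)
```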

In [83]: x = df[cols].iloc[0] > 0

In [84]: x
Out[84]:
a    False
b    False
c     True
d     True
Name: 0, dtype: bool

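x gives you a boolean value for each column in that row: True where the value is greater than zero. x[x] keeps only the True entries. In essence, it is a Series with the column names as its index.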

In [85]: x[x]
Out[85]:
c    True
d    True
Name: 0, dtype: bool

x[x].index gives you the column names.

In [86]: x[x].index
Out[86]: Index([u'c', u'd'], dtype='object')
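Since the question stresses efficiency, the same per-row idea (build a mask, then pick the matching column names) can also be sketched without apply, by working on the underlying NumPy array. This is an alternative not given in the answer; it is often quicker on large frames, but that is worth measuring on your own data. The names mask, names, and result are illustrative, and the comparison uses == 1 to match the wording of the question.

```python
import numpy as np
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    "id": [1, 42, 128],
    "a": [0, 0, 1],
    "b": [-1, 1, -1],
    "c": [1, 0, 0],
    "d": [1, 0, 1],
})

cols = list("abcd")

# 2-D boolean array: True where the entry equals 1
mask = df[cols].to_numpy() == 1

# Column names as an array so each boolean row can index into it
names = np.array(cols)

# One joined string per row
one_entries = [" ".join(names[row]) for row in mask]

result = pd.DataFrame({"id": df["id"], "one_entries": one_entries})
print(result)
```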

Answer 2

Same reasoning as John Galt's answer, but a bit shorter, constructing a new DataFrame from a dictionary.

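# Presumably test_df is the frame above with 'id' as its index,
# e.g. test_df = df.set_index('id') (an assumption based on the output below)
pd.DataFrame({
    'one_entries': (test_df > 0).apply(lambda x: ' '.join(x[x].index), axis=1)
})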

#       one_entries
#   1           c d
#  42             b
# 128           a d
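The snippet above does not show how test_df was built. A runnable sketch, assuming test_df is the question's frame with id moved into the index (which would explain the 1, 42, 128 labels in the output), could look like this:

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    "id": [1, 42, 128],
    "a": [0, 0, 1],
    "b": [-1, 1, -1],
    "c": [1, 0, 0],
    "d": [1, 0, 1],
})

# Assumption: 'id' is the index, so only columns a..d take part in the comparison
test_df = df.set_index("id")

result = pd.DataFrame({
    "one_entries": (test_df > 0).apply(lambda x: " ".join(x[x].index), axis=1)
})
print(result)
```

With id in the index there is no need to list the value columns explicitly, which is what makes this version shorter.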

