Access the index of a pandas Series

Question

I am trying to determine which word has the highest count in a pandas DataFrame (df_temp in my code). So far I have this:

l = df_temp['word'].value_counts()

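l is a pandas Series whose first row has, as its index label, the most frequent value in df_temp['word'] (in my case, the word with the highest count). I can see that word in my console, but I cannot work out how to retrieve it properly. The only way I have found so far is to convert the Series to a dictionary: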

dl = dict(l)

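Then I can easily retrieve my index... after sorting the dictionary. This gets the job done, but I am fairly sure there is a smarter solution, because this one is dirty and inelegant.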

Answer 1

The index of the value_counts() result holds your values:

l.index

gives you the values that were counted.

Example:

In [163]:
df = pd.DataFrame({'a':['hello','world','python','hello','python','python']})
df

Out[163]:
        a
0   hello
1   world
2  python
3   hello
4  python
5  python

In [165]:    
df['a'].value_counts()

Out[165]:
python    3
hello     2
world     1
Name: a, dtype: int64

In [164]:    
df['a'].value_counts().index

Out[164]:
Index(['python', 'hello', 'world'], dtype='object')

So you can get the count for a specific word by indexing the Series with that word:

In [167]:
l = df['a'].value_counts()
l['hello']

Out[167]:
2
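
Since value_counts() sorts by count in descending order by default, the most frequent word should simply be the first label in the index. A minimal sketch, reusing the sample DataFrame from above (the names counts, top_word and top_count are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'a': ['hello', 'world', 'python', 'hello', 'python', 'python']})

# value_counts() returns a Series sorted by count, highest first (the default)
counts = df['a'].value_counts()

top_word = counts.index[0]   # label of the first row, i.e. the most frequent value
top_count = counts.iloc[0]   # its count, selected by position

print(top_word, top_count)   # python 3
```

This avoids the round trip through a dictionary and the extra sort.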

Answer 2

With pandas you can find the most frequent value in the word column:

df['word'].value_counts().idxmax()

The code below gives you the count for that value, which is the highest count in the column:

df['word'].value_counts().max()
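
The two calls can be combined so that value_counts() is only computed once. This is a rough sketch, not a definitive implementation: most_frequent is a hypothetical helper name, and the contents of df_temp are made up for illustration.

```python
import pandas as pd


def most_frequent(series):
    """Return (value, count) for the most frequent value in a Series.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    counts = series.value_counts()
    # idxmax() returns the index label of the largest count,
    # max() returns the largest count itself
    return counts.idxmax(), counts.max()


# Made-up sample data standing in for the asker's df_temp
df_temp = pd.DataFrame({'word': ['spam', 'eggs', 'spam', 'ham', 'spam', 'eggs']})

word, count = most_frequent(df_temp['word'])
print(word, count)   # spam 3
```

If several words share the highest count, idxmax() returns only the first of them.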
