How can I find the count of the most frequent and least frequent using Pandas?

Question

Question: how can I find the most frequent and least frequent counts?

The output I want is:

cast              count
Alan Marriott       100
Jandino Asporaa      78
...
Peter                 1

Attempt #1:

df.groupby(by=['cast','show_id']).count()

Output:

cast          show_id   type title director country date_added release_year rating duration listed_in description 

4Minute       80161826  1 1 0 1 1 1 1 1 1 1
50 Cent       70199239  1 1 1 1 1 1 1 1 1 1
A.J LoCascio  80141858  1 1 1 1 1 1 1 1 1 1

Attempt #2:

df.groupby(cast)[show_id].count()

Output:

NameError: name 'cast' is not defined

Attempt #3:

df.groupby(by='cast')

Output:

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x7f2f3894bcd0>

Dataset sample:

import numpy as np
import pandas as pd

# Three sample rows from the dataset; missing directors are stored as NaN.
df = pd.DataFrame({
    'show_id': ['81145628', '80117401', '70234439'],
    'type': ['Movie', 'Movie', 'TV Show'],
    'title': ['Norm of the North: King Sized Adventure',
              'Jandino: Whatever it Takes',
              'Transformers Prime'],
    'director': ['Richard Finn, Tim Maltby', np.nan, np.nan],
    'cast': ['Alan Marriott, Andrew Toth, Brian Dobson',
             'Jandino Asporaat',
             'Peter Cullen, Sumalee Montano, Frank Welker'],
    'country': ['United States, India, South Korea, China',
                'United Kingdom', 'United States'],
    'date_added': ['September 9, 2019',
                   'September 9, 2016',
                   'September 8, 2018'],
    'release_year': ['2019', '2016', '2013'],
    'rating': ['TV-PG', 'TV-MA', 'TV-Y7-FV'],
    'duration': ['90 min', '94 min', '1 Season'],
    'listed_in': ['Children & Family Movies, Comedies',
                  'Stand-Up Comedy', 'Kids TV'],
    'description': ['Before planning an awesome wedding for his',
                    'Jandino Asporaat riffs on the challenges of ra',
                    'With the help of three human allies, the Autob']})

Answer 1

This should work:

df.groupby('cast')['show_id'].count().nlargest()

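This returns the count for each group, sorted by count in descending order. Note that nlargest() keeps only the five largest counts by default; pass n to change that, use nsmallest() for the least frequent groups, or sort_values(ascending=False) for the full ranking: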

cast              count
Alan Marriott       100
Jandino Asporaa      78
...
Peter                 1
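
One caveat: in the dataset sample, each cast value is a comma-separated string, so grouping on the column as-is counts whole cast lists rather than individual people. Below is a minimal sketch of a per-person count, assuming names are separated by commas. The small DataFrame is made up for illustration and is not the real dataset.

```python
import pandas as pd

# Hypothetical data for illustration only: four shows with overlapping casts.
df = pd.DataFrame({
    "show_id": ["1", "2", "3", "4"],
    "cast": [
        "Alan Marriott, Peter Cullen",
        "Alan Marriott",
        None,  # a show with no cast listed
        "Alan Marriott, Frank Welker, Peter Cullen",
    ],
})

# Split each cast string into a list, put one name on each row,
# and strip the whitespace left around the names after splitting.
names = df["cast"].dropna().str.split(",").explode().str.strip()

# Number of appearances per person, sorted from most to least frequent.
counts = names.value_counts()

# The n most frequent and the n least frequent people.
most_frequent = counts.nlargest(2)
least_frequent = counts.nsmallest(2)

print(counts)
print(most_frequent)
print(least_frequent)
```

Here value_counts() already returns the full ranking in descending order, so nlargest() and nsmallest() are only needed when you want the top or bottom few entries.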

python

group-by

pandas

kaggle