Pandas: count some values in a column

I have a dataframe; this is part of it:

ID,"url","app_name","used_at","active_seconds","device_connection","device_os","device_type","device_usage"
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:29:11,13,3g,android,smartphone,home     
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:33:00,3,unknown,android,smartphone,home     
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-06-01 09:33:07,1,unknown,android,smartphone,home     
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-06-01 09:34:30,5,unknown,android,smartphone,home     
e990fae0f48b7daf52619b5ccbec61bc,"",Messaging,2015-06-01 09:36:22,133,3g,android,smartphone,home        
e990fae0f48b7daf52619b5ccbec61bc,"",Messaging,2015-05-02 09:38:40,5,3g,android,smartphone,home      
574c4969b017ae6481db9a7c77328bc3,"",Yandex.Navigator,2015-05-01 11:04:48,70,3g,ios,smartphone,home      
574c4969b017ae6481db9a7c77328bc3,"",VK Client,2015-6-01 12:02:27,248,3g,ios,smartphone,home     
574c4969b017ae6481db9a7c77328bc3,"",Viber,2015-07-01 12:06:35,7,3g,ios,smartphone,home      
574c4969b017ae6481db9a7c77328bc3,"",VK Client,2015-08-01 12:23:26,86,3g,ios,smartphone,home     
574c4969b017ae6481db9a7c77328bc3,"",Talking Angela,2015-08-02 12:24:52,0,3g,ios,smartphone,home     
574c4969b017ae6481db9a7c77328bc3,"",My Talking Angela,2015-08-03 12:24:52,167,3g,ios,smartphone,home        
574c4969b017ae6481db9a7c77328bc3,"",Talking Angela,2015-08-04 12:27:39,34,3g,ios,smartphone,home        

I need to count the number of days in each month for each ID.

If I try df.groupby('ID')['used_at'].count() I get the number of visits. How can I group by month and count the days?

I think you need to group by ID, month and day, and aggregate with size:

df1 = df.used_at.groupby([df['ID'], df.used_at.dt.month, df.used_at.dt.day]).size()

print(df1)
ID                                used_at  used_at
574c4969b017ae6481db9a7c77328bc3  5        1          1
                                  6        1          1
                                  7        1          1
                                  8        1          1
                                           2          1
                                           3          1
                                           4          1
e990fae0f48b7daf52619b5ccbec61bc  5        1          2
                                           2          1
                                  6        1          3
dtype: int64
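
Note that the .dt accessor only works when used_at has a datetime dtype, so the column has to be parsed before grouping. The following is a minimal sketch of one way to do that, assuming the data comes from CSV text. The inline sample is a shortened copy of a few rows from the question, and the level names "month" and "day" are added here only for readability; they are not part of the original answer.

```python
import io
import pandas as pd

# A few rows from the question, trimmed to the relevant columns (assumption:
# the real file has more columns, which do not matter for this grouping).
csv_text = """ID,url,app_name,used_at,active_seconds
e990fae0f48b7daf52619b5ccbec61bc,,Phone,2015-05-01 09:29:11,13
e990fae0f48b7daf52619b5ccbec61bc,,Phone,2015-05-01 09:33:00,3
e990fae0f48b7daf52619b5ccbec61bc,,Messaging,2015-05-02 09:38:40,5
574c4969b017ae6481db9a7c77328bc3,,Viber,2015-07-01 12:06:35,7
"""

# parse_dates converts used_at to datetime64 so that .dt is available.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["used_at"])

# Rename the key Series so the index levels are not both called "used_at".
month = df["used_at"].dt.month.rename("month")
day = df["used_at"].dt.day.rename("day")

# Number of rows (visits) per ID, month and day.
df1 = df["used_at"].groupby([df["ID"], month, day]).size()
print(df1)
```

If the dataframe is already loaded with used_at as strings, df['used_at'] = pd.to_datetime(df['used_at']) should have the same effect.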

Or group by date, which is the same as grouping by year, month and day together:

df1 = df.used_at.groupby([df['ID'], df.used_at.dt.date]).size()

print(df1)
ID                                used_at   
574c4969b017ae6481db9a7c77328bc3  2015-05-01    1
                                  2015-06-01    1
                                  2015-07-01    1
                                  2015-08-01    1
                                  2015-08-02    1
                                  2015-08-03    1
                                  2015-08-04    1
e990fae0f48b7daf52619b5ccbec61bc  2015-05-01    2
                                  2015-05-02    1
                                  2015-06-01    3
dtype: int64
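
The results above give the number of visits per ID and day. If the goal is instead the number of distinct days per month for each ID, which is one possible reading of the question, a sketch along these lines may help. The short IDs "a" and "b" and the variable names are placeholders chosen for this example; they are not from the original post.

```python
import pandas as pd

# Small stand-in for the question's data; "a" and "b" are placeholder IDs.
df = pd.DataFrame({
    "ID": ["a", "a", "a", "a", "b", "b"],
    "used_at": pd.to_datetime([
        "2015-05-01 09:29:11", "2015-05-01 09:33:00", "2015-05-02 09:38:40",
        "2015-06-01 09:33:07", "2015-08-01 12:23:26", "2015-08-02 12:24:52",
    ]),
})

# Truncate each timestamp to midnight of its calendar day.
day = df["used_at"].dt.normalize()

# Year-month period, so that the same month in different years stays separate.
month = df["used_at"].dt.to_period("M").rename("month")

# For each (ID, month) pair, count how many distinct days appear.
days_per_month = day.groupby([df["ID"], month]).nunique()
print(days_per_month)
```

Here ID "a" has two visits on 2015-05-01 and one on 2015-05-02, so it gets 2 days for 2015-05, whereas size would report 3 rows for that month.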

The difference between count and size:

size counts every row in each group, including NaN values; count counts only the non-NaN values.
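
A small illustration of that difference, using a made-up column with one missing value (the data here is invented for the example and does not come from the question):

```python
import numpy as np
import pandas as pd

# Group "a" has three rows, one of which is NaN; group "b" has one row.
df = pd.DataFrame({
    "ID": ["a", "a", "a", "b"],
    "active_seconds": [13, np.nan, 5, 70],
})

grouped = df.groupby("ID")["active_seconds"]

sizes = grouped.size()    # rows per group, NaN included
counts = grouped.count()  # non-NaN values per group

print(sizes)
print(counts)
```

For group "a", size gives 3 and count gives 2. When the column has no missing values, as with used_at in the question, the two should agree.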