Pandas:统计某列中的一些值
Pandas: count some values in a column
我有数据框,它是其中的一部分
ID,"url","app_name","used_at","active_seconds","device_connection","device_os","device_type","device_usage"
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:29:11,13,3g,android,smartphone,home
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:33:00,3,unknown,android,smartphone,home
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-06-01 09:33:07,1,unknown,android,smartphone,home
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-06-01 09:34:30,5,unknown,android,smartphone,home
e990fae0f48b7daf52619b5ccbec61bc,"",Messaging,2015-06-01 09:36:22,133,3g,android,smartphone,home
e990fae0f48b7daf52619b5ccbec61bc,"",Messaging,2015-05-02 09:38:40,5,3g,android,smartphone,home
574c4969b017ae6481db9a7c77328bc3,"",Yandex.Navigator,2015-05-01 11:04:48,70,3g,ios,smartphone,home
574c4969b017ae6481db9a7c77328bc3,"",VK Client,2015-6-01 12:02:27,248,3g,ios,smartphone,home
574c4969b017ae6481db9a7c77328bc3,"",Viber,2015-07-01 12:06:35,7,3g,ios,smartphone,home
574c4969b017ae6481db9a7c77328bc3,"",VK Client,2015-08-01 12:23:26,86,3g,ios,smartphone,home
574c4969b017ae6481db9a7c77328bc3,"",Talking Angela,2015-08-02 12:24:52,0,3g,ios,smartphone,home
574c4969b017ae6481db9a7c77328bc3,"",My Talking Angela,2015-08-03 12:24:52,167,3g,ios,smartphone,home
574c4969b017ae6481db9a7c77328bc3,"",Talking Angela,2015-08-04 12:27:39,34,3g,ios,smartphone,home
我需要计算每个月的天数到每个ID
。
如果我尝试 df.groupby('ID')['used_at'].count()
我得到了访问量,我如何在 month
处获取并统计 days
?
我觉得你需要groupby
by ID
, month
and day
and aggregate size
:
df1 = df.used_at.groupby([df['ID'], df.used_at.dt.month,df.used_at.dt.day ]).size()
print (df1)
ID used_at used_at
574c4969b017ae6481db9a7c77328bc3 5 1 1
6 1 1
7 1 1
8 1 1
2 1
3 1
4 1
e990fae0f48b7daf52619b5ccbec61bc 5 1 2
2 1
6 1 3
dtype: int64
或 date
- 它与 year
、month
和 day
相同:
df1 = df.used_at.groupby([df['ID'], df.used_at.dt.date]).size()
print (df1)
ID used_at
574c4969b017ae6481db9a7c77328bc3 2015-05-01 1
2015-06-01 1
2015-07-01 1
2015-08-01 1
2015-08-02 1
2015-08-03 1
2015-08-04 1
e990fae0f48b7daf52619b5ccbec61bc 2015-05-01 2
2015-05-02 1
2015-06-01 3
dtype: int64
count
和 size
的区别:
我有数据框,它是其中的一部分
ID,"url","app_name","used_at","active_seconds","device_connection","device_os","device_type","device_usage"
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:29:11,13,3g,android,smartphone,home
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-05-01 09:33:00,3,unknown,android,smartphone,home
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-06-01 09:33:07,1,unknown,android,smartphone,home
e990fae0f48b7daf52619b5ccbec61bc,"",Phone,2015-06-01 09:34:30,5,unknown,android,smartphone,home
e990fae0f48b7daf52619b5ccbec61bc,"",Messaging,2015-06-01 09:36:22,133,3g,android,smartphone,home
e990fae0f48b7daf52619b5ccbec61bc,"",Messaging,2015-05-02 09:38:40,5,3g,android,smartphone,home
574c4969b017ae6481db9a7c77328bc3,"",Yandex.Navigator,2015-05-01 11:04:48,70,3g,ios,smartphone,home
574c4969b017ae6481db9a7c77328bc3,"",VK Client,2015-6-01 12:02:27,248,3g,ios,smartphone,home
574c4969b017ae6481db9a7c77328bc3,"",Viber,2015-07-01 12:06:35,7,3g,ios,smartphone,home
574c4969b017ae6481db9a7c77328bc3,"",VK Client,2015-08-01 12:23:26,86,3g,ios,smartphone,home
574c4969b017ae6481db9a7c77328bc3,"",Talking Angela,2015-08-02 12:24:52,0,3g,ios,smartphone,home
574c4969b017ae6481db9a7c77328bc3,"",My Talking Angela,2015-08-03 12:24:52,167,3g,ios,smartphone,home
574c4969b017ae6481db9a7c77328bc3,"",Talking Angela,2015-08-04 12:27:39,34,3g,ios,smartphone,home
我需要计算每个月的天数到每个ID
。
如果我尝试 df.groupby('ID')['used_at'].count()
我得到了访问量,我如何在 month
处获取并统计 days
?
我觉得你需要groupby
by ID
, month
and day
and aggregate size
:
df1 = df.used_at.groupby([df['ID'], df.used_at.dt.month,df.used_at.dt.day ]).size()
print (df1)
ID used_at used_at
574c4969b017ae6481db9a7c77328bc3 5 1 1
6 1 1
7 1 1
8 1 1
2 1
3 1
4 1
e990fae0f48b7daf52619b5ccbec61bc 5 1 2
2 1
6 1 3
dtype: int64
或 date
- 它与 year
、month
和 day
相同:
df1 = df.used_at.groupby([df['ID'], df.used_at.dt.date]).size()
print (df1)
ID used_at
574c4969b017ae6481db9a7c77328bc3 2015-05-01 1
2015-06-01 1
2015-07-01 1
2015-08-01 1
2015-08-02 1
2015-08-03 1
2015-08-04 1
e990fae0f48b7daf52619b5ccbec61bc 2015-05-01 2
2015-05-02 1
2015-06-01 3
dtype: int64
count
和 size
的区别: