pandas sort_values always sorts by date in ascending order
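The following pandas query does not sort by date in descending order. Even though I pass ascending=False, the data is still shown in ascending date order.
I even converted the date column from object dtype to datetime, as shown below: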
final_data['date'] = pd.to_datetime(final_data['date'])
all_users_list = final_data.sort_values(by='date', ascending=False).groupby(['user_id','date','content_id'])[['user_id','content_id','date']].apply(list)
Sample output:
user_id date content_id
user_10013 2018-02-03 cont_3189_6_12 [user_id, content_id, date]
2018-04-10 cont_2244_16_1 [user_id, content_id, date]
2018-08-13 cont_300_1_1 [user_id, content_id, date]
2018-09-11 cont_2233_3_8 [user_id, content_id, date]
2018-12-04 cont_2597_6_8 [user_id, content_id, date]
2019-02-02 cont_2573_4_15 [user_id, content_id, date]
2019-04-14 cont_4860_7_1 [user_id, content_id, date]
2019-04-29 cont_2270_9_2 [user_id, content_id, date]
2019-11-26 cont_2700_3_11 [user_id, content_id, date]
2019-12-05 cont_2946_6_43 [user_id, content_id, date]
2019-12-07 cont_73_1_2 [user_id, content_id, date]
2020-03-12 cont_2975_3_36 [user_id, content_id, date]
2020-09-17 cont_420 [user_id, content_id, date]
2020-10-17 cont_3036_5_14 [user_id, content_id, date]
2020-11-01 cont_4037_1_31 [user_id, content_id, date]
2021-02-17 cont_761_1_2 [user_id, content_id, date]
2021-05-19 cont_4444_3_21 [user_id, content_id, date]
2021-05-21 cont_2911_14_14 [user_id, content_id, date]
2021-07-24 cont_2227_7_18 [user_id, content_id, date]
2021-08-19 cont_286_17_21 [user_id, content_id, date]
2021-10-07 cont_4148_4_22 [user_id, content_id, date]
user_10034 2019-01-06 cont_160_1_5 [user_id, content_id, date]
2019-03-30 cont_1877_2_6 [user_id, content_id, date]
2019-04-05 cont_4550_1_5 [user_id, content_id, date]
2019-04-26 cont_3352_17_15 [user_id, content_id, date]
2019-05-10 cont_363_1_3 [user_id, content_id, date]
2019-05-11 cont_56_11_13 [user_id, content_id, date]
2019-08-27 cont_4812_2_13 [user_id, content_id, date]
2019-09-12 cont_13_1_3 [user_id, content_id, date]
2019-09-13 cont_4435_7_7 [user_id, content_id, date]
2019-09-22 cont_4453_12_4 [user_id, content_id, date]
2019-10-28 cont_4375_1_23 [user_id, content_id, date]
2019-12-25 cont_3356_6_14 [user_id, content_id, date]
2020-02-16 cont_2853_3_22 [user_id, content_id, date]
2020-04-15 cont_1452_4_22 [user_id, content_id, date]
2020-04-27 cont_3331_5_4 [user_id, content_id, date]
2020-05-06 cont_4857_13_24 [user_id, content_id, date]
2020-05-28 cont_3885_1_4 [user_id, content_id, date]
2020-06-22 cont_4472_1_33 [user_id, content_id, date]
2020-07-03 cont_4082_9_36 [user_id, content_id, date]
2020-08-15 cont_4358_5_20 [user_id, content_id, date]
2020-09-03 cont_4952_1_6 [user_id, content_id, date]
2021-01-13 cont_935_19_4 [user_id, content_id, date]
2021-03-03 cont_1063_1_14 [user_id, content_id, date]
What could be causing this, and how can I fix it?
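Try passing sort=False to groupby. By default, groupby sorts the group keys in ascending order, which discards the order produced by sort_values: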
all_users_list = final_data.sort_values(by='date', ascending=False).groupby(['user_id','date','content_id'], sort=False)[['user_id','content_id','date']].apply(list)
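To see the effect of sort=False in isolation, here is a minimal sketch. The four-row frame is made-up sample data (only the column names come from the question), and it groups by user_id and date alone to keep the output short.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
final_data = pd.DataFrame({
    "user_id": ["user_1", "user_1", "user_2", "user_2"],
    "date": pd.to_datetime(
        ["2019-01-06", "2021-03-03", "2018-02-03", "2020-09-17"]
    ),
    "content_id": ["cont_a", "cont_b", "cont_c", "cont_d"],
})

newest_first = final_data.sort_values(by="date", ascending=False)

# Default (sort=True): group keys are re-sorted ascending,
# so the descending row order from sort_values is lost.
default_result = newest_first.groupby(["user_id", "date"])["content_id"].apply(list)
default_dates = list(default_result.index.get_level_values("date"))

# sort=False: groups appear in order of first occurrence,
# which here is newest date first.
kept_result = newest_first.groupby(["user_id", "date"], sort=False)["content_id"].apply(list)
kept_dates = list(kept_result.index.get_level_values("date"))

print(default_result)
print(kept_result)
```

Note that with sort=False the groups follow the row order, so rows from different users can be interleaved when the frame was sorted by date only.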
Or sort after the groupby:
all_users_list = final_data.groupby(['user_id','date','content_id'], sort=False)[['user_id','content_id','date']].apply(list).sort_index(level="date", ascending=False)
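If the goal is to keep each user's rows together while showing that user's dates newest first, one possible variation is to sort on both index levels with a separate direction for each. This is a sketch under that assumption, using made-up sample data and a reduced set of grouping keys.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
final_data = pd.DataFrame({
    "user_id": ["user_1", "user_1", "user_2", "user_2"],
    "date": pd.to_datetime(
        ["2019-01-06", "2021-03-03", "2018-02-03", "2020-09-17"]
    ),
    "content_id": ["cont_a", "cont_b", "cont_c", "cont_d"],
})

grouped = final_data.groupby(["user_id", "date"])["content_id"].apply(list)

# Sort users ascending and, within each user, dates descending.
per_user_newest_first = grouped.sort_index(
    level=["user_id", "date"], ascending=[True, False]
)
result_keys = list(per_user_newest_first.index)

print(per_user_newest_first)
```

Sorting on the date level alone orders the whole result by date across all users, so the per-level form is worth considering when the per-user grouping matters.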