Return a pandas Series of datetimes ranked in chronological order, indexed by the original Series' indices

Question

I have compiled a pandas Series of datetimes as follows (part of the Series is shown below as an example):

0   2002-02-03
1   1979-01-01
2   2006-12-25
3   2008-07-16
4   2005-05-30

Note: the dtype of each cell is 'pandas._libs.tslib.Timestamp'.

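For the example above, I would like to rank the dates chronologically and return the ranks aligned with the original Series' index, as shown below (second column):

0    1
1    0
2    3
3    4
4    2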

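I have tried a mix of .order(), .sort() and .index() to achieve this, but to no avail so far. What is the easiest way to get the chronological rank of each datetime in a Series while keeping the original Series' index?

Thanks.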

Answer 1

You can use Series.rank, subtract 1 and convert to int:

a = df['date'].rank(method='dense').sub(1).astype(int)
print (a)
0    1
1    0
2    3
3    4
4    2
Name: date, dtype: int32
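
For anyone who wants to run the snippet above end to end, here is a minimal sketch. It assumes the dates live in a DataFrame column named 'date', as in the answer; the DataFrame construction itself is not part of the original post and is only rebuilt from the sample values in the question.

```python
import pandas as pd

# Sample values copied from the question; the DataFrame setup is an assumption.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2002-02-03", "1979-01-01", "2006-12-25", "2008-07-16", "2005-05-30"]
    )
})

# rank() returns 1-based float ranks, so shift to 0-based and cast to int.
ranks = df["date"].rank(method="dense").sub(1).astype(int)
print(ranks)
```

The result keeps the original index, so ranks[1] is 0 because 1979-01-01 is the earliest date.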

The method parameter of Series.rank:

method : {'average', 'min', 'max', 'first', 'dense'}

average: average rank of group
min: lowest rank in group
max: highest rank in group
first: ranks assigned in order they appear in the array
dense: like ‘min’, but rank always increases by 1 between groups
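
The differences between these methods only show up when there are ties. The sketch below uses a small made-up Series (not from the question) in which one date appears twice, and collects the ranks produced by each method.

```python
import pandas as pd

# Hypothetical data with a duplicated date (index 0 and index 2) to show ties.
dates = pd.Series(
    pd.to_datetime(["2002-02-03", "1979-01-01", "2002-02-03", "2008-07-16"])
)

# Compute the ranks once per method and keep them for comparison.
results = {}
for method in ["average", "min", "max", "first", "dense"]:
    results[method] = dates.rank(method=method).tolist()
    print(method, results[method])
```

With no duplicate dates, as in the question's sample, all five methods give the same ranks.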

Answer 2

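Try converting the datetime Series from tslib.Timestamp using to_datetime() or to_pydatetime().
Create a column for the original index (df['org_ind'] = np.arange(len(df))) and then do df.sort_values(by='foo', ascending=True)

You will get the dates in chronological order along with the original index...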
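
The answer stops before turning the sorted frame into ranks, so the following is only one possible way to finish the idea, not necessarily what the answerer had in mind. It assumes the dates are in a column named 'foo' (as in the answer); the 'rank' column name is hypothetical.

```python
import numpy as np
import pandas as pd

# Sample values copied from the question; the column name 'foo' follows the answer.
df = pd.DataFrame({
    "foo": pd.to_datetime(
        ["2002-02-03", "1979-01-01", "2006-12-25", "2008-07-16", "2005-05-30"]
    )
})

# Remember each row's original position before sorting.
df["org_ind"] = np.arange(len(df))

# Sort chronologically; a row's position in the sorted frame is its rank.
sorted_df = df.sort_values(by="foo", ascending=True)
sorted_df["rank"] = np.arange(len(sorted_df))  # 'rank' is a hypothetical name

# Restore the original row order to read the ranks by original index.
result = sorted_df.sort_values(by="org_ind")["rank"]
print(result)
```

Unlike Series.rank with method='dense', this gives tied dates different positions, so it matches Answer 1 only when all dates are distinct.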


python

datetime

series

indices

pandas