excel/python 中的 DateDiff

DateDiff in excel/python

我有以下 table,其中每个 Name 我想知道就业的总天数和当前状态(就业或未就业)。

Date       Name    EmploymentType
01-1-18    A       Hired
10-1-18    A       Fired
11-1-18    A       Hired
15-1-18    A       Fired
25-2-18    A       Hired
25-2-18    B       Hired
05-2-18    C       Hired
15-2-18    C       Fired

我想要以下结果:

Total Days Employed    Name    Current Status
15                     A       Employed
0                      B       Employed
10                     C       Not Employed

如果我能知道如何在 Google 张或 python 中做到这一点,那就太好了,两者都很感激。

这不是最优雅的解决方案,但您可以从中发挥作用或稍微了解一下背后的逻辑

import pandas as pd
df =  pd.DataFrame({"date":["2018-01-01", "2018-01-10","2018-01-11",
                            "2018-01-15", "2018-02-25","2018-02-25",
                            "2018-02-05", "2018-02-15"],
                    "name":["a"]*5+["b"]+["c"]*2,
                    "status":['hired', "fired","hired", "fired",
                              "hired", "hired", "hired", "fired"]})

def fun(x):
    x = x.sort_values("date")\
         .reset_index(drop=True)
    res =[None]*2
    # this tell you the last status
    res[0] = x["status"].iloc[-1]
    # here we count days between any hired and fired
    res[1] = x["date"].diff().dt.days.values[1::2].sum()
    return(res)


df["date"] =  df["date"].astype("M8[us]")

out = df.groupby("name").apply(lambda x: fun(x)).reset_index()

out[["status", "days"]] = out[0].apply(pd.Series)
del out[0]
out
      name status  days
0    a  hired  13.0
1    b  hired   0.0
2    c  fired  10.0

如果员工仍在工作,我会考虑添加今天的日期。