DateDiff in Excel/Python
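I have the following table, and for each Name I want to know the total number of days employed and the current status (employed or not employed).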
Date Name EmploymentType
01-1-18 A Hired
10-1-18 A Fired
11-1-18 A Hired
15-1-18 A Fired
25-2-18 A Hired
25-2-18 B Hired
05-2-18 C Hired
15-2-18 C Fired
I would like the following result:
Total Days Employed Name Current Status
15 A Employed
0 B Employed
10 C Not Employed
It would be great to know how to do this in Google Sheets or Python; a solution in either would be appreciated.
This is not the most elegant solution, but you can build on it or at least get a sense of the logic behind it.
import pandas as pd

df = pd.DataFrame({"date": ["2018-01-01", "2018-01-10", "2018-01-11",
                            "2018-01-15", "2018-02-25", "2018-02-25",
                            "2018-02-05", "2018-02-15"],
                   "name": ["a"]*5 + ["b"] + ["c"]*2,
                   "status": ["hired", "fired", "hired", "fired",
                              "hired", "hired", "hired", "fired"]})

def fun(x):
    # put each person's events in chronological order
    x = x.sort_values("date")\
         .reset_index(drop=True)
    res = [None]*2
    # this tells you the last status
    res[0] = x["status"].iloc[-1]
    # here we count the days between each hired and the following fired
    res[1] = x["date"].diff().dt.days.values[1::2].sum()
    return res

# convert the date strings to datetimes so that diff() works
df["date"] = pd.to_datetime(df["date"])
out = df.groupby("name").apply(lambda x: fun(x)).reset_index()
out[["status", "days"]] = out[0].apply(pd.Series)
del out[0]
out
name status days
0 a hired 13.0
1 b hired 0.0
2 c fired 10.0
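To make the slicing inside fun a bit more concrete, here is a small sketch for employee a alone. It assumes the events strictly alternate hired, fired, hired, ... and start with hired; the variable names are only for illustration.

```python
import pandas as pd

# Events for employee "a", already in chronological order.
dates = pd.Series(pd.to_datetime(
    ["2018-01-01", "2018-01-10", "2018-01-11", "2018-01-15", "2018-02-25"]))

# Days between each event and the previous one (the first entry is NaN).
gaps = dates.diff().dt.days

# Because the events alternate hired, fired, hired, ..., the gaps at
# positions 1, 3, ... are the ones that end in a "fired" event,
# i.e. the completed employment stints.
employed_gaps = gaps.values[1::2]
total_days = employed_gaps.sum()
print(employed_gaps, total_days)
```

The gaps at the even positions are the periods between a fired and the next hired, so they are skipped. The final hired on 2018-02-25 has no matching fired and therefore adds nothing to the total.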
If the employee is still employed, I would consider adding today's date as the end of the open period.
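A rough sketch of that idea follows. The helper employment_summary and its as_of parameter are hypothetical names, and a fixed reference date is used here so the result is reproducible; in practice you might pass pd.Timestamp.today().normalize() instead. It also assumes each person's events alternate hired, fired, hired, ... starting with hired.

```python
import pandas as pd

def employment_summary(events, as_of):
    """Return (last status, total days employed) for one person's events.

    Assumes the events alternate hired, fired, hired, ... starting with hired.
    """
    events = events.sort_values("date")
    dates = list(events["date"])
    last_status = events["status"].iloc[-1]
    if last_status == "hired":
        # Still employed: close the open stint at the reference date.
        dates.append(pd.Timestamp(as_of))
    # Pair every hire date with the end date that follows it.
    days = sum((end - start).days
               for start, end in zip(dates[0::2], dates[1::2]))
    return last_status, days

df = pd.DataFrame({"date": ["2018-01-01", "2018-01-10", "2018-01-11",
                            "2018-01-15", "2018-02-25", "2018-02-25",
                            "2018-02-05", "2018-02-15"],
                   "name": ["a"]*5 + ["b"] + ["c"]*2,
                   "status": ["hired", "fired", "hired", "fired",
                              "hired", "hired", "hired", "fired"]})
df["date"] = pd.to_datetime(df["date"])

as_of = "2018-03-01"  # assumed reference date, stands in for "today"
summary = {name: employment_summary(group, as_of)
           for name, group in df.groupby("name")}
print(summary)
```

With this reference date, a and b each pick up the 4 days from 2018-02-25 to 2018-03-01, while c is unchanged because the last event is a fired.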