Calculating Tenure Day in R
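I asked a similar question a little while ago (see) but could not get the right result, so I am now asking the same thing in a different way that may be easier to solve.
Problem: I want to create a column that tells me the tenure day of each client. Here is some mock code: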
library(lubridate)  # provides dmy() and yday()

Date <- c("01/01/2018", "12/02/2018", "10/03/2018", "22/03/2018", "29/03/2018", "01/04/2018", "02/04/2018", "04/04/2018", "07/04/2018", "11/04/2018", "15/04/2018", "17/04/2018", "19/04/2018", "21/04/2018", "22/04/2018", "29/04/2018", "01/05/2018", "03/05/2018", "08/05/2018", "10/05/2018", "12/05/2018")
ClientID <- c("aaa", "bbb", "ccc", "ddd", "eee", "fff", "ggg", "aaa", "bbb", "ccc", "ddd", "eee", "fff", "ggg", "aaa", "bbb", "ccc", "ddd", "eee", "fff", "ggg")
df <- data.frame(ClientID, Date, stringsAsFactors = FALSE)
df$Date <- dmy(df$Date)       # parse day/month/year strings into Date objects
df$yearDay <- yday(df$Date)   # day of the year, counting from 1 January = 1
Which gives you something like this:
df
ClientID Date yearDay
aaa 2018-01-01 1
bbb 2018-02-12 43
ccc 2018-03-10 69
ddd 2018-03-22 81
eee 2018-03-29 88
fff 2018-04-01 91
ggg 2018-04-02 92
aaa 2018-04-04 94
bbb 2018-04-07 97
ccc 2018-04-11 101
ddd 2018-04-15 105
eee 2018-04-17 107
fff 2018-04-19 109
ggg 2018-04-21 111
aaa 2018-04-22 112
bbb 2018-04-29 119
ccc 2018-05-01 121
ddd 2018-05-03 123
eee 2018-05-08 128
fff 2018-05-10 130
ggg 2018-05-12 132
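What I want to do now (but am not sure how) is, for each ClientID, take the yearDay number of the second instance and subtract the yearDay of the previous instance, then take the yearDay of the third instance and subtract the yearDay of the previous instance, and so on (I have over four million rows of data). The result should give me the tenure day. It would look like this: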
ClientID Date yearDay tenureDay
aaa 2018-01-01 1 1
bbb 2018-02-12 43 1
ccc 2018-03-10 69 1
ddd 2018-03-22 81 1
eee 2018-03-29 88 1
fff 2018-04-01 91 1
ggg 2018-04-02 92 1
aaa 2018-04-04 94 93
bbb 2018-04-07 97 54
ccc 2018-04-11 101 32
ddd 2018-04-15 105 24
eee 2018-04-17 107 19
fff 2018-04-19 109 18
ggg 2018-04-21 111 19
Any idea how I would achieve this?
Thanks in advance!!!
For this, you can use a combination of mutate(), arrange(), lag() and group_by() from the dplyr package.
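library(dplyr)

df %>%
  group_by(ClientID) %>%
  arrange(yearDay) %>%
  # lag() returns the previous row's yearDay within each ClientID group;
  # the first row of each client has no previous row, so tenureDay is NA there
  mutate(tenureDay = yearDay - lag(yearDay))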
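Note that the snippet above leaves NA on each client's first row, whereas the expected output in the question shows 1 there. Also, a difference of yearDay values would go wrong if a client's rows spanned a change of calendar year. A minimal sketch of one way around both, using base R only and subtracting the Date values directly, could look like this. Filling the first row with 1 is an assumption taken from the question's expected output, and the data is rebuilt with as.Date() so that the block runs without lubridate:

```r
# Mock data from the question, rebuilt with base R only
df <- data.frame(
  ClientID = rep(c("aaa", "bbb", "ccc", "ddd", "eee", "fff", "ggg"), times = 3),
  Date = as.Date(
    c("01/01/2018", "12/02/2018", "10/03/2018", "22/03/2018", "29/03/2018",
      "01/04/2018", "02/04/2018", "04/04/2018", "07/04/2018", "11/04/2018",
      "15/04/2018", "17/04/2018", "19/04/2018", "21/04/2018", "22/04/2018",
      "29/04/2018", "01/05/2018", "03/05/2018", "08/05/2018", "10/05/2018",
      "12/05/2018"),
    format = "%d/%m/%Y"
  ),
  stringsAsFactors = FALSE
)

# Order each client's rows chronologically
df <- df[order(df$ClientID, df$Date), ]

# Within each client: days elapsed since that client's previous row.
# The first row has no predecessor and is set to 1 (assumption, see above).
df$tenureDay <- ave(
  as.numeric(df$Date),           # dates as a count of days
  df$ClientID,                   # grouping variable
  FUN = function(d) c(1, diff(d))
)

print(df)
```

Because the subtraction is done on whole dates rather than on the day of the year, the same code should keep working when a client's history crosses 31 December.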