R conditional mutate date column using groupby and fill
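There are similarly worded questions out there, but none show what I am trying to do. I have an example data frame below. I want to group_by ID and create a Date2 column that holds the Date of the row where Rank is 2. I am having a hard time figuring this out.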
ID Rank Date Date2
1 5678 1 2000-01-01 2010-05-02
2 5678 2 2010-05-02 2010-05-02
3 1234 1 2000-01-01 2015-06-03
4 1234 2 2015-06-03 2015-06-03
Here is what I have so far:
df <- df %>% group_by(ID) %>% fill(Date2, .direction = 'up')
How can I do this?
Try this:
library(dplyr)
#Code
df %>% group_by(ID) %>% mutate(Date2=Date[Rank==2])
Output:
# A tibble: 4 x 4
# Groups: ID [2]
ID Rank Date Date2
<int> <int> <chr> <chr>
1 5678 1 2000-01-01 2010-05-02
2 5678 2 2010-05-02 2010-05-02
3 1234 1 2000-01-01 2015-06-03
4 1234 2 2015-06-03 2015-06-03
Data used:
#Data
df <- structure(list(ID = c(5678L, 5678L, 1234L, 1234L), Rank = c(1L,
2L, 1L, 2L), Date = c("2000-01-01", "2010-05-02", "2000-01-01",
"2015-06-03")), row.names = c("1", "2", "3", "4"), class = "data.frame")
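Also, if you want to use fill(), you can try this code. You have to use a condition such as ifelse() to assign the date on the Rank 2 rows (leaving NA elsewhere), and then fill the values: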
#Code 2
library(tidyr) # fill() comes from tidyr, not dplyr
df %>% group_by(ID) %>%
  mutate(Date2=ifelse(Rank==2,Date,NA)) %>%
  fill(Date2,.direction = 'up')
Output:
# A tibble: 4 x 4
# Groups: ID [2]
ID Rank Date Date2
<int> <int> <chr> <chr>
1 5678 1 2000-01-01 2010-05-02
2 5678 2 2010-05-02 2010-05-02
3 1234 1 2000-01-01 2015-06-03
4 1234 2 2015-06-03 2015-06-03
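One caveat, in case Date is stored as a real Date rather than as character (an assumption; the sample data here uses character strings): base ifelse() drops attributes, so the result comes back as plain numbers. A minimal sketch of the difference on two made-up vectors, with replace() as one class-preserving alternative:

```r
# Assumption: dates are of class Date, unlike the character column in the sample data
dates <- as.Date(c("2000-01-01", "2010-05-02"))
rank  <- c(1L, 2L)

# ifelse() strips the Date class and returns the underlying day counts
via_ifelse <- ifelse(rank == 2, dates, NA)
class(via_ifelse)   # "numeric"

# replace() blanks out the rows that are not Rank 2 and keeps the Date class
via_replace <- replace(dates, rank != 2, NA)
class(via_replace)  # "Date"
```

In the pipeline that would mean something like mutate(Date2 = replace(Date, Rank != 2, NA)) before fill(); dplyr::if_else(Rank == 2, Date, as.Date(NA)) should be another option.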
We can group by 'ID' and use a logical expression to pick out the corresponding 'Date':
library(dplyr)
df %>%
group_by(ID) %>%
mutate(Date2 = Date[Rank == 2][1])
# A tibble: 4 x 4
# Groups: ID [2]
# ID Rank Date Date2
# <int> <int> <chr> <chr>
#1 5678 1 2000-01-01 2010-05-02
#2 5678 2 2010-05-02 2010-05-02
#3 1234 1 2000-01-01 2015-06-03
#4 1234 2 2015-06-03 2015-06-03
Or another option is to use match:
df %>%
group_by(ID) %>%
mutate(Date2 = Date[match(2, Rank)])
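Both variants behave differently from a bare Date[Rank == 2] in the edge cases. A small base R sketch on single-group vectors (the values are made up for illustration, not taken from the question) shows what each expression returns when a group has no Rank 2 row or has more than one:

```r
# Hypothetical single-group vectors
date_none <- c("2000-01-01", "2001-01-01")
rank_none <- c(1L, 3L)                      # no Rank 2 row
date_dup  <- c("2000-01-01", "2010-05-02", "2012-07-04")
rank_dup  <- c(1L, 2L, 2L)                  # two Rank 2 rows

# Bare logical subsetting gives length 0 or length 2 here,
# which cannot be recycled to these group sizes inside mutate()
n_none <- length(date_none[rank_none == 2])   # 0
n_dup  <- length(date_dup[rank_dup == 2])     # 2

# [1] and match() always yield exactly one value:
# NA when Rank 2 is absent, the first match otherwise
first_none <- date_none[rank_none == 2][1]    # NA
match_none <- date_none[match(2, rank_none)]  # NA
first_dup  <- date_dup[rank_dup == 2][1]      # "2010-05-02"
match_dup  <- date_dup[match(2, rank_dup)]    # "2010-05-02"
```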
Or using data.table:
library(data.table)
setDT(df)[, Date2 := Date[match(2, Rank)], ID]
Or base R:
df$Date2 <- with(df, rep(Date[Rank == 2], table(ID)))
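The rep()/table() one-liner works on the sample data, but it appears to rely on each ID having exactly one Rank 2 row and on the repeat counts lining up with the order in which the IDs appear; table() returns its counts sorted by ID value. A sketch of an order-independent base R lookup, using made-up data with unequal group sizes and an illustrative helper name rank2:

```r
# Hypothetical data: group 5678 has three rows, group 1234 has two
df <- data.frame(
  ID   = c(5678L, 5678L, 5678L, 1234L, 1234L),
  Rank = c(1L, 2L, 3L, 1L, 2L),
  Date = c("2000-01-01", "2010-05-02", "2011-01-01",
           "2000-01-01", "2015-06-03"),
  stringsAsFactors = FALSE
)

# The one-liner pairs the counts (2 for 1234, 3 for 5678) with the
# dates in row order (5678 first), so row 3 gets the wrong date
old <- with(df, rep(Date[Rank == 2], table(ID)))

# Look up each row's ID among the Rank 2 rows instead
rank2 <- df[df$Rank == 2, ]
df$Date2 <- rank2$Date[match(df$ID, rank2$ID)]
```

IDs without a Rank 2 row get NA from match(), which mirrors the dplyr match() version.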
Data
df <- structure(list(ID = c(5678L, 5678L, 1234L, 1234L), Rank = c(1L,
2L, 1L, 2L), Date = c("2000-01-01", "2010-05-02", "2000-01-01",
"2015-06-03")), row.names = c("1", "2", "3", "4"), class = "data.frame")