Mutate according to conditions on multiple rows in R
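How can I evaluate a condition across multiple rows in dplyr?
I have a dataset that I want to mutate based on conditions (transitions) that occur over several time periods.
In the example below, if a person has passed through a 'Bad' status at any point, they must be considered 'Bad' overall. I tried mutate_if, but it does not work, or maybe I do not understand the syntax.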
df <-
data.frame(ID = c(1,1,1,2,2,2,3,3,3),
Date= c(1,2,3,1,2,3,1,2,3),
Money = c(500,400,500,100,100,100,200,300,300),
Status = c("Good", "Bad", "Good", "Good","Good","Good", "Bad","Good","Good"))
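Can you provide a solution that produces the following result? If possible, I would prefer to stay within dplyr, although I know data.table can handle this kind of processing nicely.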
result <-
data.frame(ID = c(1,1,1,2,2,2,3,3,3),
Date= c(1,2,3,1,2,3,1,2,3),
Money = c(500,400,500,100,100,100,200,300,300),
Status = c("Good", "Bad", "Good", "Good","Good","Good", "Bad","Good","Good"),
Status_overall = c("Bad", "Bad", "Bad", "Good","Good","Good", "Bad","Bad","Bad"))
You can return 'Bad' if any Status is 'Bad' within each ID.
library(dplyr)
df %>%
group_by(ID) %>%
mutate(Status_overall = if(any(Status == 'Bad')) 'Bad' else 'Good')
#Without if/else
#mutate(Status_overall = c('Good', 'Bad')[any(Status == 'Bad') + 1])
# ID Date Money Status Status_overall
# <dbl> <dbl> <dbl> <chr> <chr>
#1 1 1 500 Good Bad
#2 1 2 400 Bad Bad
#3 1 3 500 Good Bad
#4 2 1 100 Good Good
#5 2 2 100 Good Good
#6 2 3 100 Good Good
#7 3 1 200 Bad Bad
#8 3 2 300 Good Bad
#9 3 3 300 Good Bad
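One caveat with the grouped `any()` check: if `Status` can contain missing values, `any(Status == 'Bad')` returns `NA` for a group that has an `NA` and no 'Bad' row, and `if (NA)` raises an error. The following is a minimal sketch of one way to guard against that, assuming missing statuses should simply be ignored. The `df_na` data frame is hypothetical and not part of the question.

```r
library(dplyr)

# Hypothetical data (not from the question): ID 2 has a missing Status and no 'Bad' row
df_na <- data.frame(
  ID     = c(1, 1, 2, 2),
  Status = c("Good", "Bad", NA, "Good")
)

out <- df_na %>%
  group_by(ID) %>%
  # na.rm = TRUE drops the NA comparisons, so any() yields TRUE or FALSE, never NA
  mutate(Status_overall = if (any(Status == "Bad", na.rm = TRUE)) "Bad" else "Good") %>%
  ungroup()

out$Status_overall
```

Whether a missing status should count as 'Good', 'Bad', or stay `NA` depends on the data, so treat the `na.rm = TRUE` choice as an assumption.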
The same grouped check can be written in base R and data.table as:
df$Status_overall <- with(df, ifelse(ave(Status == 'Bad', ID, FUN = any), 'Bad', 'Good'))
library(data.table)
setDT(df)[, Status_overall := if(any(Status == 'Bad')) 'Bad' else 'Good', ID]
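As a possible alternative in base R that avoids grouping functions altogether, one could first collect the IDs that have at least one 'Bad' row and then test membership with `%in%`. This is a sketch of a different technique than the `ave()` call above, not something proposed in the thread; `bad_ids` is a helper name introduced here for illustration.

```r
# Same ID and Status columns as in the question
df <- data.frame(
  ID     = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  Status = c("Good", "Bad", "Good", "Good", "Good", "Good", "Bad", "Good", "Good")
)

# IDs that have at least one 'Bad' row (bad_ids is an illustrative name)
bad_ids <- unique(df$ID[df$Status == "Bad"])

# Every row whose ID is in that set is 'Bad' overall
df$Status_overall <- ifelse(df$ID %in% bad_ids, "Bad", "Good")

df$Status_overall
```

Because it only uses vectorized subsetting and `%in%`, this version needs no packages.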
Does this work for you:
library(dplyr)
df %>% group_by(ID) %>%
mutate(Status_overall = case_when('Bad' %in% Status ~ 'Bad', TRUE ~ 'Good'))
# A tibble: 9 x 5
# Groups: ID [3]
#      ID  Date Money Status Status_overall
#   <dbl> <dbl> <dbl> <chr>  <chr>
# 1     1     1   500 Good   Bad
# 2     1     2   400 Bad    Bad
# 3     1     3   500 Good   Bad
# 4     2     1   100 Good   Good
# 5     2     2   100 Good   Good
# 6     2     3   100 Good   Good
# 7     3     1   200 Bad    Bad
# 8     3     2   300 Good   Bad
# 9     3     3   300 Good   Bad
We can use if and %in%:
library(dplyr)
df %>%
group_by(ID) %>%
mutate(Status_overall = if('Bad' %in% Status) 'Bad' else 'Good')
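To check that the dplyr variants in this thread agree with the `Status_overall` column the question asks for, a small comparison like the one below may help. It is only a sketch: the `expected` vector is copied from the question's `result`, and the helper names `with_any`, `with_in`, and `with_case_when` are introduced here for illustration.

```r
library(dplyr)

df <- data.frame(
  ID     = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  Date   = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  Money  = c(500, 400, 500, 100, 100, 100, 200, 300, 300),
  Status = c("Good", "Bad", "Good", "Good", "Good", "Good", "Bad", "Good", "Good")
)

# Status_overall column from the question's desired result
expected <- c("Bad", "Bad", "Bad", "Good", "Good", "Good", "Bad", "Bad", "Bad")

# Variant 1: any() on a per-group comparison
with_any <- df %>%
  group_by(ID) %>%
  mutate(Status_overall = if (any(Status == "Bad")) "Bad" else "Good") %>%
  ungroup()

# Variant 2: %in% membership test per group
with_in <- df %>%
  group_by(ID) %>%
  mutate(Status_overall = if ("Bad" %in% Status) "Bad" else "Good") %>%
  ungroup()

# Variant 3: case_when() with a length-one condition per group
with_case_when <- df %>%
  group_by(ID) %>%
  mutate(Status_overall = case_when("Bad" %in% Status ~ "Bad", TRUE ~ "Good")) %>%
  ungroup()

# Stops with an error if any variant disagrees with the expected column
stopifnot(
  identical(with_any$Status_overall, expected),
  identical(with_in$Status_overall, expected),
  identical(with_case_when$Status_overall, expected)
)
```

In each variant the expression returns a single value per group, which `mutate()` recycles to every row of that group.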