R: check multiple variables using a 'for' loop and 'case_when'
I have a dataset DT as follows:
category: a number from 1 to 9
xxx, yyy, zzz: binary (0, 1)
category xxx yyy zzz
8 1 0 0
1 0 0 0
4 0 1 1
9 0 0 1
8 0 1 0
I want to check multiple conditions using a 'for' loop and 'case_when'.
In the end, I would like the data to look like this:
category xxx yyy zzz result_xxx result_yyy result_zzz
8 1 0 0 8 0 0
1 0 0 0 0 0 0
4 0 1 1 0 4 4
9 0 0 1 0 0 9
8 0 1 0 0 8 0
To do this, I wrote the code below:
condition.vars <- c("xxx", "yyy", "zzz")
for(i in condition.vars){
  browser()
  DT <- DT[, condition:= case_when(
    ([[i]] == 1 & category ==1) ~ 1,
    ([[i]] == 1 & category ==2) ~ 2,
    ([[i]] == 1 & category ==3) ~ 3,
    ([[i]] == 1 & category ==4) ~ 4,
    ([[i]] == 1 & category ==5) ~ 5,
    ([[i]] == 1 & category ==6) ~ 6,
    ([[i]] == 1 & category ==7) ~ 7,
    ([[i]] == 1 & category ==8) ~ 8,
    ([[i]] == 1 & category ==9) ~ 9,
    TRUE ~ 0
  )]
  setnames(DT, "condition", paste0("result", i))
}
As you might expect, it doesn't work.
Could you help me correct my code?
You don't need a for loop or case_when. If you have a data frame, you can simplify it to:
condition.vars <- c("xxx", "yyy", "zzz")
DT[paste0('result_', condition.vars)] <- DT$category * DT[condition.vars]
# category xxx yyy zzz result_xxx result_yyy result_zzz
#1 8 1 0 0 8 0 0
#2 1 0 0 0 0 0 0
#3 4 0 1 1 0 4 4
#4 9 0 0 1 0 0 9
#5 8 0 1 0 0 8 0
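The multiplication works because every flag is 0 or 1, so category * flag equals category where the flag is set and 0 elsewhere, which is what the nine case_when branches spell out one by one. Below is a minimal reproducible sketch, assuming DT is a plain data.frame built from the sample rows in the question (the question does not show how DT was created, and check_xxx is a name introduced here only for the cross-check).

```r
# Sample data from the question, built as a plain data.frame (an assumption).
DT <- data.frame(
  category = c(8, 1, 4, 9, 8),
  xxx = c(1, 0, 0, 0, 0),
  yyy = c(0, 0, 1, 0, 1),
  zzz = c(0, 0, 1, 1, 0)
)
condition.vars <- c("xxx", "yyy", "zzz")

# A 0/1 flag times category gives category where the flag is 1, else 0.
DT[paste0("result_", condition.vars)] <- DT$category * DT[condition.vars]

# The same rule written with ifelse() for one column, as a cross-check.
check_xxx <- ifelse(DT$xxx == 1, DT$category, 0)
stopifnot(all(DT$result_xxx == check_xxx))
```

Note that this shortcut relies on the flags being strictly 0 or 1; with other values the product would no longer match the case_when logic.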
If DT is a data.table, you can do this:
library(data.table)
DT[, paste0('result_', condition.vars) := category * .SD, .SDcols = condition.vars]
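If a loop is still preferred, the attempt in the question mainly needs a way to turn the string i into a column reference, since a bare [[i]] is not valid R. Below is a hedged sketch of one way to do that, assuming the data.table package and numeric columns; get(i) and (new_col) := are ordinary data.table idioms, while new_col is a name introduced here for illustration.

```r
library(data.table)

# Sample data from the question, built as a data.table (an assumption).
DT <- data.table(
  category = c(8, 1, 4, 9, 8),
  xxx = c(1, 0, 0, 0, 0),
  yyy = c(0, 0, 1, 0, 1),
  zzz = c(0, 0, 1, 1, 0)
)
condition.vars <- c("xxx", "yyy", "zzz")

for (i in condition.vars) {
  # Build the target name, e.g. "result_xxx" (note the underscore).
  new_col <- paste0("result_", i)
  # get(i) looks up the column named by the string i;
  # (new_col) := assigns by reference to the column named in new_col.
  DT[, (new_col) := ifelse(get(i) == 1, category, 0)]
}
```

Because := modifies DT by reference, there is no need to reassign the result with DT <- inside the loop or to rename a temporary column afterwards.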
We can use the tidyverse:
library(dplyr)
DT %>%
  mutate(across(c(xxx, yyy, zzz), ~ category * .,
                .names = "result_{.col}"))
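For readers who want to keep both the for loop and case_when from the question, the sketch below shows one possible form. It assumes dplyr 1.0 or later and a plain data.frame; the .data[[i]] pronoun refers to the column named by the string i, and the glue-style name "result_{i}" on the left of := builds the new column name. The nine branches collapse into one because each returns category itself.

```r
library(dplyr)

# Sample data from the question, built as a plain data.frame (an assumption).
DT <- data.frame(
  category = c(8, 1, 4, 9, 8),
  xxx = c(1, 0, 0, 0, 0),
  yyy = c(0, 0, 1, 0, 1),
  zzz = c(0, 0, 1, 1, 0)
)
condition.vars <- c("xxx", "yyy", "zzz")

for (i in condition.vars) {
  DT <- DT %>%
    mutate(
      # One branch replaces the nine in the question: flag set -> category.
      "result_{i}" := case_when(
        .data[[i]] == 1 ~ category,
        TRUE ~ 0
      )
    )
}
```

Both branches of case_when return doubles here; if category were stored as an integer, the fallback would need to be 0L to keep the types consistent.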
Here is a data.table option:
setDT(df)[,
  c(
    df,
    setNames(category * .SD, paste0("result_", names(.SD)))
  ),
  .SDcols = xxx:zzz
]
This gives:
category xxx yyy zzz result_xxx result_yyy result_zzz
1: 8 1 0 0 8 0 0
2: 1 0 0 0 0 0 0
3: 4 0 1 1 0 4 4
4: 9 0 0 1 0 0 9
5: 8 0 1 0 0 8 0