dplyr: Replace multiple values based on condition in a selection of columns
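I am trying to conditionally replace multiple values in a data frame.
In the dataset below, I want to replace every 2 with "X" and every 3 with "Y" in columns 3:5, but only in rows where measure == "led".
Condition: measure == "led"
Replacement: value 2 becomes "X" and value 3 becomes "Y" (in columns 3:5)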
library(data.table)
dt <- data.table(measure = sample(c('cfl', 'led', 'linear', 'exit'), 20, replace = TRUE),
                 site = sample(1:6, 20, replace = TRUE),
                 space = sample(1:4, 20, replace = TRUE),
                 qty = sample(1:6, 20, replace = TRUE),
                 qty.exit = sample(1:6, 20, replace = TRUE),
                 cf = sample(1:6, 20, replace = TRUE))
Is there a simple dplyr solution? Many thanks!
A dplyr solution:
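library(dplyr)
# In columns 3:5, turn 2 into "X" and 3 into "Y" for the "led" rows only.
# The anchors ^ and $ keep values such as 12 or 23 from being partially replaced;
# as.character() in both branches keeps the column type consistent.
dt %>%
  mutate(across(3:5, ~ ifelse(
    measure == "led",
    stringr::str_replace_all(as.character(.), c("^2$" = "X", "^3$" = "Y")),
    as.character(.)
  )))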
Result:
measure site space qty qty.exit cf
1: led 4 1 4 6 3
2: exit 4 2 1 4 6
3: cfl 1 4 6 2 3
4: linear 3 4 1 3 5
5: cfl 5 1 6 1 6
6: exit 4 3 2 6 4
7: exit 5 1 4 2 5
8: exit 1 4 3 6 4
9: linear 3 1 5 4 1
10: led 4 1 1 1 1
11: exit 5 4 3 5 2
12: cfl 4 2 4 5 5
13: led 4 X Y Y 4
...
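As a possible alternative that avoids string patterns altogether, the same conditional replacement can be sketched with dplyr::case_when(), which compares whole values. The data frame df below is a made-up toy example (not the question's dt); it includes 12 and 23 to show that only exact matches are replaced.

```r
library(dplyr)

# Hypothetical toy data for illustration only
df <- data.frame(
  measure  = c("led", "led", "cfl", "led"),
  site     = c(1L, 2L, 3L, 4L),
  space    = c(2L, 12L, 2L, 3L),
  qty      = c(3L, 23L, 3L, 1L),
  qty.exit = c(1L, 2L, 2L, 5L)
)

# Columns 3:5 become character; only exact 2s and 3s in "led" rows change
out <- df %>%
  mutate(across(3:5, ~ case_when(
    measure == "led" & .x == 2 ~ "X",
    measure == "led" & .x == 3 ~ "Y",
    TRUE ~ as.character(.x)
  )))

out
```

Every branch of case_when() returns a character value, so the resulting columns are character regardless of whether any row matches.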
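For the sake of completeness, here is a data.table solution that uses a look-up table and an update join.
All elements of a column must have the same type, so the numeric columns have to be coerced to character before the join.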
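To make the type constraint concrete, here is a minimal sketch on a made-up vector (qty, mixed and qty_chr are illustrative names, not part of the question's data):

```r
# Hypothetical toy vector for illustration only
qty <- c(1L, 2L, 3L)

# Mixing an integer with a string coerces the whole vector to character
mixed <- c(qty[1], "X")
class(mixed)

# So convert first, then replace: the column is character throughout
qty_chr <- as.character(qty)
qty_chr[qty_chr == "2"] <- "X"
qty_chr
```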
library(data.table)
# create look-up table
lut <- fread(
"measure, old, new
led, 2, X
led, 3, Y",
colClasses = c(old = "character")
)
# perform coercion and update join for each column
for(col in colnames(dt)[3:5]) {
set(dt, , col, as.character(dt[[col]]))
dt[lut, on = c("measure", paste0(col, "==old")), (col) := new][]
}
dt
measure site space qty qty.exit cf
1: cfl 3 4 4 3 3
2: cfl 1 2 3 5 5
3: cfl 1 2 6 5 5
4: cfl 3 1 5 4 6
5: led 4 4 X 6 3
6: exit 5 1 6 5 6
7: led 5 4 X 4 4
8: led 5 X X 6 5
9: cfl 4 4 5 2 1
10: exit 2 2 1 2 4
11: linear 4 3 1 1 1
12: exit 3 4 4 2 1
13: linear 2 3 5 5 5
14: exit 1 1 2 6 6
15: cfl 2 1 1 5 3
16: cfl 6 2 5 4 1
17: led 3 X 4 1 2
18: exit 6 2 4 4 5
19: led 2 X 1 X 6
20: led 4 X Y X 1
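The update join inside the loop can also be looked at in isolation. The following is a minimal sketch on a made-up two-column table (toy and toy_lut are assumptions for illustration). It writes the join condition as a named vector, which should be equivalent to the "qty==old" string form used in the loop.

```r
library(data.table)

# Hypothetical toy table; qty is already character
toy <- data.table(measure = c("led", "cfl", "led"), qty = c("2", "2", "3"))

# Look-up table: which old value maps to which new value, per measure
toy_lut <- data.table(measure = "led", old = c("2", "3"), new = c("X", "Y"))

# Update join: rows of toy that match toy_lut on measure and qty == old
# get qty overwritten by reference; i.new is the look-up table's new column
toy[toy_lut, on = c(measure = "measure", qty = "old"), qty := i.new]

toy
```

Rows without a match in the look-up table (here the "cfl" row) are left untouched, which is what restricts the replacement to measure == "led".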
Data
library(data.table)
set.seed(42) # required for reproducible data
dt <- data.table(measure = sample(c('cfl', 'led', 'linear', 'exit'), 20, replace = TRUE),
                 site = sample(1:6, 20, replace = TRUE),
                 space = sample(1:4, 20, replace = TRUE),
                 qty = sample(1:6, 20, replace = TRUE),
                 qty.exit = sample(1:6, 20, replace = TRUE),
                 cf = sample(1:6, 20, replace = TRUE))