dplyr: Replace multiple values based on condition in a selection of columns

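I am trying to conditionally replace multiple values in a data frame.

In the data set below, I would like to replace all values of 2 with "X" and all values of 3 with "Y" in columns 3:5, but only in rows where measure == "led".

Condition: measure == "led". Replacement: value 2 becomes "X" and value 3 becomes "Y" (in columns 3:5).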

library(data.table)
dt <- data.table(measure = sample(c('cfl', 'led', 'linear', 'exit'), 20, replace = TRUE),
                 site = sample(1:6, 20, replace = TRUE),
                 space = sample(1:4, 20, replace = TRUE),
                 qty = sample(1:6, 20, replace = TRUE),
                 qty.exit = sample(1:6, 20, replace = TRUE),
                 cf = sample(1:6, 20, replace = TRUE))

Is there a simple dplyr solution? Many thanks!

A dplyr solution:

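library(dplyr)
# In columns 3 to 5, replace 2 with "X" and 3 with "Y" where measure == "led".
# The affected columns become character, since one column cannot mix types.
# The patterns are anchored (^...$) so that only whole values are replaced.
dt %>%
  mutate(across(3:5, ~ ifelse(measure == "led", stringr::str_replace_all(
    as.character(.),
    c("^2$" = "X", "^3$" = "Y")
  ), .)))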

Result:

   measure site space qty qty.exit cf
 1:     led    4     1   4        6  3
 2:    exit    4     2   1        4  6
 3:     cfl    1     4   6        2  3
 4:  linear    3     4   1        3  5
 5:     cfl    5     1   6        1  6
 6:    exit    4     3   2        6  4
 7:    exit    5     1   4        2  5
 8:    exit    1     4   3        6  4
 9:  linear    3     1   5        4  1
10:     led    4     1   1        1  1
11:    exit    5     4   3        5  2
12:     cfl    4     2   4        5  5
13:     led    4     X   Y        Y  4
...
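
A variant that avoids string replacement altogether is to spell out each rule with dplyr::case_when(). This is only a sketch: the small data frame `df` below is made up for illustration and stands in for `dt`, and it assumes dplyr 1.0.0 or later for across().

```r
library(dplyr)

# Made-up example data standing in for `dt` (not the data from the question)
df <- data.frame(
  measure  = c("led", "cfl", "led", "exit"),
  site     = c(1L, 2L, 3L, 4L),
  space    = c(2L, 2L, 3L, 1L),
  qty      = c(3L, 3L, 5L, 2L),
  qty.exit = c(2L, 6L, 1L, 3L),
  cf       = c(2L, 3L, 2L, 3L)
)

res <- df %>%
  mutate(across(3:5, ~ case_when(
    measure == "led" & .x == 2 ~ "X",   # rule 1: 2 becomes "X" in led rows
    measure == "led" & .x == 3 ~ "Y",   # rule 2: 3 becomes "Y" in led rows
    TRUE ~ as.character(.x)             # everything else is kept, as character
  )))

res
```

Because every rule is an exact comparison, a value such as 12 would be left alone, and adding another replacement only takes one more line.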

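For the sake of completeness, here is a data.table solution that uses an update join with a look-up table.

All elements of a column must have the same type. So, before joining, the numeric columns must be coerced to character.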

library(data.table)
# create look-up table
lut <- fread(
  "measure, old, new
  led,      2,   X
  led,      3,   Y",
  colClasses = c(old = "character")
)
# perform coercion and update join for each column
for(col in colnames(dt)[3:5]) {
  set(dt, , col, as.character(dt[[col]]))
  dt[lut, on = c("measure", paste0(col, "==old")), (col) := new][]
}
dt
    measure site space qty qty.exit cf
 1:     cfl    3     4   4        3  3
 2:     cfl    1     2   3        5  5
 3:     cfl    1     2   6        5  5
 4:     cfl    3     1   5        4  6
 5:     led    4     4   X        6  3
 6:    exit    5     1   6        5  6
 7:     led    5     4   X        4  4
 8:     led    5     X   X        6  5
 9:     cfl    4     4   5        2  1
10:    exit    2     2   1        2  4
11:  linear    4     3   1        1  1
12:    exit    3     4   4        2  1
13:  linear    2     3   5        5  5
14:    exit    1     1   2        6  6
15:     cfl    2     1   1        5  3
16:     cfl    6     2   5        4  1
17:     led    3     X   4        1  2
18:    exit    6     2   4        4  5
19:     led    2     X   1        X  6
20:     led    4     X   Y        X  1
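
When there are only a handful of replacements, the look-up table can also be a plain named vector, with set() doing the update instead of a join. The following is a rough sketch under assumptions, not a replacement for the join above: `dt2`, `map`, and `out` are names made up for illustration, and the example data is not the data from the question.

```r
library(data.table)

# Made-up example data standing in for `dt`
dt2 <- data.table(
  measure  = c("led", "cfl", "led", "exit"),
  site     = c(1L, 2L, 3L, 4L),
  space    = c(2L, 2L, 3L, 1L),
  qty      = c(3L, 3L, 5L, 2L),
  qty.exit = c(2L, 6L, 1L, 3L),
  cf       = c(2L, 3L, 2L, 3L)
)

# Hypothetical look-up vector: names are the old values, elements the new ones
map <- c("2" = "X", "3" = "Y")

out <- copy(dt2)  # work on a copy, because set() modifies by reference
for (col in names(out)[3:5]) {
  # coerce to character first so the column can hold "X" and "Y"
  set(out, j = col, value = as.character(out[[col]]))
  # rows where the condition holds and the value has a replacement
  hit <- which(out$measure == "led" & out[[col]] %in% names(map))
  set(out, i = hit, j = col, value = unname(map[out[[col]][hit]]))
}
out
```

Using copy() leaves the original table untouched, which is handy for comparing before and after. Without it, the loop would change `dt2` in place, just as the update join above changes `dt`.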

Data

library(data.table)
set.seed(42) # required for reproducible data
dt <- data.table(measure = sample(c('cfl', 'led', 'linear', 'exit'), 20, replace = TRUE),
                 site = sample(1:6, 20, replace = TRUE),
                 space = sample(1:4, 20, replace = TRUE),
                 qty = sample(1:6, 20, replace = TRUE),
                 qty.exit = sample(1:6, 20, replace = TRUE),
                 cf = sample(1:6, 20, replace = TRUE))