Fill both numerical and character columns, conditional on the previous and next value being equal

Please note that my question is different from:

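In the data below, I would like to fill the NA values in both the numeric and the character columns, on the condition that the areacode and type of the zipcode before the NA are the same as the areacode and type of the zipcode after the NA.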

In other words: "Because zipcode 1002 has clay and zipcode 1004 has clay, we assume that zipcode 1003 has clay as well."

I wanted to use na.fill, but na.fill can only fill numeric values.

dat <- structure(list(zipcode = c(1001, 1002, 1003, 1004), areacode = c(4, 
4, NA, 4), type = structure(c(3L, 3L, NA, 3L), .Label = c("", 
"sand", "clay", "na2"), class = "factor"), region = c(3, 3, 
NA, 3)), class = c("data.table", "data.frame"), row.names = c(NA, 
-4L))

   zipcode areacode type region
1:    1001        4 clay      3
2:    1002        4 clay      3
3:    1003       NA <NA>     NA
4:    1004        4 clay      3

dat2 <- structure(list(zipcode = c(1001, 1002, 1003, 1004), areacode = c(4, 
4, NA, 1), type = structure(c(3L, 3L, NA, 2L), .Label = c("", 
"sand", "clay", "na2"), class = "factor"), region = c(3, 3, NA, 
3)), class = c("data.table", "data.frame"), row.names = c(NA, 
-4L))

   zipcode areacode type region
1:    1001        4 clay      3
2:    1002        4 clay      3
3:    1003       NA <NA>     NA
4:    1004        1 sand      3

What would be the best way to do this?

Desired output for dat:

   zipcode areacode type region
1:    1001        4 clay      3
2:    1002        4 clay      3
3:    1003        4 clay      3
4:    1004        4 clay      3

Desired output for dat2:

   zipcode areacode type region
1:    1001        4 clay      3
2:    1002        4 clay      3
3:    1003       NA <NA>     NA
4:    1004        1 sand      3

EDIT:

The following approaches are not sufficient, because they fill in clay for zipcode 1003 even though the fourth row says sand:

library(dplyr)  # provides %>%
library(tidyr)  # provides fill()
dat2 %>%
  fill(areacode, type, region)

   zipcode areacode type region
1:    1001        4 clay      3
2:    1002        4 clay      3
3:    1003        4 clay      3
4:    1004        1 sand      3

dat2[, lapply(.SD, zoo::na.locf)]

   zipcode areacode type region
1:    1001        4 clay      3
2:    1002        4 clay      3
3:    1003        4 clay      3
4:    1004        1 sand      3

Using dplyr, fill a value only when the areacode and type of the previous and the next row agree:

library(dplyr)
dat2 |> 
  mutate(type = as.character(type)) |> 
  mutate(across(2:4,
                ~ ifelse(is.na(.) & lag(areacode) == lead(areacode) & lag(type) == lead(type),
                         lag(.),
                         .)))

  zipcode areacode type region
1    1001        4 clay      3
2    1002        4 clay      3
3    1003       NA <NA>     NA
4    1004        1 sand      3

dat |> 
  mutate(type = as.character(type)) |> 
  mutate(across(2:4,
                ~ ifelse(is.na(.) & lag(areacode) == lead(areacode) & lag(type) == lead(type),
                         lag(.),
                         .)))

  zipcode areacode type region
1    1001        4 clay      3
2    1002        4 clay      3
3    1003        4 clay      3
4    1004        4 clay      3
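
Since the question uses data.table, the same rule can also be sketched without dplyr. This is only an illustration under a few assumptions: fill_between is a hypothetical helper name (not part of data.table), the rows are already ordered by zipcode, and, like the dplyr version above, it only handles a single NA row sitting between two complete rows.

```r
library(data.table)

# Same data as in the question, rebuilt with data.table() for a self-contained example
dat <- data.table(
  zipcode  = c(1001, 1002, 1003, 1004),
  areacode = c(4, 4, NA, 4),
  type     = factor(c("clay", "clay", NA, "clay")),
  region   = c(3, 3, NA, 3)
)
dat2 <- data.table(
  zipcode  = c(1001, 1002, 1003, 1004),
  areacode = c(4, 4, NA, 1),
  type     = factor(c("clay", "clay", NA, "sand")),
  region   = c(3, 3, NA, 3)
)

# Hypothetical helper: fill NA in fill_cols from the previous row, but only
# where every column in key_cols has equal (non-NA) previous and next values.
fill_between <- function(dt,
                         key_cols  = c("areacode", "type"),
                         fill_cols = c("areacode", "type", "region")) {
  dt <- copy(dt)  # work on a copy so the input is not modified by reference
  same <- Reduce(`&`, lapply(key_cols, function(col) {
    x    <- as.character(dt[[col]])
    prev <- shift(x, 1L, type = "lag")
    nxt  <- shift(x, 1L, type = "lead")
    !is.na(prev) & !is.na(nxt) & prev == nxt
  }))
  for (col in fill_cols) {
    idx <- which(is.na(dt[[col]]) & same)
    # idx is always >= 2 here, because `same` requires a non-NA previous row
    if (length(idx)) set(dt, i = idx, j = col, value = dt[[col]][idx - 1L])
  }
  dt
}

print(fill_between(dat))
print(fill_between(dat2))
```

With the data from the question, row 3 of dat should be filled from row 2, while row 3 of dat2 should stay NA because row 4 has a different areacode and type. Longer runs of consecutive NA rows are left untouched by this sketch.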