Filling missing values within groups

My data frame has some missing values:

A 1
A NA
A NA
B NA
B 2
B NA
C NA
C NA
C NA

How can I fill in the missing values for the groups where I do have data?

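We can use data.table. Convert the 'data.frame' to a 'data.table' (setDT(df1)), then, grouped by 'ID', assign (:=) to the column 'v1' the first non-NA value of each group. A group with no non-NA values (such as C) stays NA.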

library(data.table)
setDT(df1)[, v1:= v1[!is.na(v1)][1L] , by = ID]
df1
#   ID v1
#1:  A  1
#2:  A  1
#3:  A  1
#4:  B  2
#5:  B  2
#6:  B  2
#7:  C NA
#8:  C NA
#9:  C NA
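
To see why group C stays NA instead of raising an error, it may help to look at the indexing idiom on its own. Below is a minimal base R sketch; the vectors x and all_na and the helper first_non_na are made up for illustration and are not part of the answer.

```r
# Hypothetical vectors, not taken from the question
x      <- c(NA, 2L, NA)
all_na <- c(NA_integer_, NA_integer_)

# Same idiom as in the data.table call: drop NAs, then take element 1
first_non_na <- function(v) v[!is.na(v)][1L]

first_non_na(x)       # 2
first_non_na(all_na)  # NA, because integer(0)[1L] is NA
```

Indexing past the end of a vector returns NA in R, which is what lets the one-liner handle all-NA groups without a special case.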

Or using only base R:

 with(df1, ave(v1, ID, FUN = function(x)
          replace(x, is.na(x), x[!is.na(x)][1L])))
 #[1]  1  1  1  2  2  2 NA NA NA
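
The two approaches give the same result on this data, but they can differ when a group holds more than one distinct non-NA value: assigning the first non-NA value to the whole group overwrites every row, whereas replace() only touches the NA positions. A small sketch with a made-up data frame df2 (hypothetical, not from the question):

```r
# Hypothetical data: group A has two different non-NA values
df2 <- data.frame(ID = c("A", "A", "A"), v1 = c(1L, NA, 3L))

# Overwrite every row with the group's first non-NA value
overwrite <- with(df2, ave(v1, ID, FUN = function(x) x[!is.na(x)][1L]))

# Replace only the NA entries, keeping existing values
fill_only <- with(df2, ave(v1, ID, FUN = function(x)
  replace(x, is.na(x), x[!is.na(x)][1L])))

overwrite  # 1 1 1
fill_only  # 1 1 3
```

Which behaviour is wanted depends on whether each group is expected to carry a single value.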

Data

df1 <- structure(list(ID = c("A", "A", "A", "B", "B", "B", "C", "C", 
"C"), v1 = c(1L, NA, NA, NA, 2L, NA, NA, NA, NA)), .Names = c("ID", 
"v1"), class = "data.frame", row.names = c(NA, -9L))

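An alternative solution, though it is somewhat fragile because of the assumptions it makes: it relies on arrange() placing NA last, so that the first element of each group is a non-NA value whenever one exists. Here y appears to be the same data with columns named V1 and V2. Note that in current dplyr versions arrange() ignores grouping unless .by_group = TRUE, so the row order may differ from the output shown.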

library(dplyr)
y %>%
  group_by(V1) %>%
  arrange(V2) %>%
  mutate(V2 = V2[1])
# Source: local data frame [9 x 2]
# Groups: V1 [3]
#      V1    V2
#   (chr) (int)
# 1     A     1
# 2     A     1
# 3     A     1
# 4     B     2
# 5     B     2
# 6     B     2
# 7     C    NA
# 8     C    NA
# 9     C    NA
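
One way to avoid depending on the arrange() step, offered as a sketch rather than as the answerer's method, is to pick the first non-NA value directly inside mutate(). The data frame y below is an assumed reconstruction that mirrors the question's data with the column names V1 and V2.

```r
library(dplyr)

# Assumed reconstruction of `y` (column names V1/V2 as used in the answer)
y <- data.frame(
  V1 = rep(c("A", "B", "C"), each = 3),
  V2 = c(1L, NA, NA, NA, 2L, NA, NA, NA, NA)
)

filled <- y %>%
  group_by(V1) %>%
  mutate(V2 = V2[!is.na(V2)][1]) %>%  # first non-NA per group, NA if none
  ungroup()

filled
```

This keeps the original row order and does not care where the non-NA value sits within the group.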

You can also use fill from tidyr, filling down and then up within each group:

library(dplyr)
library(tidyr)

df1 %>%
  group_by(ID) %>%
  fill(v1) %>%
  fill(v1, .direction = "up")

Result:

# A tibble: 9 x 2
# Groups:   ID [3]
     ID    v1
  <chr> <int>
1     A     1
2     A     1
3     A     1
4     B     2
5     B     2
6     B     2
7     C    NA
8     C    NA
9     C    NA
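
Recent tidyr releases (1.0.0 and later, as far as I know) also accept .direction = "downup", which should collapse the two fill() calls into one. A sketch that rebuilds df1 inline so it runs on its own:

```r
library(dplyr)
library(tidyr)

# Same values as df1 in the Data section
df1 <- data.frame(
  ID = rep(c("A", "B", "C"), each = 3),
  v1 = c(1L, NA, NA, NA, 2L, NA, NA, NA, NA)
)

res <- df1 %>%
  group_by(ID) %>%
  fill(v1, .direction = "downup") %>%  # fill down first, then up, per group
  ungroup()

res
```

If the installed tidyr is older and rejects "downup", the two-step version above does the same job.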

Thanks to @akrun for the dput.