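Is there a way in R to sum columns with different patterns of missing observations?
I have some variables that I would like to add together, but some of them have missing observations. When I add them, any row with one or more missing values ends up with a missing sum. For example, suppose I have the following, where the last column is meant to hold the sum I want: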
df <- matrix(c(23, NA, 56, NA, NA, 43, 67, NA, 11, 10, 18, 39), byrow = T, nrow = 3)
colnames(df)<- c("X", "y", "z", "sum")
df
      X  y  z sum
[1,] 23 NA 56  NA
[2,] NA 43 67  NA
[3,] 11 10 18  39
Here is the result I expect:
df2 <- matrix(c(23, NA, 56, 79,
NA, 43, 67, 110,
11, 10, 18, 39), byrow = T, nrow = 3)
colnames(df2)<- c("X", "Y", "Z", "sum")
df2
      X  Y  Z sum
[1,] 23 NA 56  79
[2,] NA 43 67 110
[3,] 11 10 18  39
How can I get this result?
I am using R version 3.6 on Windows 10.
As Ben pointed out, I think all you need is na.rm = TRUE, so something like this:
df <- matrix(c(23, NA, 56, NA, 43, 67, 11, 10, 18), byrow = TRUE, nrow = 3)
colnames(df) <- c("X", "y", "z")
cbind(df, summ = rowSums(df, na.rm = TRUE))
#       X  y  z summ
# [1,] 23 NA 56   79
# [2,] NA 43 67  110
# [3,] 11 10 18   39
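Note that the matrix in the question already contains a "sum" column, and calling rowSums() on the whole matrix would add that column in as well. A minimal sketch, assuming you want to keep the original four-column matrix and fill in "sum" from the other three columns (value_cols is just an illustrative name, not something from the question):

```r
# The question's matrix, including the pre-existing "sum" column.
df <- matrix(c(23, NA, 56, NA,
               NA, 43, 67, NA,
               11, 10, 18, 39), byrow = TRUE, nrow = 3)
colnames(df) <- c("X", "y", "z", "sum")

# Illustrative name: the columns that should be added together.
value_cols <- c("X", "y", "z")

# Sum only those columns, ignoring NAs, and overwrite the "sum" column.
df[, "sum"] <- rowSums(df[, value_cols], na.rm = TRUE)
df
```

Selecting the columns by name keeps the old "sum" values (including the NAs) out of the calculation.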
Alternatively, if you are working with a data frame, something like this:
library(dplyr)
df_frame <- data.frame(df)
df_frame <- df_frame %>%
  mutate(summ = rowSums(., na.rm = TRUE))
df_frame
#    X  y  z summ
# 1 23 NA 56   79
# 2 NA 43 67  110
# 3 11 10 18   39
# Or this, if you only want to sum the numeric columns of the data frame:
df_frame <- data.frame(df)
df_frame <- df_frame %>%
  mutate(summ = rowSums(select_if(., is.numeric), na.rm = TRUE))
df_frame
#    X  y  z summ
# 1 23 NA 56   79
# 2 NA 43 67  110
# 3 11 10 18   39
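One caveat with na.rm = TRUE: a row in which every value is missing sums to 0 rather than NA. The example data has no such row, so the following is only a sketch, assuming you would prefer NA in that case. The matrix m and the names row_total and all_missing are made up for illustration:

```r
# Illustrative data: the second row is entirely missing.
m <- matrix(c(23, NA, 56,
              NA, NA, NA,
              11, 10, 18), byrow = TRUE, nrow = 3)
colnames(m) <- c("X", "y", "z")

# With na.rm = TRUE, the all-NA row sums to 0.
row_total <- rowSums(m, na.rm = TRUE)

# Flag rows with no observed values and set their total back to NA.
all_missing <- rowSums(!is.na(m)) == 0
row_total[all_missing] <- NA

cbind(m, summ = row_total)
```

Whether 0 or NA is the right answer for a fully missing row depends on what the sum is used for, so treat this as optional.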