Sum NA across specific columns in R
I have data like this:
data_in <- read_table2("Id Q62_1 Q62_2 Q3_1 Q3_2 Q3_3 Q3_4 Q3_5
1 Yes Sometimes
2 Always
3
4 No Always Yes
5
6 Always No Likely Yes Always Always
7 Yes Sometimes Maybe Unlikely Sometimes Sometimes
8 Always Yes Likely No Always Always
9 Sometimes Unlikely Sometimes Sometimes
10 No No Likely Maybe
11 Sometimes Maybe Unlikely Sometimes Sometimes
12 Always Yes Likely Always Always
")
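I want to count the number of missing responses in the columns that start with Q62, and then separately across columns Q3_1 to Q3_5.
I know rowSums is handy for summing numeric variables, but is there a dplyr/piped equivalent for counting NAs?
For example, if this were numeric data and I wanted to sum the Q62 series, I could use the following: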
data_in %>%
  mutate(Q62_NA = rowSums(select(., "Q62_1", "Q62_2")))
But how do I count the NAs instead?
My output should look like this:
data_out <- read_table2("Id Q62_1 Q62_2 Q3_1 Q3_2 Q3_3 Q3_4 Q3_5 Q62_NA Q3_NA
1 Yes Sometimes 0 5
2 Always 1 5
3 2 5
4 No Always Yes 0 5
5 2 5
6 Always No Likely Yes Always Always 1
7 Yes Sometimes Maybe Unlikely Sometimes Sometimes 0 1
8 Always Yes Likely No Always Always 1 0
9 Sometimes Unlikely Sometimes Sometimes 1 1
10 No No Likely Maybe 1 2
11 Sometimes Maybe Unlikely Sometimes Sometimes 1 1
12 Always Yes Likely Always Always 1 1
")
Thanks!!
We can wrap the select call in is.na to convert the selected columns to a logical matrix, then apply rowSums to that matrix to count the TRUE elements in each row:
library(dplyr)
data_in %>%
mutate(Q62_NA = rowSums(is.na(select(.,"Q62_1", "Q62_2"))))
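The same pattern can be extended to both question families at once. The following is a minimal sketch, assuming a small hypothetical tibble (`toy`, not the asker's data) with explicit NA values, and using `starts_with()` in place of listing the column names by hand:

```r
library(dplyr)

# Hypothetical toy data with explicit NAs (an assumption for illustration)
toy <- tibble(
  Id    = 1:3,
  Q62_1 = c("Yes", NA, NA),
  Q62_2 = c("Sometimes", "Always", NA),
  Q3_1  = c(NA, "No", "Likely"),
  Q3_2  = c(NA, NA, "Yes")
)

toy_counts <- toy %>%
  mutate(
    # is.na() on the selected columns yields a logical matrix;
    # rowSums() then counts the TRUE cells in each row
    Q62_NA = rowSums(is.na(select(., starts_with("Q62")))),
    Q3_NA  = rowSums(is.na(select(., starts_with("Q3"))))
  )
```

Because `.` refers to the data as it entered the pipe step, the newly created `Q62_NA` and `Q3_NA` columns are not themselves picked up by the `starts_with()` selections.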
Or an option with rowwise and c_across:
data_in %>%
rowwise %>%
mutate(Q62_NA = sum(is.na(c_across(starts_with('Q6')))))
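A rowwise data frame stays rowwise until it is ungrouped, which can surprise later steps in a pipeline. The sketch below, again on a hypothetical `toy` tibble, counts both families row by row and then calls `ungroup()`; the `"Q3_"` prefix is an assumption chosen so the selection cannot match the new count columns:

```r
library(dplyr)

# Hypothetical toy data with explicit NAs (an assumption for illustration)
toy <- tibble(
  Id    = 1:3,
  Q62_1 = c("Yes", NA, NA),
  Q62_2 = c("Sometimes", "Always", NA),
  Q3_1  = c(NA, "No", "Likely"),
  Q3_2  = c(NA, NA, "Yes")
)

toy_rowwise <- toy %>%
  rowwise() %>%
  mutate(
    # c_across() gathers the selected columns of the current row into a vector
    Q62_NA = sum(is.na(c_across(starts_with("Q62")))),
    Q3_NA  = sum(is.na(c_across(starts_with("Q3_"))))
  ) %>%
  ungroup()  # drop the rowwise grouping so later verbs act on whole columns
```

On wide or long data the `rowSums(is.na(...))` form is usually faster, since `rowwise()` evaluates the expression once per row.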
Here is a base R option:
transform(
data_in,
Q62_NA = rowSums(is.na(data_in[grepl("Q62",names(data_in))])),
Q3_NA = rowSums(is.na(data_in[grepl("Q3",names(data_in))]))
)
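This gives the following (the counts reflect data_in as it was parsed, where the whitespace-delimited values fill the columns from the left):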
Id Q62_1 Q62_2 Q3_1 Q3_2 Q3_3 Q3_4 Q3_5 Q62_NA
1 1 Yes Sometimes <NA> <NA> <NA> <NA> <NA> 0
2 2 Always <NA> <NA> <NA> <NA> <NA> <NA> 1
3 3 <NA> <NA> <NA> <NA> <NA> <NA> <NA> 2
4 4 No Always Yes <NA> <NA> <NA> <NA> 0
5 5 <NA> <NA> <NA> <NA> <NA> <NA> <NA> 2
6 6 Always No Likely Yes Always Always <NA> 0
7 7 Yes Sometimes Maybe Unlikely Sometimes Sometimes <NA> 0
8 8 Always Yes Likely No Always Always <NA> 0
9 9 Sometimes Unlikely Sometimes Sometimes <NA> <NA> <NA> 0
10 10 No No Likely Maybe <NA> <NA> <NA> 0
11 11 Sometimes Maybe Unlikely Sometimes Sometimes <NA> <NA> 0
12 12 Always Yes Likely Always Always <NA> <NA> 0
Q3_NA
1 5
2 5
3 5
4 4
5 5
6 1
7 1
8 1
9 3
10 3
11 2
12 2
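One caveat with the `grepl()` patterns in the base R option is that they are unanchored, so `"Q3"` would also match any other column whose name contains that string. A minimal sketch of an anchored variant, assuming a hypothetical data frame with an extra `Q30_1` column that should not be counted:

```r
# Hypothetical toy data; Q30_1 is an invented column that should be excluded
toy <- data.frame(
  Id    = 1:3,
  Q3_1  = c(NA, "No", "Likely"),
  Q3_2  = c(NA, NA, "Yes"),
  Q30_1 = c(NA, NA, NA),
  stringsAsFactors = FALSE
)

# Unanchored pattern: matches Q3_1, Q3_2 and also Q30_1
loose_cols <- grepl("Q3", names(toy))

# Anchored pattern: only names that begin with "Q3_"
q3_cols <- grepl("^Q3_", names(toy))

# Build the column mask before adding Q3_NA, since that new name
# would itself match "^Q3_" on a later run
toy$Q3_NA <- rowSums(is.na(toy[q3_cols]))
```

The same idea applies to the Q62 family with `"^Q62_"`.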