Is there a way to make 2-way tables with different pairs of variables in R?

Question

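I am cleaning data from a numeracy test.

Some of the test items are multiple-choice questions where students select one of the options (e.g. a), b), or c)).

In the dataset, I created new variables by converting the items to binary variables. For example, if the correct answer to Item1 is a), I made new_Item1 by recoding a) = 1 and otherwise = 0 (NA is left as is).

I want to double-check that the recoding was done correctly by tabulating the original variable against the new variable. Doing this for one pair (in this case Item1 and new_Item1) is easy, but because I have a lot of these multiple-choice items, writing a separate table call for every pair is not efficient.

Here is my question: is there a way to make 2-way tables for each pair of these original and new variables? I tried to do it with a for loop and looked for tips online, but so far I have not found a solution.

I have extracted part of the data frame below.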

structure(list(ID = 1:20, gender = c("Male", "Male", "Male", 
"Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male", 
"Male", "Male", "Male", "Male", "Female", "Female", "Female", 
"Female", "Female"), Item1 = c("c", "c", "a", "a", NA, "c", "c", 
"b", "b", "b", "c", "c", NA, "c", "a", "d", "c", "c", "c", "c"
), Item2 = c("d", "d", "d", "d", "d", "a", "a", "a", "a", "b", 
"b", "c", "c", "c", "c", "d", NA, NA, "d", "d"), Item3 = c("b", 
"d", NA, "a", NA, "d", "c", "c", NA, "d", "c", NA, NA, "c", "d", 
"c", "d", "d", "d", "d"), new_Item1 = c(1L, 1L, 0L, 0L, NA, 1L, 
1L, 0L, 0L, 0L, 1L, 1L, NA, 1L, 0L, 0L, 1L, 1L, 1L, 1L), new_Item2 = c(1L, 
1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, NA, 
NA, 1L, 1L), new_Item3 = c(0L, 0L, NA, 0L, NA, 0L, 1L, 1L, NA, 
0L, 1L, NA, NA, 1L, 0L, 1L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-20L))

Thank you very much.

Shun

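For one pair, I just type:

library(janitor)
tabyl(g3, Item1, new_Item1)

and I can see that my recoding is correct. But in this case, I want to loop the same tabulation over Item1, 2, and 3 (and more). So my expected output would look something like this (if I use tabyl):
------------------
Item1 1 0 NA
a ###
b ###
c ###
d ###
NA ###

Item2 1 0 NA
a ###
b ###
c ###
d ###
.....
----------------------
I hope my explanation is clear.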

Answer 1

Is there a reason you don't want to use the base R table function? It looks like it gives you what you want:

table(g3$Item1, g3$new_Item1, useNA="always")

where g3 is the data frame you defined above.

If you want to define the pairs differently for a loop, I suggest the following:

x = "Item1"
table(g3[, colnames(g3)==x], g3[, colnames(g3)==paste0("new_",x)], useNA="always")

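where x is your loop variable. This way you can compare "x" with "new_x" without manually pairing every column in the table call. You only need to feed a list of x values into your loop.

The output is: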

        0  1 <NA>
  a     3  0    0
  b     3  0    0
  c     0 11    0
  d     1  0    0
  <NA>  0  0    2
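
To make the loop itself concrete, here is a minimal sketch of how it might look. It rebuilds only Item1 and Item2 from the question's data so that it runs on its own; the names `items` and `tabs` are just illustrative, and `[[` is used in place of the `colnames(g3) == x` lookup, which should select the same columns.

```r
# Two item pairs copied from the question's data frame, so the sketch is self-contained.
g3 <- data.frame(
  Item1 = c("c", "c", "a", "a", NA, "c", "c", "b", "b", "b",
            "c", "c", NA, "c", "a", "d", "c", "c", "c", "c"),
  Item2 = c("d", "d", "d", "d", "d", "a", "a", "a", "a", "b",
            "b", "c", "c", "c", "c", "d", NA, NA, "d", "d"),
  new_Item1 = c(1L, 1L, 0L, 0L, NA, 1L, 1L, 0L, 0L, 0L,
                1L, 1L, NA, 1L, 0L, 0L, 1L, 1L, 1L, 1L),
  new_Item2 = c(1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L,
                0L, 0L, 0L, 0L, 0L, 1L, NA, NA, 1L, 1L)
)

# Names of the original items (illustrative; extend for more items).
items <- c("Item1", "Item2")

# One 2-way table per item, stored in a named list.
tabs <- list()
for (x in items) {
  new_x <- paste0("new_", x)
  tabs[[x]] <- table(g3[[x]], g3[[new_x]],
                     useNA = "always",
                     dnn = c(x, new_x))  # label the two dimensions
}

tabs
```

Printing `tabs` shows one table per item; `tabs$Item1` can be pulled out individually if only one pair needs a closer look.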

Answer 2

You can collect the column names in variables, then use Map to iterate over each pair and return the comparison table.

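library(janitor)
# df is the data frame from the question
# note the doubled backslash: '\d' on its own is not a valid escape in an R string
x <- grep('^Item\\d+$', names(df), value = TRUE)
y <- grep('^new_Item\\d+$', names(df), value = TRUE)

Map(function(p, q) tabyl(df, .data[[p]], .data[[q]]), x, y)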

#$Item1
# Item1 0  1 NA_
#     a 3  0   0
#     b 3  0   0
#     c 0 11   0
#     d 1  0   0
#  <NA> 0  0   2

#$Item2
# Item2 0 1 NA_
#     a 4 0   0
#     b 2 0   0
#     c 4 0   0
#     d 0 8   0
#  <NA> 0 0   2

#$Item3
# Item3 0 1 NA_
#     a 1 0   0
#     b 1 0   0
#     c 0 5   0
#     d 8 0   0
#  <NA> 0 0   5
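
With many items, reading every table by eye can get tedious. As a possible complement, the sketch below checks programmatically that each original answer maps to exactly one recoded value. It uses base R only and rebuilds two items from the question's data; `recode_is_consistent` is a hypothetical helper name, not part of janitor or base R.

```r
# Two item pairs copied from the question's data frame, so the sketch is self-contained.
df <- data.frame(
  Item1 = c("c", "c", "a", "a", NA, "c", "c", "b", "b", "b",
            "c", "c", NA, "c", "a", "d", "c", "c", "c", "c"),
  Item2 = c("d", "d", "d", "d", "d", "a", "a", "a", "a", "b",
            "b", "c", "c", "c", "c", "d", NA, NA, "d", "d"),
  new_Item1 = c(1L, 1L, 0L, 0L, NA, 1L, 1L, 0L, 0L, 0L,
                1L, 1L, NA, 1L, 0L, 0L, 1L, 1L, 1L, 1L),
  new_Item2 = c(1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L,
                0L, 0L, 0L, 0L, 0L, 1L, NA, NA, 1L, 1L)
)

items <- grep("^Item\\d+$", names(df), value = TRUE)

# Hypothetical helper: TRUE when every original answer (including NA)
# lands in exactly one column of the 2-way table.
recode_is_consistent <- function(item) {
  tab <- table(df[[item]], df[[paste0("new_", item)]], useNA = "ifany")
  all(rowSums(tab > 0) == 1)
}

checks <- sapply(items, recode_is_consistent)
checks
```

This only confirms that the mapping is one-to-one per answer; whether the option coded as 1 is the intended correct answer still has to be confirmed from the tables themselves.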

Answer 3

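If there are more columns, extend the truanswer table accordingly. The rest stays the same. You only need a single mutate with the across function.

P.S. I assume your data is in the dforg table.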

library(tidyverse)
df = dforg %>% as_tibble() %>% select(ID:Item3)

truanswer = tribble(
  ~col,  ~answer,
  "Item1", "a",
  "Item2", "c",
  "Item3", "b"
)
  
fcheckanswer = function(x, col) 
  ifelse(x==truanswer$answer[truanswer$col==col], 1, 0)


df %>% mutate(
  across(starts_with("Item"), ~  fcheckanswer(.x, cur_column()),  .names = "{.col}_1"))

Output

# A tibble: 20 x 8
      ID gender Item1 Item2 Item3 Item1_1 Item2_1 Item3_1
   <int> <chr>  <chr> <chr> <chr>   <dbl>   <dbl>   <dbl>
 1     1 Male   c     d     b           0       0       1
 2     2 Male   c     d     d           0       0       0
 3     3 Male   a     d     NA          1       0      NA
 4     4 Male   a     d     a           1       0       0
 5     5 Male   NA    d     NA         NA       0      NA
 6     6 Male   c     a     d           0       0       0
 7     7 Male   c     a     c           0       0       0
 8     8 Male   b     a     c           0       0       0
 9     9 Male   b     a     NA          0       0      NA
10    10 Male   b     b     d           0       0       0
11    11 Male   c     b     c           0       0       0
12    12 Male   c     c     NA          0       1      NA
13    13 Male   NA    c     NA         NA       1      NA
14    14 Male   c     c     c           0       1       0
15    15 Male   a     c     d           1       1       0
16    16 Female d     d     c           0       0       0
17    17 Female c     NA    d           0      NA       0
18    18 Female c     NA    d           0      NA       0
19    19 Female c     d     d           0       0       0
20    20 Female c     d     d           0       0       0
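
For anyone who would rather avoid the tidyverse dependency, the same answer-key idea can be sketched in base R. This assumes the same answer key as truanswer above (a, c, b) and rebuilds the three item columns from the question's data; `key`, `scored`, and `result` are illustrative names.

```r
# Item columns copied from the question's data frame, so the sketch is self-contained.
dforg <- data.frame(
  Item1 = c("c", "c", "a", "a", NA, "c", "c", "b", "b", "b",
            "c", "c", NA, "c", "a", "d", "c", "c", "c", "c"),
  Item2 = c("d", "d", "d", "d", "d", "a", "a", "a", "a", "b",
            "b", "c", "c", "c", "c", "d", NA, NA, "d", "d"),
  Item3 = c("b", "d", NA, "a", NA, "d", "c", "c", NA, "d",
            "c", NA, NA, "c", "d", "c", "d", "d", "d", "d")
)

# Answer key as a named vector (same values as the truanswer table above).
key <- c(Item1 = "a", Item2 = "c", Item3 = "b")

# Score each item against its key; NA answers stay NA.
scored <- Map(function(col, ans) as.integer(dforg[[col]] == ans),
              names(key), key)
names(scored) <- paste0(names(key), "_1")

result <- cbind(dforg, as.data.frame(scored))
head(result)
```

The scored columns come out as integers rather than doubles, which is the only intended difference from the mutate/across version.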


loops

for-loop

r

crosstab