Tally several columns of categorical variables in R
I have survey data in which respondents rated several items on a Likert scale, like this:
id  item1                item2                item3                item4
42  Moderately adequate  Completely adequate  Very adequate        Very adequate
48  Moderately adequate  Moderately adequate  Moderately adequate  Moderately adequate
49  Moderately adequate  Moderately adequate  Moderately adequate  Moderately adequate
50  Slightly adequate    Slightly adequate    Slightly adequate    Not at all adequate
I want to convert this into a data structure that holds, for each item, the count of each rating it received, like this:
rating               item1  item2  item3  item4
Not at all adequate      0      0      0      1
Slightly adequate        1      1      1      0
Moderately adequate      3      2      2      2
Very adequate            0      0      1      1
Completely adequate      0      1      0      0
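What is the most efficient way to reshape this data? I have tried dcast(data = melt(data, id.vars = "id"), value ~ .), but that gives the total count of each rating across all four items instead of keeping each item in its own column; count and tally have the same problem. I could do this item by item and then merge the columns back together, but it seems there must be a simpler way, especially since I need to repeat it for several different lists of items.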
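As a point of comparison for the attempt above: the formula value ~ . has nothing on its right-hand side to spread into columns, which is why the counts collapse into one total per rating. A minimal sketch of the same melt/dcast idea with the item column on the right-hand side follows. It assumes dcast and melt come from reshape2 (the question does not say which package was used), and it uses underscored rating labels in a small illustrative data frame.

```r
library(reshape2)

# Illustrative sample data (ratings underscored so they are easy to copy)
data <- data.frame(
  id    = c(42L, 48L, 49L, 50L),
  item1 = c("Moderately_adequate", "Moderately_adequate",
            "Moderately_adequate", "Slightly_adequate"),
  item2 = c("Completely_adequate", "Moderately_adequate",
            "Moderately_adequate", "Slightly_adequate"),
  item3 = c("Very_adequate", "Moderately_adequate",
            "Moderately_adequate", "Slightly_adequate"),
  item4 = c("Very_adequate", "Moderately_adequate",
            "Moderately_adequate", "Not_at_all_adequate")
)

# melt() stacks the item columns into `variable` (item name) and `value` (rating)
long <- melt(data, id.vars = "id")

# Putting `variable` on the right-hand side gives one column per item;
# length() counts how many responses fall into each rating/item cell
wide <- dcast(long, value ~ variable,
              fun.aggregate = length, value.var = "value")
```

Here wide has one row per rating that occurs in the data and one count column per item; rows come out in alphabetical order of the rating label rather than in scale order.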
Get the data into long format, count, and then spread it back to wide format:
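library(dplyr)
library(tidyr)

data %>%
  # long format: one row per id/item pair, in columns `name` and `value`
  pivot_longer(cols = -id) %>%
  # how many times each rating occurs within each item
  count(name, value) %>%
  # back to wide: one column per item, missing combinations filled with 0
  pivot_wider(names_from = name, values_from = n, values_fill = list(n = 0))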
# A tibble: 5 x 5
#  value               item1 item2 item3 item4
#  <chr>               <int> <int> <int> <int>
#1 Moderately_adequate     3     2     2     2
#2 Slightly_adequate       1     1     1     0
#3 Completely_adequate     0     1     0     0
#4 Very_adequate           0     0     1     1
#5 Not_at_all_adequate     0     0     0     1
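The rows above come out in order of first appearance, not in scale order, and the question also asks for something that can be repeated over several lists of items. One possible way to cover both is sketched below. The helper name tally_items, the rating_levels vector, and the column names item and rating are all hypothetical choices made for this illustration, and the level labels assume the underscored values used in the sample data.

```r
library(dplyr)
library(tidyr)

# Illustrative sample data (same shape as the survey data in the question)
data <- data.frame(
  id    = c(42L, 48L, 49L, 50L),
  item1 = c("Moderately_adequate", "Moderately_adequate",
            "Moderately_adequate", "Slightly_adequate"),
  item2 = c("Completely_adequate", "Moderately_adequate",
            "Moderately_adequate", "Slightly_adequate"),
  item3 = c("Very_adequate", "Moderately_adequate",
            "Moderately_adequate", "Slightly_adequate"),
  item4 = c("Very_adequate", "Moderately_adequate",
            "Moderately_adequate", "Not_at_all_adequate")
)

# Assumed scale order, lowest to highest
rating_levels <- c("Not_at_all_adequate", "Slightly_adequate",
                   "Moderately_adequate", "Very_adequate",
                   "Completely_adequate")

# Hypothetical helper: tally the chosen item columns of a data frame
tally_items <- function(df, items, levels) {
  df %>%
    pivot_longer(cols = all_of(items),
                 names_to = "item", values_to = "rating") %>%
    # a factor fixes the row order and lets unused ratings be kept
    mutate(rating = factor(rating, levels = levels)) %>%
    # .drop = FALSE keeps ratings that an item never received (count 0)
    count(item, rating, .drop = FALSE) %>%
    pivot_wider(names_from = item, values_from = n, values_fill = 0)
}

result <- tally_items(data, c("item1", "item2", "item3", "item4"),
                      rating_levels)
```

Calling tally_items again with a different character vector of column names gives the same kind of table for another list of items.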
Data
I added underscores to the values in the item columns because data containing spaces is hard to copy.
data <- structure(list(id = c(42L, 48L, 49L, 50L),item1 = c("Moderately_adequate",
"Moderately_adequate", "Moderately_adequate", "Slightly_adequate"
), item2 = c("Completely_adequate", "Moderately_adequate", "Moderately_adequate",
"Slightly_adequate"), item3 = c("Very_adequate", "Moderately_adequate",
"Moderately_adequate", "Slightly_adequate"), item4 = c("Very_adequate",
"Moderately_adequate", "Moderately_adequate", "Not_at_all_adequate"
)), class = "data.frame", row.names = c(NA, -4L))
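If loading extra packages is not desirable, a similar table can probably be built in base R by tabulating each item column against a fixed set of levels. This is only a sketch under the same assumptions as the sample data: the rating_levels vector and the object names counts and tallied are made up for the illustration.

```r
# Illustrative sample data (same values as the structure() call above)
data <- data.frame(
  id    = c(42L, 48L, 49L, 50L),
  item1 = c("Moderately_adequate", "Moderately_adequate",
            "Moderately_adequate", "Slightly_adequate"),
  item2 = c("Completely_adequate", "Moderately_adequate",
            "Moderately_adequate", "Slightly_adequate"),
  item3 = c("Very_adequate", "Moderately_adequate",
            "Moderately_adequate", "Slightly_adequate"),
  item4 = c("Very_adequate", "Moderately_adequate",
            "Moderately_adequate", "Not_at_all_adequate")
)

# Assumed scale order, lowest to highest
rating_levels <- c("Not_at_all_adequate", "Slightly_adequate",
                   "Moderately_adequate", "Very_adequate",
                   "Completely_adequate")

# Tabulate every column except id; fixing the factor levels keeps
# zero counts and puts the rows in scale order
counts <- sapply(data[-1], function(x) table(factor(x, levels = rating_levels)))

# counts is a matrix (ratings x items); turn it into a data frame
tallied <- data.frame(rating = rownames(counts), counts, row.names = NULL)
```

Replacing data[-1] with a subset such as data[c("item1", "item2")] limits the tally to one list of items.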