How to reshape a PSPP table into another form of table
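I have a data frame produced from a PSPP Means table. I want to reshape it so that it is easier to manipulate in Calc.
What do I want to do?
- The table contains descriptive statistics, e.g. mean, SD, N.
- The levels of the categorical variables are laid out vertically.
V1 V1_levelA, V1_levelB, | V2 V2_levelA, V2_levelB ... etc.
- The descriptive statistics are displayed vertically.
I want the levels of the first variable laid out horizontally and the levels of the next variable laid out vertically. See the attached picture for more information.
The result must take into account that the table may be missing entire factor levels: levels that have no "values" are not included in the table in the input csv.
I hope this huge edit makes what I am asking clearer.
dput of a sample df similar to the one in the posted image: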
df <- structure(list(structure(c(2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L,
1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", "v1", "v2"), class = "factor"),
varA = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 1L, 1L,
2L, 3L, 3L, 4L, 4L), .Label = c("k1", "k2", "k3", "k4", "varA"
), class = "factor"), Age = structure(c(1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L), .Label = c("a1",
"a2", "Age"), class = "factor"), Mean = structure(1:15, .Label = c("10",
"11", "12", "13", "14", "15", "16", "17", "18", "19", "21",
"22", "23", "24", "25", "Mean"), class = "factor"), N = structure(c(1L,
8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 2L, 3L, 4L, 5L, 6L,
7L), .Label = c("1", "10", "12", "13", "14", "15", "16",
"2", "3", "4", "5", "6", "7", "8", "9", "N"), class = "factor")), row.names = 2:16, class = "data.frame")
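The missing factor level mentioned above can be located in the sample data with a short base-R check. This is only a sketch: the data frame below is a hand-typed stand-in for the dput output (plain character columns, and the column name `group` for the unnamed first column is an assumption), not the original object.

```r
# Hypothetical stand-in for the question's df (labels only, no statistics).
# Blank cells in `group` mean "same group as the row above".
df_check <- data.frame(
  group = c("v1", rep("", 7), "v2", rep("", 6)),
  varA  = c("k1", "k1", "k2", "k2", "k3", "k3", "k4", "k4",
            "k1", "k1", "k2", "k3", "k3", "k4", "k4"),
  Age   = c("a1", "a2", "a1", "a2", "a1", "a2", "a1", "a2",
            "a1", "a2", "a2", "a1", "a2", "a1", "a2"),
  stringsAsFactors = FALSE
)

# Forward-fill the blank group labels: count how many labels have been
# seen so far and use that count to index the non-blank labels.
has_label <- df_check$group != ""
df_check$group <- df_check$group[has_label][cumsum(has_label)]

# Cross-tabulate all three label columns and keep the empty cells.
counts <- as.data.frame(table(group = df_check$group,
                              varA  = df_check$varA,
                              Age   = df_check$Age))
missing_cells <- subset(counts, Freq == 0)
print(missing_cells)
```

Any combination listed in `missing_cells` has no row in the input, so a reshaped table has to represent it as `NA` rather than dropping the column.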
**Update**
Check:
My input and desired output:
https://postimg.cc/N2GTZd09
I'm still not clear about your expected output, as your input data and expected output don't match.
That aside, perhaps this is what you want?
library(tidyverse)
df %>%
  rename(group = 1) %>%                  # Name the first column
  mutate_at(1, na_if, "") %>%            # Replace "" with NA
  fill(group) %>%                        # Fill the NAs in the first column downward
  group_by(group) %>%
  nest() %>%                             # Nest data by group
  mutate(data = map(data, ~ .x %>%
    gather(k, v, -varA, -Age) %>%        # Wide to long
    unite(k, varA, k) %>%                # Unite varA with the variable column
    spread(k, v))) %>%                   # Spread from long to wide
  unnest(data)                           # Unnest the list-column
## A tibble: 4 x 10
# group Age k1_Mean k1_N k2_Mean k2_N k3_Mean k3_N k4_Mean k4_N
# <fct> <fct> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#1 v1 a1 10 1 12 3 14 5 16 7
#2 v1 a2 11 2 13 4 15 6 17 8
#3 v2 a1 18 9 NA NA 22 13 24 15
#4 v2 a2 19 10 21 12 23 14 25 16
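With a more recent tidyr, the same reshape can probably be written without nesting, using `pivot_wider()`. The following is a minimal sketch under stated assumptions: the data frame is a hand-typed stand-in for the question's df with plain character and integer columns (the original has factor columns, which would first need converting, e.g. with `as.numeric(as.character(x))`), and the first column is assumed to be named `group` already.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the question's df.
# Blank `group` cells mean "same group as the row above";
# the v2 / k2 / a1 combination is absent on purpose.
df_plain <- data.frame(
  group = c("v1", rep("", 7), "v2", rep("", 6)),
  varA  = c("k1", "k1", "k2", "k2", "k3", "k3", "k4", "k4",
            "k1", "k1", "k2", "k3", "k3", "k4", "k4"),
  Age   = c("a1", "a2", "a1", "a2", "a1", "a2", "a1", "a2",
            "a1", "a2", "a2", "a1", "a2", "a1", "a2"),
  Mean  = c(10:19, 21:25),
  N     = c(1:10, 12:16),
  stringsAsFactors = FALSE
)

wide <- df_plain %>%
  mutate(group = na_if(group, "")) %>%   # blank label -> NA
  fill(group) %>%                        # carry the last label downward
  pivot_wider(
    names_from  = varA,                  # levels of varA go across
    values_from = c(Mean, N),            # one column per statistic and level
    names_glue  = "{varA}_{.value}"      # e.g. k1_Mean, k1_N
  )

print(wide)
```

The result has one row per `group` and `Age` combination, and the absent v2 / k2 / a1 cell comes out as `NA` in `k2_Mean` and `k2_N`. The order of the statistic columns may differ from the output shown above.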