Pivoting a table and separating columns
Below is a short example of my table.
library(data.table)
library(dplyr)
Table2<-structure(list(Participant = c("ER", "EA"), Country = c("Belgium",
"Bulgaria"), Y_0_4.Male = c(0, 0), Y_0_4.Female = c(0, 0), Y_5_9.Male = c(0,
3), Y_5_9.Female = c(5, 0), Total = c(5, 3), Data = c(2018, 2018
)), row.names = c(NA, -2L), class = c("data.table", "data.frame"
))
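Now I want to do two things with my table.
First, I want to gather the age groups encoded in the column names (e.g. Y_0_4 and Y_5_9) into a single column titled Age.
Second, I want to split the headers containing the words Female and Male into two separate columns, so the result has one row per participant and age group, with Age, Male and Female columns.
Can anybody help me solve this?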
You can use pivot_longer from tidyr:
library(tidyr)
library(dplyr)
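# The dot must be escaped as '\\.' inside an R string; a bare '\.' is a parse error.
pivot_longer(Table2, matches('\\.'), names_sep = '\\.', names_to = c('Age', '.value')) %>%
  mutate(Total = Male + Female)  # recompute Total per age group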
#> # A tibble: 4 x 7
#> Participant Country Total Data Age Male Female
#> <chr> <chr> <dbl> <dbl> <chr> <dbl> <dbl>
#> 1 ER Belgium 0 2018 Y_0_4 0 0
#> 2 ER Belgium 5 2018 Y_5_9 0 5
#> 3 EA Bulgaria 0 2018 Y_0_4 0 0
#> 4 EA Bulgaria 3 2018 Y_5_9 3 0
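To make the role of the special '.value' entry in names_to more concrete, here is a minimal sketch of what I believe is an equivalent two-step version: first pivot fully long, then spread the sex labels back out into columns. The names long, wide, Sex and Count are illustrative choices of mine and are not part of the answer above; Table2 is rebuilt as a plain data.frame so the snippet runs on its own.

```r
library(tidyr)

# Same data as in the question, rebuilt as a plain data.frame.
Table2 <- data.frame(
  Participant  = c("ER", "EA"),
  Country      = c("Belgium", "Bulgaria"),
  Y_0_4.Male   = c(0, 0),
  Y_0_4.Female = c(0, 0),
  Y_5_9.Male   = c(0, 3),
  Y_5_9.Female = c(5, 0),
  Total        = c(5, 3),
  Data         = c(2018, 2018)
)

# Step 1: split every "<age>.<sex>" header into an Age and a Sex column
# (Sex and Count are hypothetical names chosen for this sketch).
long <- pivot_longer(Table2, matches("\\."),
                     names_to = c("Age", "Sex"), names_sep = "\\.",
                     values_to = "Count")

# Step 2: turn the Sex labels back into the Male and Female columns.
wide <- pivot_wider(long, names_from = Sex, values_from = Count)
wide
```

Using '.value' in names_to does both steps in a single call, which is why the answer above needs no pivot_wider.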
You can use melt() from the data.table library:
Reprex
- Code
library(data.table)
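# Backslashes are doubled because '\d' and '\.' are not valid escapes in an R string.
melt(Table2,
     id.vars = c("Participant", "Country", "Data"),
     measure.vars = patterns("\\d\\.M", "\\d\\.F"),
     variable.name = "Age",
     value.name = c("Male", "Female"))[
  # Replace the numeric Age index with the age prefix of the column names, then recompute Total.
  , `:=` (Age = tstrsplit(grep("\\d\\.[MF]", names(Table2), value = TRUE), "\\.")[[1]],
          Total = Male + Female)][order(Country), ][]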
- Output
#> Participant Country Data Age Male Female Total
#> 1: ER Belgium 2018 Y_0_4 0 0 0
#> 2: ER Belgium 2018 Y_5_9 0 5 5
#> 3: EA Bulgaria 2018 Y_0_4 0 0 0
#> 4: EA Bulgaria 2018 Y_5_9 3 0 3
Created on 2022-03-14 by the reprex package (v2.0.1)
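One caveat about the melt() approach: the Age assignment above lines up only because this example has two participants and two age groups. As a hedged alternative, the sketch below maps the index that melt() stores in the Age column back to the age prefix, which should not depend on the number of rows. The helper names male_cols, age_labels and long are my own, and the sketch assumes the Male and Female columns appear in the same age order.

```r
library(data.table)

# Same data as in the question.
Table2 <- data.table(
  Participant  = c("ER", "EA"),
  Country      = c("Belgium", "Bulgaria"),
  Y_0_4.Male   = c(0, 0),
  Y_0_4.Female = c(0, 0),
  Y_5_9.Male   = c(0, 3),
  Y_5_9.Female = c(5, 0),
  Total        = c(5, 3),
  Data         = c(2018, 2018)
)

# Age labels taken from the Male columns, in column order
# (male_cols and age_labels are hypothetical helper names).
male_cols  <- grep("\\.Male$", names(Table2), value = TRUE)
age_labels <- sub("\\.Male$", "", male_cols)

long <- melt(Table2,
             id.vars = c("Participant", "Country", "Data"),
             measure.vars = patterns("\\.Male$", "\\.Female$"),
             variable.name = "Age",
             value.name = c("Male", "Female"))

# melt() stores the position of each matched column pair (1, 2, ...) in Age;
# use that position to look up the matching label.
long[, Age := age_labels[as.integer(Age)]]
long[, Total := Male + Female]
setorder(long, Country)
long[]
```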