Reordering my reshape: long to wide with pivot_wider, different column order
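I need to reshape a long dataset (df, below) into a wide one, where several variables stay the same across the long rows for a given ID while others change row by row. Dummy data below: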
ID = c("A", "A", "B", "B", "B", "C", "C")
Name = c("mary", "mary", "berry", "berry", "berry", "paul", "paul")
Set = c("set1", "set2", "set1", "set2", "set3", "set1", "set2")
Street = c("123 St", "234 St", "543 St", "492 st", "231 st", "492 st", "231 st")
State = c("al", "nc", "fl", "ca", "md", "tx", "vt")
df = data.frame(ID, Name, Set, Street, State)
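I reshaped it with pivot_wider, but the columns come out in a different order than I want. Since the real data has 20 sets per entry and 7 variables per set, is there a simple way to get the order I want while reshaping?
It looks like this: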
test <- pivot_wider(df, names_from = c("Set"), values_from = c("Street", "State"))
test
# A tibble: 3 x 8
ID Name Street_set1 Street_set2 Street_set3 State_set1 State_set2 State_set3
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 A mary 123 St 234 St NA al nc NA
2 B berry 543 St 492 st 231 st fl ca md
3 C paul 492 st 231 st NA tx vt NA
But what I want is for it to look like this:
ID Name Set1_Street Set1_State Set2_Street Set2_State Set3_Street Set3_State
1 A mary 123 St al 234 St nc <NA> <NA>
2 B berry 543 St fl 492 st ca 231 st md
3 C paul 492 st tx 231 st vt <NA> <NA>
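If you have thoughts on this, I'd also love your opinion on which option (reshape, spread) works better for large datasets!
Edit: I left out the pivot_wider command I used, fixed! Oops.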
It may be easier to use names_glue in pivot_wider:
library(dplyr)
library(tidyr)
df %>%
  pivot_wider(names_from = Set, values_from = c(Street, State),
              names_glue = "{tools::toTitleCase(Set)}_{.value}") %>%
  # keep ID and Name first, then order the remaining columns by set number
  dplyr::select(ID, Name, order(readr::parse_number(names(.)[-(1:2)])) + 2)
-output
# A tibble: 3 × 8
ID Name Set1_Street Set1_State Set2_Street Set2_State Set3_Street Set3_State
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 A mary 123 St al 234 St nc <NA> <NA>
2 B berry 543 St fl 492 st ca 231 st md
3 C paul 492 st tx 231 st vt <NA> <NA>
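To make the select() step above less opaque, here is a minimal sketch of the index arithmetic on its own. It assumes the column names come out of pivot_wider in the default order (all Street columns, then all State columns) and uses base R's gsub() in place of readr::parse_number(), purely so the snippet has no dependencies; the variable names are made up for illustration.

```r
# Column names as pivot_wider returns them by default (assumed order)
nms <- c("ID", "Name",
         "Set1_Street", "Set2_Street", "Set3_Street",
         "Set1_State", "Set2_State", "Set3_State")

# Drop the two id columns and pull the set number out of each remaining name
pivoted <- nms[-(1:2)]
set_number <- as.numeric(gsub("\\D", "", pivoted))  # 1 2 3 1 2 3

# order() is stable, so within a set Street stays ahead of State;
# + 2 shifts the result back to positions in the full data frame
positions <- order(set_number) + 2                  # 3 6 4 7 5 8

reordered <- nms[c(1, 2, positions)]
reordered
```

The stable sort is what keeps the variables inside each set in their original relative order, which should matter more with 7 variables per set.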
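Following the suggestion in GitHub issue #839, "FR: order of columns resulting from pivot_wider", you can work around this by building the "spec" manually. That sounded harder to me than it actually is.
Here is what it looks like for your data. You first define the spec with build_wider_spec(), then put its rows in the order you want the columns to appear (using arrange() or something similar). In your case you want to order by Set. You can see I added names_glue to change the column names, but that step is not required.
Once that is done, pass the spec object you created, together with your dataset, to pivot_wider_spec().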
library(tidyr)
library(dplyr)
spec <- build_wider_spec(df, names_from = "Set", values_from = c("Street", "State"),
                         names_glue = "{Set}_{.value}")
# the row order of the spec determines the column order of the result
spec <- arrange(spec, Set)
spec
#> # A tibble: 6 x 3
#> .name .value Set
#> <chr> <chr> <chr>
#> 1 set1_Street Street set1
#> 2 set1_State State set1
#> 3 set2_Street Street set2
#> 4 set2_State State set2
#> 5 set3_Street Street set3
#> 6 set3_State State set3
pivot_wider_spec(df, spec)
#> # A tibble: 3 x 8
#> ID Name set1_Street set1_State set2_Street set2_State set3_Street
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 A mary 123 St al 234 St nc <NA>
#> 2 B berry 543 St fl 492 st ca 231 st
#> 3 C paul 492 st tx 231 st vt <NA>
#> # ... with 1 more variable: set3_State <chr>
Created on 2021-12-17 by the reprex package (v2.0.0)
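One thing to watch with 20 sets: arranging the spec on Set as a string would likely put "set10" ahead of "set2". A minimal sketch of sorting on the numeric part instead is below. The df_many data and the gsub() pattern are my own assumptions for illustration, not part of the question, and the pattern assumes each Set value contains exactly one number.

```r
library(tidyr)
library(dplyr)

# Hypothetical data where alphabetical sorting would misplace "set10"
df_many <- data.frame(
  ID     = rep("A", 3),
  Name   = rep("mary", 3),
  Set    = c("set1", "set2", "set10"),
  Street = c("123 St", "234 St", "345 St"),
  State  = c("al", "nc", "fl")
)

spec_many <- build_wider_spec(df_many, names_from = "Set",
                              values_from = c("Street", "State"),
                              names_glue = "{Set}_{.value}")

# Sort on the number inside Set rather than on the string itself
spec_many <- arrange(spec_many, as.numeric(gsub("\\D", "", Set)))

wide_many <- pivot_wider_spec(df_many, spec_many)
names(wide_many)
```

If the set labels are not numbered, converting Set to a factor with explicit levels before arranging should give the same kind of control.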
Since issue #839 has finally been resolved in tidyr, you can now do this directly with the names_vary argument:
library(tidyr)
pivot_wider(df, names_from = c("Set"), values_from = c("Street", "State"), names_vary = 'slowest')
#> # A tibble: 3 x 8
#> ID Name Street_set1 State_set1 Street_set2 State_set2 Street_set3
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 A mary 123 St al 234 St nc <NA>
#> 2 B berry 543 St fl 492 st ca 231 st
#> 3 C paul 492 st tx 231 st vt <NA>
#> # ... with 1 more variable: State_set3 <chr>
Data
ID = c("A", "A", "B", "B", "B", "C", "C")
Name = c("mary", "mary", "berry", "berry", "berry", "paul", "paul")
Set = c("set1", "set2", "set1", "set2", "set3", "set1", "set2")
Street = c("123 St", "234 St", "543 St", "492 st", "231 st", "492 st", "231 st")
State = c("al", "nc", "fl", "ca", "md", "tx", "vt")
df = data.frame(ID, Name, Set, Street, State)
Created on 2022-02-18 by the reprex package (v2.0.1)
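If you also want the set name to lead each column name, as in the desired output, names_vary can be combined with names_glue. This is a sketch that assumes a tidyr version recent enough to have names_vary; the lowercase "set1" prefix comes straight from the data, so wrap Set in tools::toTitleCase() inside the glue string if you want "Set1".

```r
library(tidyr)

df <- data.frame(
  ID     = c("A", "A", "B", "B", "B", "C", "C"),
  Name   = c("mary", "mary", "berry", "berry", "berry", "paul", "paul"),
  Set    = c("set1", "set2", "set1", "set2", "set3", "set1", "set2"),
  Street = c("123 St", "234 St", "543 St", "492 st", "231 st", "492 st", "231 st"),
  State  = c("al", "nc", "fl", "ca", "md", "tx", "vt")
)

# names_vary = "slowest" groups the columns by set;
# names_glue puts the set name in front of the variable name
wide <- pivot_wider(df, names_from = Set, values_from = c(Street, State),
                    names_glue = "{Set}_{.value}", names_vary = "slowest")
names(wide)
```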