How to pivot_longer a set of multiple columns, and how to go back from that long format to the original wide format?
If I have the following data:
D = tibble::tribble(
~firm, ~ind, ~var1_1, ~var1_2, ~op2_1, ~op2_2,
"A", 1, 10, 11, 11, 12,
"A", 2, 12, 13, 13, 14,
"B", 1, 14, 15, 15, 16,
"B", 2, 16, 17, 17, 18,
"C", 1, 18, 19, 19, 20,
"C", 2, 20, 21, 21, 22,
)
How can I pivot_longer() var1 and op2, treating the "_*" suffix as the year indicator?
I mean, I would like something like this:
D %>%
pivot_longer(var1_1:op2_2,
names_to = c(".value", "year"),
names_pattern = "(.*)_(.*)",
values_to = c("var1, var2")
)
# A tibble: 12 x 5
firm ind year var1 op2
<chr> <dbl> <chr> <dbl> <dbl>
1 A 1 1 10 11
2 A 1 2 11 12
3 A 2 1 12 13
4 A 2 2 13 14
5 B 1 1 14 15
6 B 1 2 15 16
7 B 2 1 16 17
8 B 2 2 17 18
9 C 1 1 18 19
10 C 1 2 19 20
11 C 2 1 20 21
12 C 2 2 21 22
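I achieved the desired result with the code above. However, in my real case I am dealing with more than 30 variables and 10 years, so using values_to is neither practical nor clean. I would like the code to read the first part of each column name as the new variable name, since all the columns to be pivoted are initially named like "varname_year".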
Besides, once the data is in long format, I might need to go back to wide format, keeping the initial data structure.
We can use one of the select helpers:
library(dplyr)
library(tidyr)
library(stringr)
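# matches() picks every column named like "varname_year", so op2_* is included as well
Dlong <- D %>%
  pivot_longer(cols = matches("_\\d+$"),
               names_to = c(".value", "year"), names_sep = "_")
To go from the 'long' format back to 'wide', use pivot_wider:
# names_from = year rebuilds the original var1_1, var1_2, op2_1, op2_2 columns
Dlong %>%
  pivot_wider(names_from = year, values_from = c(var1, op2))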
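With 30+ variables, some column names may contain underscores in the variable part, in which case names_sep = "_" would split them into too many pieces. A minimal sketch of one way to handle that, and to check that the round trip is lossless, is below. The data D2 and its column names (net_sales, op_cost) are hypothetical and only stand in for the real case; the names_pattern regex assumes the year is always the trailing run of digits.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: the variable part of each name itself contains "_"
D2 <- tibble::tribble(
  ~firm, ~net_sales_1, ~net_sales_2, ~op_cost_1, ~op_cost_2,
  "A",   10,           11,           11,         12,
  "B",   14,           15,           15,         16
)

# Wide -> long: everything before the last "_" becomes the variable name
# (.value), and the trailing digits become the year
D2long <- D2 %>%
  pivot_longer(cols = -firm,
               names_to = c(".value", "year"),
               names_pattern = "(.*)_(\\d+)$")

# Long -> wide: every column except the identifiers is spread by year;
# the default names_sep = "_" glues "net_sales" and "1" back together
D2wide <- D2long %>%
  pivot_wider(names_from = year, values_from = c(net_sales, op_cost))

# Round-trip check: column names and contents should match the input
identical(names(D2), names(D2wide))
all.equal(as.data.frame(D2), as.data.frame(D2wide))
```

Both checks should report TRUE if the round trip preserved the original structure. Note that year comes out of pivot_longer as a character column; that does not affect the reshaping back to wide.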