Pivot multiple sets of columns at once
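I have a data frame where I regressed a bunch of datasets, then took subsets of those datasets and regressed again. The result is a data frame with columns showing the slope, intercept, and standard error for the "full" dataset, followed by more columns showing the same quantities for the "subset" dataset.
I would like to pivot the data into long format, with one column showing which type it is (full or subset) and the other columns showing slope, se, intercept, etc.
I found a way to do this by calling pivot_longer several times and then filtering to the rows where the new columns created by the pivots match, but this can't be the best way to do it. Is there a way to list sets of columns in the pivot function so I can skip this big chunk of code? Reprex below.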
library(tidyverse)

# make the data frame
df <-
  tribble(
    ~trial, ~full_slope, ~full_slope_se, ~subset_slope, ~subset_slope_se,
    1, 10, 1, 12, 2.5,
    2, 9, 1.2, 8.5, 3,
    3, 9.5, 2, 9.9, 3
  )
# pivot
df %>%
  # first pivot the slope columns
  pivot_longer(cols = c(full_slope, subset_slope),
               names_to = "type",
               values_to = "slope") %>%
  # next pivot the SE columns
  pivot_longer(cols = c(full_slope_se, subset_slope_se),
               names_to = "type_se",
               values_to = "se") %>%
  # add a column for when they match up (slope and se both from same dataset, full or subset)
  mutate(
    data_type =
      case_when(
        type == "full_slope" & type_se == "full_slope_se" ~ "full",
        type == "subset_slope" & type_se == "subset_slope_se" ~ "subset"
      )) %>%
  # remove rows that are mismatched
  filter(!is.na(data_type)) %>%
  # remove extra columns made by pivots
  select(-type, -type_se) %>%
  relocate(data_type, .after = trial)
This does give me the output I want; I just feel like it's not how I should be doing it. Thanks in advance!
Use names_pattern=:
df %>%
  pivot_longer(-trial, names_pattern = "([^_]*)_(.*)", names_to = c("data_table", ".value"))
# # A tibble: 6 x 4
#   trial data_table slope slope_se
#   <dbl> <chr>      <dbl>    <dbl>
# 1     1 full        10        1
# 2     1 subset      12        2.5
# 3     2 full         9        1.2
# 4     2 subset       8.5      3
# 5     3 full         9.5      2
# 6     3 subset       9.9      3
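To see why this works, it may help to look at how the pattern splits each column name. The sketch below is only an illustration: the helper objects `cols`, `name_parts`, and `long` are names made up here, and the final rename of `slope_se` to `se` is an assumption about the desired output, based on the `se` column produced by the code in the question. The first capture group (everything before the first underscore) goes to `data_table`, and the second group (everything after it) is used as the output column name because of the special `".value"` entry in `names_to`.

```r
library(tidyr)
library(tibble)

# Same example data as in the question.
df <- tribble(
  ~trial, ~full_slope, ~full_slope_se, ~subset_slope, ~subset_slope_se,
  1, 10,  1,   12,  2.5,
  2, 9,   1.2, 8.5, 3,
  3, 9.5, 2,   9.9, 3
)

# Inspect how the regex splits each column name into its two capture groups.
cols <- setdiff(names(df), "trial")
pattern <- "([^_]*)_(.*)"
name_parts <- data.frame(
  column     = cols,
  data_table = sub(pattern, "\\1", cols),  # text before the first underscore
  value_name = sub(pattern, "\\2", cols)   # everything after the first underscore
)
print(name_parts)

# One pivot: group 1 becomes the data_table column, group 2 names the value columns.
long <- pivot_longer(
  df,
  cols = -trial,
  names_pattern = pattern,
  names_to = c("data_table", ".value")
)

# Optional (assumption): rename slope_se to se to match the question's column name.
names(long)[names(long) == "slope_se"] <- "se"
print(long)
```

Because the first group stops at the first underscore, the same call should also pick up additional columns such as a hypothetical `full_intercept` or `subset_intercept_se` without further changes, as long as every column name starts with the dataset label.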