How to `pivot_wider` multiple columns without combining the names?
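I would like to apply `pivot_wider` to a tidyverse tibble across multiple columns without combining the names. My data contains information about patients' medication. A given patient may or may not take several drugs. The columns contain the drug names and daily doses in arbitrary order.
The problem is that my current approach produces too many columns, because the drug names are combined. Please see this reproducible example: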
library(tidyverse)
# Let's have data of patients who may or may not take some drugs with a
# corresponding dose:
(
  medication <- tribble(
    ~dob,         ~drug_1,  ~dose_1, ~drug_2,  ~dose_2, ~drug_3,  ~dose_3,
    "1970-01-01", "Drug A", 100,     NA,       NA,      NA,       NA,
    "1980-01-01", "Drug B", 150,     "Drug A", 200,     NA,       NA,
    "1990-01-01", NA,       NA,      "Drug C", 500,     "Drug B", 100
  )
)
# The desired arrangement is as follows:
#
# dob | 'Drug A' | 'Drug B' | 'Drug C'
# -----------|-------------------------------
# 1970-01-01 | 100 | NA | NA
# 1980-01-01 | 200 | 150 | NA
# 1990-01-01 | NA | 100 | 500
# The following attempt to pivot wider creates too many columns by combining all the drug names:
medication %>%
  pivot_wider(names_from = starts_with("drug_"),
              values_from = starts_with("dose_"))
# # A tibble: 3 × 10
# dob `dose_1_Drug A_NA_NA` `dose_1_Drug B_Drug A_NA` `dose_1_NA_Drug C_Drug B` `dose_2_Drug A_NA_NA` `dose_2_Drug B_Drug A_NA` `dose_2_NA_Drug C_Drug B` `dose_3_Drug A_NA_NA` `dose_3_Drug B_Drug A_NA` `dose_3_NA_Drug C_Drug B`
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 1970-01-01 100 NA NA NA NA NA NA NA NA
# 2 1980-01-01 NA 150 NA NA 200 NA NA NA NA
# 3 1990-01-01 NA NA NA NA NA 500 NA NA 100
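I even tried applying the `pivot_wider` function several times, each time explicitly stating names_from = "drug_1" and so on, but this leads to a "column already exists" error.
Is there a way to achieve the desired arrangement shown above? Thank you.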
Do it in two steps, pivoting longer first so that each drug/dose pair gets its own row:
medication %>%
  pivot_longer(
    cols = -dob,
    names_to = c(".value", NA),
    names_sep = "_",
    values_drop_na = TRUE
  ) %>%
  pivot_wider(
    names_from = drug,
    values_from = dose
  )
# A tibble: 3 x 4
dob `Drug A` `Drug B` `Drug C`
<chr> <dbl> <dbl> <dbl>
1 1970-01-01 100 NA NA
2 1980-01-01 200 150 NA
3 1990-01-01 NA 100 500
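To see why this works, it may help to look at the intermediate long table. The sketch below runs the two steps separately on the same example data. The names `medication_long` and `medication_wide` are only illustrative and are not part of the original answer.

```r
library(tidyr)
library(tibble)

# Same example data as in the question.
medication <- tribble(
  ~dob,         ~drug_1,  ~dose_1, ~drug_2,  ~dose_2, ~drug_3,  ~dose_3,
  "1970-01-01", "Drug A", 100,     NA,       NA,      NA,       NA,
  "1980-01-01", "Drug B", 150,     "Drug A", 200,     NA,       NA,
  "1990-01-01", NA,       NA,      "Drug C", 500,     "Drug B", 100
)

# Step 1: split each column name at "_". The first piece (".value")
# becomes the output column name ("drug" or "dose"); the second piece
# (the index 1, 2, 3) is discarded because its slot in names_to is NA.
# values_drop_na = TRUE drops the drug/dose slots that are empty.
medication_long <- pivot_longer(
  medication,
  cols = -dob,
  names_to = c(".value", NA),
  names_sep = "_",
  values_drop_na = TRUE
)
medication_long  # one row per patient/drug pair: columns dob, drug, dose

# Step 2: with a single drug column left, pivot_wider has only one
# source of names, so each drug gets exactly one column.
medication_wide <- pivot_wider(
  medication_long,
  names_from = drug,
  values_from = dose
)
medication_wide
```

Once the slot index is gone, the position in which a drug was recorded no longer matters, which is why the combined column names from the original attempt do not appear.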