Flattening/widening a dataset to show multiple trials of a single analyte in one row

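I am trying to flatten/widen my data frame, organized by sample name. Multiple trials were run on each sample, and I want to arrange all of a sample's trials in a single row.

Example data: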

Sample_Name <- c("M1","M1","M1","M1","M2","M2","M2","M2")
test_ID <- c("Gen1 Spec1", "Gen2 Spec2", "Gen2 Spec2", "Gen2 Spec2", "Gen3 Spec3", "Gen3 Spec3", "Gen4 Spec4", "Gen4 Spec4")
MScore <- c(2.2, 1.9, 2.1, 2.0, 1.0, 2.0, 1.4, 1.5)
Test_Data <- data.frame(Sample_Name, test_ID, MScore)

The shape of the output I want:

Target_Sample_Name <- c("M1", "M2")
Trial_1_ID <- c("Gen1 Spec1", "Gen2 Spec2")
Trial_1_Score <- c(2.2, 1.0)
Trial_2_ID <- c("Gen2 Spec2", "Gen3 Spec3")
Trial_2_Score <- c(1.9, 2.0)
Trial_3_ID <- c("Gen2 Spec2", "Gen4 Spec4")
Trial_3_Score <- c(2.1, 1.4)
Trial_4_ID <- c("Gen2 Spec2", "Gen4 Spec4")
Trial_4_Score <- c(2.0, 1.5)
Desired_Output <- data.frame(Target_Sample_Name, Trial_1_ID, Trial_1_Score, Trial_2_ID, Trial_2_Score, Trial_3_ID, Trial_3_Score, Trial_4_ID, Trial_4_Score)

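I'm sure there is a better way to show what I am trying to do, but I'm super new to this and haven't found it yet.

I have tried using aggregate, but I don't know what FUN to use. I have also tried the pivot_wider function, but I couldn't get it to work. I know this is a strange way to organize data, but I promise it makes sense in the context of my project!

Thanks!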

You could use

library(dplyr)
library(tidyr)

Test_Data %>% 
  group_by(Sample_Name) %>% 
  # number the trials within each sample
  mutate(rn = row_number()) %>% 
  pivot_wider(id_cols = Sample_Name,
              names_from = rn,
              names_glue = "{.value}_{rn}",
              values_from = c("test_ID", "MScore")) %>% 
  # test_ID_1 -> Trial_1_ID, MScore_1 -> Trial_1_Score (backslashes must be doubled in R strings)
  rename_with(~gsub("test_ID_(\\d+)", "Trial_\\1_ID", .x), starts_with("test_ID")) %>% 
  rename_with(~gsub("MScore_(\\d+)", "Trial_\\1_Score", .x), starts_with("MScore")) %>% 
  # sort the columns by name so each trial's ID and score sit side by side
  select(colnames(.)[order(colnames(.))]) %>%
  ungroup()

This returns

# A tibble: 2 x 9
  Sample_Name Trial_1_ID Trial_1_Score Trial_2_ID Trial_2_Score Trial_3_ID Trial_3_Score
  <chr>       <chr>              <dbl> <chr>              <dbl> <chr>              <dbl>
1 M1          Gen1 Spec1           2.2 Gen2 Spec2           1.9 Gen2 Spec2           2.1
2 M2          Gen3 Spec3           1   Gen3 Spec3           2   Gen4 Spec4           1.4
# ... with 2 more variables: Trial_4_ID <chr>, Trial_4_Score <dbl>
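
A possibly simpler variant, assuming tidyr 1.2.0 or later (needed for the names_vary argument): renaming the two value columns first lets names_glue build the Trial_n_ID / Trial_n_Score names from the question directly, so the gsub step and the alphabetical column sort are not needed. The short names ID and Score and the result name wide are only illustrative choices; treat this as a sketch rather than the one right way to do it.

```r
library(dplyr)
library(tidyr)

# Example data from the question
Test_Data <- data.frame(
  Sample_Name = c("M1", "M1", "M1", "M1", "M2", "M2", "M2", "M2"),
  test_ID = c("Gen1 Spec1", "Gen2 Spec2", "Gen2 Spec2", "Gen2 Spec2",
              "Gen3 Spec3", "Gen3 Spec3", "Gen4 Spec4", "Gen4 Spec4"),
  MScore = c(2.2, 1.9, 2.1, 2.0, 1.0, 2.0, 1.4, 1.5)
)

wide <- Test_Data %>%
  # short names (an illustrative choice) that become the column suffixes
  rename(ID = test_ID, Score = MScore) %>%
  group_by(Sample_Name) %>%
  # number the trials within each sample
  mutate(rn = row_number()) %>%
  ungroup() %>%
  pivot_wider(id_cols = Sample_Name,
              names_from = rn,
              values_from = c(ID, Score),
              names_glue = "Trial_{rn}_{.value}",
              # keep each trial's ID and Score next to each other
              names_vary = "slowest")

wide
```

Because the column order comes from names_vary rather than from sorting the names, this should also keep Trial_10 after Trial_9 if a sample ever has ten or more trials.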