How to reformat data so ID corresponds to two rows, one row contains sample source, the second row contains the result for the source

Question

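I am working with a clinical data set that has about 2500 unique IDs. Some IDs correspond to 20 or more occurrences. I would like to see the specimen type (NP, Throat, etc.) and the test result, either "Not Detected" or "Detected", spread across multiple columns, with each ID taking up essentially two rows. The first row holds the specimen type for each occurrence, and the second row holds the result for each occurrence. I can get the first row without trouble, but I have not been able to figure out how to add a second row for the same ID so that each result lines up below its corresponding specimen type. Any help would be much appreciated!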

ID <- c(1, 1, 2, 2, 3, 3, 3, 4)
Type <- c("EM", "EM", "PA", "PA", "PA", "PA", "PA", "EM")
Specimen_Type <- c("NP", "NP", "Throat", "Throat", "NP", "Throat", "Throat", "NP")
RESULT_VAL <- c("Not Detected", "Detected", "Not Detected", "Detected", "Not Detected", "Not Detected", "Detected", "Not Detected")
RESULT_DATE <- c("6-1-2020", "6-10-2020", "6-1-2020", "6-10-2020", "6-1-2020", "6-10-2020", "6-20-2020", "6-1-2020")
Data_sum <- data.frame(ID, Type, Specimen_Type, RESULT_VAL, RESULT_DATE)

I would like it to look like this:

ID    Type     Occurrence_1          Occurrence_2         Occurrence_3
1      EM        NP                    NP
1      EM        Not Detected          Detected
2      PA        Throat                Throat
2      PA        Not Detected          Detected
3      PA        NP                    Throat               Throat
3      PA        Not Detected          Not Detected         Detected
4      EM        NP
4      EM        Not Detected

Answer 1

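We can reshape to 'long' format first and then reshape back to 'wide'. Stacking Specimen_Type and RESULT_VAL into a single value column lets one pivot_wider call produce both rows per ID.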

library(dplyr)
library(stringr)
library(tidyr)
library(data.table)  # only needed for rowid()
Data_sum %>% 
    # stack specimen type and result into name/value pairs
    pivot_longer(cols = c(Specimen_Type, RESULT_VAL)) %>%
    # within each ID, put the specimen-type rows before the result rows
    arrange(ID, Type, 
        factor(name, levels = c('Specimen_Type', 'RESULT_VAL'))) %>% 
    # number the occurrences within each ID/Type/name group
    mutate(rn = str_c('Occurrence_', rowid(ID, Type, name))) %>% 
    select(-RESULT_DATE) %>% 
    # one column per occurrence
    pivot_wider(names_from = rn, values_from = value) %>%    
    select(-name)
# A tibble: 8 x 5
#     ID Type  Occurrence_1 Occurrence_2 Occurrence_3
#   <dbl> <chr> <chr>        <chr>        <chr>      
#1     1 EM    NP           NP           <NA>       
#2     1 EM    Not Detected Detected     <NA>       
#3     2 PA    Throat       Throat       <NA>       
#4     2 PA    Not Detected Detected     <NA>       
#5     3 PA    NP           Throat       Throat     
#6     3 PA    Not Detected Not Detected Detected   
#7     4 EM    NP           <NA>         <NA>       
#8     4 EM    Not Detected <NA>         <NA>
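
The long-then-wide approach numbers occurrences in whatever order the rows already appear. If the real data is not sorted by date, the variant below may help. It is only a sketch: it assumes RESULT_DATE is always month-day-year text, as in the example, and that occurrences should be numbered chronologically within each ID and Type. The name wide_by_date is made up for illustration. It uses only dplyr and tidyr, with row_number() in place of rowid().

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
Data_sum <- data.frame(
  ID = c(1, 1, 2, 2, 3, 3, 3, 4),
  Type = c("EM", "EM", "PA", "PA", "PA", "PA", "PA", "EM"),
  Specimen_Type = c("NP", "NP", "Throat", "Throat", "NP", "Throat", "Throat", "NP"),
  RESULT_VAL = c("Not Detected", "Detected", "Not Detected", "Detected",
                 "Not Detected", "Not Detected", "Detected", "Not Detected"),
  RESULT_DATE = c("6-1-2020", "6-10-2020", "6-1-2020", "6-10-2020",
                  "6-1-2020", "6-10-2020", "6-20-2020", "6-1-2020")
)

wide_by_date <- Data_sum %>%
  # Assumption: dates are month-day-year strings
  mutate(RESULT_DATE = as.Date(RESULT_DATE, format = "%m-%d-%Y")) %>%
  # Sort chronologically so occurrence numbers follow the test dates
  arrange(ID, Type, RESULT_DATE) %>%
  group_by(ID, Type) %>%
  mutate(rn = paste0("Occurrence_", row_number())) %>%
  ungroup() %>%
  select(-RESULT_DATE) %>%
  # Stack specimen type and result into name/value pairs
  pivot_longer(cols = c(Specimen_Type, RESULT_VAL)) %>%
  # Specimen-type row first, result row second, within each ID
  arrange(ID, Type, factor(name, levels = c("Specimen_Type", "RESULT_VAL"))) %>%
  pivot_wider(names_from = rn, values_from = value) %>%
  select(-name)

print(wide_by_date)
```

Because the occurrence number is assigned before pivot_longer, the specimen type and its result share the same Occurrence_ label by construction, so the two rows for an ID stay aligned even if the input rows were shuffled.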


r

reshape

spread