R: reshape a data frame every other column

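I'm having a hard time tidying some data that was collected in an odd way. It has a patient identifier, then a column with the test date, then a column with the corresponding measurement. The same test is repeated over time, and each repetition is stored in a further pair of columns.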

The data frame looks like this:

df1 <- data.frame(id = c("A","B"),
                 test1 = c("10-12-16", "12-10-17"),
                 test1_result = c("20", "3"),
                 test2 = c("10-01-17", "11-12-17"),
                 test2_result = c("18", "4"),
                 test3 = c("12-03-18", "NA"),
                 test3_result = c("300", "NA"))

I would like to get something like this:

df2 <- data.frame(id = c("A", "A", "A", "B", "B", "B"),
                 tests = c("10-12-16", "10-01-17", "12-03-18", "12-10-17", "11-12-17", "NA"),
                 results = c("20", "18", "300", "3", "4", "NA")
                 )

I can't figure out a way to transform it; any help would be much appreciated.

Thanks!

You can try melt from data.table:

library(data.table)
setDT(df1)

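# patterns() takes one regex per output column; the backslash must be doubled in an R string
df2 <- melt(df1, id.vars = 'id', measure.vars = patterns('test\\d$', '_result'))[
    , .(id, tests = value1, results = value2)]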

#    id    tests results
# 1:  A 10-12-16      20
# 2:  B 12-10-17       3
# 3:  A 10-01-17      18
# 4:  B 11-12-17       4
# 5:  A 12-03-18     300
# 6:  B       NA      NA
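
The output above is ordered by test number, while the desired df2 in the question is grouped by id. A minimal sketch of one way to get that ordering, assuming a data.table version in which melt accepts a vector for value.name (the name `long` is only illustrative):

```r
library(data.table)

# Same sample data as in the question
df1 <- data.frame(id = c("A", "B"),
                  test1 = c("10-12-16", "12-10-17"),
                  test1_result = c("20", "3"),
                  test2 = c("10-01-17", "11-12-17"),
                  test2_result = c("18", "4"),
                  test3 = c("12-03-18", "NA"),
                  test3_result = c("300", "NA"),
                  stringsAsFactors = FALSE)

# Work on a copy so df1 itself is left untouched
long <- melt(as.data.table(df1),
             id.vars = "id",
             measure.vars = patterns("^test\\d+$", "_result$"),
             value.name = c("tests", "results"))

# 'variable' holds the test number (1, 2, 3); sort by patient, then by test
setorder(long, id, variable)
long[, variable := NULL]
```

Naming the value columns inside melt avoids the separate renaming step, and sorting before dropping `variable` keeps each patient's tests in their original order.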

Here is one possibility using dplyr and tidyr:

library(tidyverse)
df1 %>% 
    gather(k1, results, contains("_result")) %>% 
    mutate(k1 = gsub("_result", "", k1)) %>% 
    gather(k2, tests, contains("test")) %>% 
    filter(k1 == k2) %>% 
    select(id, tests, results)
#  id    tests results
#1  A 10-12-16      20
#2  B 12-10-17       3
#3  A 10-01-17      18
#4  B 11-12-17       4
#5  A 12-03-18     300
#6  B       NA      NA
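
In current tidyr, gather is superseded by pivot_longer, which can do the same reshape in a single step. The following is a rough sketch rather than a definitive solution, assuming tidyr >= 1.0.0 and dplyr >= 1.0.0; the `_date` suffix is a hypothetical helper added only so that every column name splits cleanly into a test number and a value type:

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question
df1 <- data.frame(id = c("A", "B"),
                  test1 = c("10-12-16", "12-10-17"),
                  test1_result = c("20", "3"),
                  test2 = c("10-01-17", "11-12-17"),
                  test2_result = c("18", "4"),
                  test3 = c("12-03-18", "NA"),
                  test3_result = c("300", "NA"),
                  stringsAsFactors = FALSE)

df2 <- df1 %>%
    # test1 -> test1_date, so names look like <test>_<date|result>
    rename_with(~ paste0(.x, "_date"), matches("^test\\d+$")) %>%
    # ".value" turns the second name part into the output columns date and result
    pivot_longer(-id, names_to = c("test", ".value"), names_sep = "_") %>%
    select(id, tests = date, results = result)
```

Unlike the two gather calls, this keeps the rows grouped by id, which matches the layout asked for in the question.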