Reshape wide to long based on part of column name

Question

Is there a way to reshape wide to long in R based on the first part of the column name? I have the following data:

id |  Jan_shoulder | Jan_head | Jan_knee | Feb_shoulder | Feb_head | Feb_knee
1  |     yes       |    no    |    yes   |    no        |   no     |  no
2  |     no        |    no    |    no    |    yes       |   yes    |  no

I would like to convert it so that each row corresponds to a unique id and month, like this:

id |  month | shoulder | head | knee 
1  |  Jan   |    yes   |  no  |  yes
1  |  Feb   |    no    |  no  |  no
2  |  Jan   |    no    |  no  |  no
2  |  Feb   |    yes   |  yes |  no

Answer 1

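Using dplyr and tidyr, we can gather the data into long format, separate the column names into a month column and a body-part column, and then spread the body parts back into wide format.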

library(dplyr)
library(tidyr)

df %>%
  gather(key, value, -id) %>%
  separate(key, into = c("month", "part"), sep = "_") %>%
  spread(part, value)

#  id month head knee shoulder
#1  1   Feb   no   no       no
#2  1   Jan   no  yes      yes
#3  2   Feb  yes   no      yes
#4  2   Jan   no   no       no
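Note that spread sorts the month strings alphabetically, so Feb comes before Jan in the output above. If calendar order matters, one possible tweak (a sketch only, not part of the original answer) is to turn month into a factor whose levels follow base R's month.abb, then sort and reorder the columns explicitly. The name res and the inline copy of the question's data are just for illustration.

```r
library(dplyr)
library(tidyr)

# Example data copied from the question
df <- data.frame(
  id = 1:2,
  Jan_shoulder = c("yes", "no"), Jan_head = c("no", "no"), Jan_knee = c("yes", "no"),
  Feb_shoulder = c("no", "yes"), Feb_head = c("no", "yes"), Feb_knee = c("no", "no"),
  stringsAsFactors = FALSE
)

res <- df %>%
  gather(key, value, -id) %>%                               # wide -> long
  separate(key, into = c("month", "part"), sep = "_") %>%   # "Jan_head" -> "Jan", "head"
  # month.abb is base R's vector "Jan", "Feb", ..., "Dec"
  mutate(month = factor(month, levels = month.abb)) %>%
  spread(part, value) %>%                                   # one column per body part
  arrange(id, month) %>%                                    # calendar order within each id
  select(id, month, shoulder, head, knee)                   # column order from the question

res
```

In recent tidyr releases gather and spread are superseded by pivot_longer and pivot_wider, although they still work.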

Answer 2

With the development version of tidyr

We can reshape directly into 'long' format with pivot_longer

library(dplyr)
library(tidyr) # ‘0.8.3.9000’
library(stringr)
df1 %>%
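    # swap the two name parts: "Jan_shoulder" -> "shoulder_Jan" (backslashes must be doubled in R strings)
    rename_at(-1, ~ str_replace(., "(\\w+)_(\\w+)", "\\2_\\1")) %>% 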
    pivot_longer(-id, names_to = c(".value", "month"), names_sep='_')
# A tibble: 4 x 5
#     id month shoulder head  knee 
#  <int> <chr> <chr>    <chr> <chr>
#1     1 Jan   yes      no    yes  
#2     1 Feb   no       no    no   
#3     2 Jan   no       no    no   
#4     2 Feb   yes      yes   no
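As a possible simplification (a sketch, assuming tidyr 1.0.0 or later, where pivot_longer is part of the released package), the rename step may not be needed: the special ".value" entry can go second in names_to, so the month is read from the part before the underscore and the body part after it becomes the output column. The name long is just for illustration.

```r
library(tidyr)

# Example data copied from the question
df1 <- data.frame(
  id = 1:2,
  Jan_shoulder = c("yes", "no"), Jan_head = c("no", "no"), Jan_knee = c("yes", "no"),
  Feb_shoulder = c("no", "yes"), Feb_head = c("no", "yes"), Feb_knee = c("no", "no"),
  stringsAsFactors = FALSE
)

# "Jan_shoulder" is split at "_": the first piece fills the month column,
# the second piece (".value") names the column that receives the values
long <- pivot_longer(
  df1,
  cols = -id,
  names_to = c("month", ".value"),
  names_sep = "_"
)

long
```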

Or with melt from data.table

library(data.table)
name1 <- unique(sub("_.*", "", names(df1)[-1]))
melt(setDT(df1), measure = patterns("head", "shoulder", "knee"), 
        value.name = c("head", "shoulder", "knee"),
        variable.name = "month")[, month := name1[month]][]
#   id month head shoulder knee
#1:  1   Jan   no      yes  yes
#2:  2   Jan   no       no   no
#3:  1   Feb   no       no   no
#4:  2   Feb  yes      yes   no

Or in base R with reshape

reshape(df1, direction = 'long', idvar = 'id', 
        varying = list(c(2, 5), c(3, 6), c(4, 7)))
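Called as above, reshape names each value column after the first column in its group (Jan_shoulder, Jan_head, Jan_knee) and adds a numeric time column. A sketch of how the output could be brought closer to the requested layout, using the v.names, timevar and times arguments of reshape (the names long and the final sorting step are my additions for illustration):

```r
# Example data copied from the question
df1 <- data.frame(
  id = 1:2,
  Jan_shoulder = c("yes", "no"), Jan_head = c("no", "no"), Jan_knee = c("yes", "no"),
  Feb_shoulder = c("no", "yes"), Feb_head = c("no", "yes"), Feb_knee = c("no", "no"),
  stringsAsFactors = FALSE
)

long <- reshape(
  df1,
  direction = "long",
  idvar     = "id",
  varying   = list(c(2, 5), c(3, 6), c(4, 7)),  # (Jan, Feb) column pair per body part
  v.names   = c("shoulder", "head", "knee"),    # output names, same order as varying
  timevar   = "month",
  times     = c("Jan", "Feb")                   # labels for the two positions in each pair
)

# Sort by id, then by calendar month, and drop the generated row names
long <- long[order(long$id, match(long$month, month.abb)), ]
rownames(long) <- NULL

long
```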

Data

df1 <- structure(list(id = 1:2, Jan_shoulder = c("yes", "no"), Jan_head = c("no", 
"no"), Jan_knee = c("yes", "no"), Feb_shoulder = c("no", "yes"
), Feb_head = c("no", "yes"), Feb_knee = c("no", "no")),
  class = "data.frame", row.names = c(NA, 
-2L))
