Reshape wide to long based on part of column name
Is there a way to reshape wide to long in R based on the first part of the column name? I have the following data:
id | Jan_shoulder | Jan_head | Jan_knee | Feb_shoulder | Feb_head | Feb_knee
1 | yes | no | yes | no | no | no
2 | no | no | no | yes | yes | no
I would like to transform it so that each row corresponds to a unique id and month, like this:
id | month | shoulder | head | knee
1 | Jan | yes | no | yes
1 | Feb | no | no | no
2 | Jan | no | no | no
2 | Feb | yes | yes | no
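Using dplyr and tidyr, we can gather the data into long format, separate the column names into a month column and a body-part column, and then spread the body parts back into wide format.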
library(dplyr)
library(tidyr)
df1 %>%
  gather(key, value, -id) %>%                              # one row per id and original column
  separate(key, into = c("month", "part"), sep = "_") %>%  # "Jan_head" -> "Jan", "head"
  spread(part, value)                                      # one column per body part
# id month head knee shoulder
#1 1 Feb no no no
#2 1 Jan no yes yes
#3 2 Feb yes no yes
#4 2 Jan no no no
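Note that the output above lists Feb before Jan, because the month column is sorted alphabetically. A minimal sketch of one way to keep calendar order, assuming the month prefixes match the abbreviations in base R's built-in month.abb; the data frame is rebuilt here so the snippet runs on its own, and the name long is only illustrative:

```r
library(dplyr)
library(tidyr)

# Sample data from the question, rebuilt so the snippet is self-contained
df1 <- data.frame(
  id = 1:2,
  Jan_shoulder = c("yes", "no"), Jan_head = c("no", "no"), Jan_knee = c("yes", "no"),
  Feb_shoulder = c("no", "yes"), Feb_head = c("no", "yes"), Feb_knee = c("no", "no"),
  stringsAsFactors = FALSE
)

long <- df1 %>%
  gather(key, value, -id) %>%
  separate(key, into = c("month", "part"), sep = "_") %>%
  # Assumption: prefixes are standard abbreviations, so month.abb gives the order
  mutate(month = factor(month, levels = month.abb)) %>%
  spread(part, value) %>%
  arrange(id, month)  # factors sort by level, so Jan comes before Feb

long
```

If the prefixes are not standard month abbreviations, the levels would need to be supplied by hand.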
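With the development version of tidyr (pivot_longer was later released in tidyr 1.0.0), pivot_longer can reshape directly into 'long' format. The columns are first renamed so that the body part comes before the month, matching the order given in names_to.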
library(dplyr)
library(tidyr) # '0.8.3.9000'
library(stringr)
df1 %>%
  # swap the two halves of each name: "Jan_shoulder" -> "shoulder_Jan"
  rename_at(-1, ~ str_replace(., "(\\w+)_(\\w+)", "\\2_\\1")) %>%
  pivot_longer(-id, names_to = c(".value", "month"), names_sep = "_")
# A tibble: 4 x 5
# id month shoulder head knee
# <int> <chr> <chr> <chr> <chr>
#1 1 Jan yes no yes
#2 1 Feb no no no
#3 2 Jan no no no
#4 2 Feb yes yes no
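The renaming step above only exists to put the body part first in each name. As a sketch of a possible shortcut, assuming tidyr 1.0.0 or later, the ".value" marker can instead be placed second in names_to so that the original column names are used as they are. The data frame is rebuilt here so the snippet runs on its own, and the name long is only illustrative:

```r
library(tidyr)

# Sample data from the question, rebuilt so the snippet is self-contained
df1 <- data.frame(
  id = 1:2,
  Jan_shoulder = c("yes", "no"), Jan_head = c("no", "no"), Jan_knee = c("yes", "no"),
  Feb_shoulder = c("no", "yes"), Feb_head = c("no", "yes"), Feb_knee = c("no", "no"),
  stringsAsFactors = FALSE
)

# The part before "_" goes into a month column; the part after "_"
# (".value") becomes the name of an output column
long <- pivot_longer(
  df1,
  cols = -id,
  names_to = c("month", ".value"),
  names_sep = "_"
)

long
```

This should give the same four rows as the renamed version, with one column each for shoulder, head and knee.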
Or with melt from data.table:
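library(data.table)
# month prefixes in column order: "Jan", "Feb"
name1 <- unique(sub("_.*", "", names(df1)[-1]))
# melt numbers the measure groups 1, 2, ...; map those back to month names
melt(setDT(df1), measure = patterns("head", "shoulder", "knee"),
     value.name = c("head", "shoulder", "knee"),
     variable.name = "month")[, month := name1[month]][]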
# id month head shoulder knee
#1: 1 Jan no yes yes
#2: 2 Jan no no no
#3: 1 Feb no no no
#4: 2 Feb yes yes no
Or in base R with reshape:
reshape(df1, direction = 'long', idvar = 'id',
        varying = list(c(2, 5), c(3, 6), c(4, 7)))
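The reshape call above keeps the first column name of each group and uses a numeric time column. A minimal sketch of how the output could be labelled like the desired result, using the v.names, timevar and times arguments of reshape; the varying columns are given by name rather than position, the data frame is rebuilt so the snippet runs on its own, and the name long is only illustrative:

```r
# Sample data from the question, rebuilt so the snippet is self-contained
df1 <- data.frame(
  id = 1:2,
  Jan_shoulder = c("yes", "no"), Jan_head = c("no", "no"), Jan_knee = c("yes", "no"),
  Feb_shoulder = c("no", "yes"), Feb_head = c("no", "yes"), Feb_knee = c("no", "no"),
  stringsAsFactors = FALSE
)

long <- reshape(
  df1,
  direction = "long",
  idvar = "id",
  # one vector per output column, listing its source column for each month
  varying = list(
    c("Jan_shoulder", "Feb_shoulder"),
    c("Jan_head", "Feb_head"),
    c("Jan_knee", "Feb_knee")
  ),
  v.names = c("shoulder", "head", "knee"),  # names of the stacked columns
  timevar = "month",                        # name of the new time column
  times = c("Jan", "Feb")                   # values matching the order in varying
)

long
```

The rows come out grouped by month rather than by id, so a final order(long$id) may be wanted.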
Data
df1 <- structure(list(id = 1:2,
    Jan_shoulder = c("yes", "no"), Jan_head = c("no", "no"), Jan_knee = c("yes", "no"),
    Feb_shoulder = c("no", "yes"), Feb_head = c("no", "yes"), Feb_knee = c("no", "no")),
    class = "data.frame", row.names = c(NA, -2L))