Reshaping from long to wide format in R: problem with variable renaming

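I have a dataset in long format that I want to reshape to wide format. I generally know how to do this, but the problem is with the variable names. In long format the variables look like this: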

ID time WSAS_01
1 1 4
1 2 3
2 1 6
2 2 8

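After reshaping, I would like the variable names to look like the following, so that time 1 becomes _r1_ (and time 2 becomes _r2_), placed in the middle of the name: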

ID WSAS_r1_01 WSAS_r2_01
1 4 3
2 6 8

Does anyone know how to do this?

With pivot_wider(), you can pass a glue specification to names_glue that uses the names_from columns (and the special .value) to build custom column names.

library(tidyr)
library(stringr)

df %>%
  pivot_wider(
    names_from = time,
    names_glue = "{str_replace(.value, '(?=_)', str_c('_r', time))}",
    values_from = WSAS_01)

# # A tibble: 2 × 3
#      ID WSAS_r1_01 WSAS_r2_01
#   <int>      <int>      <int>
# 1     1          4          3
# 2     2          6          8

This approach also works in the more general case where values_from contains multiple columns:

df <- data.frame(
  ID = rep(1:2, each = 2),
  time = rep(1:2, 2),
  WSAS_01 = c(4, 3, 6, 8),
  WSAS_02 = c(1, 3, 5, 7)
)

df %>%
  pivot_wider(
    names_from = time,
    names_glue = "{str_replace(.value, '(?=_)', str_c('_r', time))}",
    values_from = starts_with("WSAS"))

# # A tibble: 2 × 5
#      ID WSAS_r1_01 WSAS_r2_01 WSAS_r1_02 WSAS_r2_02
#   <int>      <dbl>      <dbl>      <dbl>      <dbl>
# 1     1          4          3          1          3
# 2     2          6          8          5          7
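To see what the names_glue expression above does, the renaming step can be run on its own outside pivot_wider(). This is only a minimal sketch: the vector `value` and the fixed time point 1 are stand-ins for what tidyr supplies as .value and time.

```r
library(stringr)

# Stand-in for .value: the value column names from the example data
value <- c("WSAS_01", "WSAS_02")

# '(?=_)' is a zero-width lookahead. It matches the empty position just
# before the first underscore, so the replacement text is inserted there
# and nothing in the original name is overwritten.
new_names <- str_replace(value, "(?=_)", str_c("_r", 1))

print(new_names)
# [1] "WSAS_r1_01" "WSAS_r1_02"
```

Because the pattern only looks for the first underscore, it does not depend on the length of the prefix or the suffix.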

You can try this:

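library(tidyr)
library(stringr)

df %>%
  pivot_wider(
    id_cols = ID,
    names_from = time,
    values_from = WSAS_01,
    # characters 1-4 of .value are the prefix, characters 6-7 the item number
    names_glue = "{paste0(str_sub(.value, 1, 4), '_r', time, '_', str_sub(.value, 6, 7))}"
  )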

Output:

# A tibble: 2 × 3
     ID WSAS_r1_01 WSAS_r2_01
  <dbl>      <dbl>      <dbl>
1     1          4          3
2     2          6          8
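The str_sub() calls in this answer cut the name by character position, which can be checked in isolation. A small sketch, assuming names shaped exactly like WSAS_01 (a four-character prefix, an underscore, and a two-character suffix); `value` and the fixed time point 1 are stand-ins for .value and time:

```r
library(stringr)

# Stand-in for .value
value <- "WSAS_01"

prefix <- str_sub(value, 1, 4)  # characters 1 to 4: "WSAS"
suffix <- str_sub(value, 6, 7)  # characters 6 to 7: "01"

# Reassemble the name with the time marker in the middle
new_name <- paste0(prefix, "_r", 1, "_", suffix)

print(new_name)
# [1] "WSAS_r1_01"
```

Since the positions are fixed, column names with a longer or shorter prefix would need different indices.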

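Another way with tidyr::pivot_wider, using a more concise names_glue. Note that it hard-codes the WSAS and 01 parts of the name, so it only fits a single value column: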

library(tidyr)

df <- read.table(text = "
ID  time    WSAS_01
1   1   4
1   2   3
2   1   6
2   2   8", header = TRUE)

df %>%
  pivot_wider(names_from = time, values_from = WSAS_01, names_glue = "WSAS_r{time}_01")

#> # A tibble: 2 × 3
#>      ID WSAS_r1_01 WSAS_r2_01
#>   <int>      <int>      <int>
#> 1     1          4          3
#> 2     2          6          8