Row values to columns in R

Question

I am new to R and trying to build my first regression model, but I am struggling to transform my data.

My data is organized in the following format:

resp_id  task_id  alt  A_1  B_1   C_1   D_1   E_1
1        25       1    3    0.4   0.15  0     0
1        25       2    2    0.7   0.05  0.05  0
1        26       1    1    0.4   0     0     0
1        26       2    3    0.4   0.05  0.1   0.05

I am looking for a way to transform my data from the format above into the format below:

resp_id  task_id  alt  A_1  B_1   C_1   D_1   E_1   A_2  B_2  C_2  D_2  E_2
1        25       1    3    0.4   0.15  0     0     2    0.7  0.05 0.05 0
1        26       1    1    0.4   0     0     0     3    0.4  0.05 0.1  0.05

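Conceptually, I know I need to iterate over the rows until I reach one where the 'alt' column has the value 2. All the values in the remaining columns of that row then need to be copied into the previous row as new columns, and the row the values were copied from needs to be deleted.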

I have looked into ways of converting a dataset from long to wide format in R, but I could not get my dataset into the shape I want.

Given my lack of programming experience, could anyone help me?

Answer 1

This is a job for pivot_wider from the tidyr package:

library(tidyverse)

df %>%
  # remove the existing suffix and instead use alt to enumerate the columns
  rename_at(vars(A_1:E_1), ~gsub("_[0-9]*$", "", .)) %>%
  pivot_wider(names_from = alt, values_from = A:E)

Result:

# A tibble: 2 x 12
  resp_id task_id   A_1   A_2   B_1   B_2   C_1   C_2   D_1   D_2   E_1   E_2
    <int>   <int> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1       1      25     3     2   0.4   0.7  0.15  0.05     0  0.05     0  0   
2       1      26     1     3   0.4   0.4  0     0.05     0  0.1      0  0.05
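For anyone who wants to run this end to end, here is a minimal self-contained sketch. The data frame `df` is rebuilt by hand from the table in the question, so its column types are an assumption. It uses `rename_with` in place of `rename_at`, which dplyr has superseded, and it assumes tidyr 1.2.0 or later for the `names_vary` argument, which orders the columns as in the layout the question asks for (all `_1` columns first, then all `_2` columns). The name `wide` is only illustrative.

```r
library(dplyr)
library(tidyr)

# Example data rebuilt from the question (column types are assumed)
df <- data.frame(
  resp_id = c(1L, 1L, 1L, 1L),
  task_id = c(25L, 25L, 26L, 26L),
  alt     = c(1L, 2L, 1L, 2L),
  A_1     = c(3L, 2L, 1L, 3L),
  B_1     = c(0.4, 0.7, 0.4, 0.4),
  C_1     = c(0.15, 0.05, 0, 0.05),
  D_1     = c(0, 0.05, 0, 0.1),
  E_1     = c(0, 0, 0, 0.05)
)

wide <- df %>%
  # drop the existing "_1" suffix so that alt can supply the new suffix
  rename_with(~ sub("_1$", "", .x), .cols = A_1:E_1) %>%
  pivot_wider(
    names_from  = alt,
    values_from = A:E,
    # assumes tidyr >= 1.2.0: group the output columns by alt value
    names_vary  = "slowest"
  )

print(wide)
```

Note that `alt` no longer appears as a column afterwards, because its values have moved into the column names.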

Answer 2

I struggled with this kind of transformation some time ago too, and I kept looking for the simplest solution. In this case, I would recommend reshape:

reshape(df, direction = "wide", timevar = "alt", idvar = "task_id", sep = "")

  task_id resp_id1 A_11 B_11 C_11 D_11 E_11 resp_id2 A_12 B_12 C_12 D_12 E_12
1      25        1    3  0.4 0.15    0    0        1    2  0.7 0.05 0.05 0.00
3      26        1    1  0.4 0.00    0    0        1    3  0.4 0.05 0.10 0.05
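In the output above, resp_id is duplicated as resp_id1 and resp_id2 because only task_id is given as idvar, and the names come out as A_11, A_12 because the existing "_1" suffix is kept. A possible variation, sketched here with `df` rebuilt by hand from the question (so the column types are an assumption), passes both identifier columns as idvar and strips the old suffix first. The name `wide` is only illustrative.

```r
# Example data rebuilt from the question (column types are assumed)
df <- data.frame(
  resp_id = c(1L, 1L, 1L, 1L),
  task_id = c(25L, 25L, 26L, 26L),
  alt     = c(1L, 2L, 1L, 2L),
  A_1     = c(3L, 2L, 1L, 3L),
  B_1     = c(0.4, 0.7, 0.4, 0.4),
  C_1     = c(0.15, 0.05, 0, 0.05),
  D_1     = c(0, 0.05, 0, 0.1),
  E_1     = c(0, 0, 0, 0.05)
)

# Strip the existing "_1" suffix so reshape can append "_<alt>" itself
names(df) <- sub("_1$", "", names(df))

wide <- reshape(
  df,
  direction = "wide",
  timevar   = "alt",
  idvar     = c("resp_id", "task_id"),  # both columns identify a row
  sep       = "_"
)

print(wide)
```

This should give one row per resp_id/task_id pair, with the columns A_1 to E_1 followed by A_2 to E_2.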
