R:使用 'spread' 函数旋转

R: Pivoting using 'spread' function

继续我之前的 ,我现在多了 1 列 ID 值,我需要使用它们将行转换为列。

    NUM <- c(1,2,3,1,2,3,1,2,3,1)
    ID <- c("DJ45","DJ45","DJ45","DJ46","DJ46","DJ46","DJ47","DJ47","DJ47","DJ48")
    Type <- c("A", "F", "C", "B", "D", "A", "E", "C", "F", "D")
    Points <- c(9.2,60.8,22.9,1012.7,18.7,11.1,67.2,63.1,16.7,58.4)

    df1 <- data.frame(ID,NUM,Type,Points)

df1:
    +------+-----+------+--------+
    | ID   | Num | Type | Points |
    +------+-----+------+--------+
    | DJ45 |   1 | A    | 9.2    |
    | DJ45 |   2 | F    | 60.8   |
    | DJ45 |   3 | C    | 22.9   |
    | DJ46 |   1 | B    | 1012.7 |
    | DJ46 |   2 | D    | 18.7   |
    | DJ46 |   3 | A    | 11.1   |
    | DJ47 |   1 | E    | 67.2   |
    | DJ47 |   2 | C    | 63.1   |
    | DJ47 |   3 | F    | 16.7   |
    | DJ48 |   1 | D    | 58.4   |
    +------+-----+------+--------+

我想要的输出是

+------+-----+------+--------+------+------+------+------+
| ID   | Num |  A   |   B    |  C   |  D   |  E   |  F   |
+------+-----+------+--------+------+------+------+------+
| DJ45 |   1 | 9.2  | N/A    | N/A  | N/A  | N/A  | N/A  |
| DJ45 |   2 | N/A  | N/A    | N/A  | N/A  | N/A  | 60.8 |
| DJ45 |   3 | N/A  | N/A    | 22.9 | N/A  | N/A  | N/A  |
| DJ46 |   1 | N/A  | 1012.7 | N/A  | N/A  | N/A  | N/A  |
| DJ46 |   2 | N/A  | N/A    | N/A  | 18.7 | N/A  | N/A  |
| DJ46 |   3 | 11.1 | N/A    | N/A  | N/A  | N/A  | N/A  |
| DJ47 |   1 | N/A  | N/A    | N/A  | N/A  | 67.2 | N/A  |
| DJ47 |   2 | N/A  | N/A    | 63.1 | N/A  | N/A  | N/A  |
| DJ47 |   3 | N/A  | N/A    | N/A  | N/A  | N/A  | 16.7 |
| DJ48 |   1 | N/A  | N/A    | N/A  | 58.4 | N/A  | N/A  |
+------+-----+------+--------+------+------+------+------+

我在 R 中使用 spread 函数,但收到错误提示重复标识符。这是因为我现在有 2 列(ID 和 NUM),而不是以前的一列(NUM)。请让我知道我该怎么做。

不知道你试过什么,我建议:

spread(df1, Type, Points)
#      ID NUM    A      B    C    D    E    F
# 1  DJ45   1  9.2     NA   NA   NA   NA   NA
# 2  DJ45   2   NA     NA   NA   NA   NA 60.8
# 3  DJ45   3   NA     NA 22.9   NA   NA   NA
# 4  DJ46   1   NA 1012.7   NA   NA   NA   NA
# 5  DJ46   2   NA     NA   NA 18.7   NA   NA
# 6  DJ46   3 11.1     NA   NA   NA   NA   NA
# 7  DJ47   1   NA     NA   NA   NA 67.2   NA
# 8  DJ47   2   NA     NA 63.1   NA   NA   NA
# 9  DJ47   3   NA     NA   NA   NA   NA 16.7
# 10 DJ48   1   NA     NA   NA 58.4   NA   NA

如果您收到关于重复标识符的错误,那是因为您的实际数据中 "ID" 和 "Num" 的组合有一个或多个重复条目(在您的样本数据中,它们不't).

如果是这种情况,您需要添加另一列以使其唯一。

dplyr添加到链中,它可能是这样的:

df1 %>%
  group_by(ID, NUM) %>%
  mutate(id2 = sequence(n())) %>%
  spread(Type, Points)

的演示假定错误:

df2 <- rbind(df1, df1[1:3, ]) ## Duplicate the first three rows
spread(df2, Type, Points)
# Error: Duplicate identifiers for rows (1, 11), (3, 13), (2, 12)    

library(dplyr)

df2 %>%
  group_by(ID, NUM) %>%
  mutate(id2 = sequence(n())) %>%
  spread(Type, Points)
# Source: local data frame [13 x 9]
# 
#      ID NUM id2    A      B    C    D    E    F
# 1  DJ45   1   1  9.2     NA   NA   NA   NA   NA
# 2  DJ45   1   2  9.2     NA   NA   NA   NA   NA
# 3  DJ45   2   1   NA     NA   NA   NA   NA 60.8
# 4  DJ45   2   2   NA     NA   NA   NA   NA 60.8
# 5  DJ45   3   1   NA     NA 22.9   NA   NA   NA
# 6  DJ45   3   2   NA     NA 22.9   NA   NA   NA
# 7  DJ46   1   1   NA 1012.7   NA   NA   NA   NA
# 8  DJ46   2   1   NA     NA   NA 18.7   NA   NA
# 9  DJ46   3   1 11.1     NA   NA   NA   NA   NA
# 10 DJ47   1   1   NA     NA   NA   NA 67.2   NA
# 11 DJ47   2   1   NA     NA 63.1   NA   NA   NA
# 12 DJ47   3   1   NA     NA   NA   NA   NA 16.7
# 13 DJ48   1   1   NA     NA   NA 58.4   NA   NA