Casting from Long to Wide format, with each repeat creating a new row

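I have a data frame in long format that I want to convert to wide format. The data frame contains several repeated identifiers; I want to treat each repeat as a unique instance and represent it as a separate row in the wide data frame.

My question is similar to this one: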

Forcing unique values before casting (pivoting) in R

In that question, however, the unique entries end up as separate columns. For my problem, I want the data placed in separate rows. For example:

ID1 <- c("A","A","A","A","A","B","B","B","B","B","C","C","C","C","C")
ID2 <- c("R","R","R","L","L","R","R","L","L","R","R","L","L","R","R")
Sp <- c("Bird","Cat","Bird","Bird","Dog","Dog","Dog","Cat","Cat","Bird","Cat","Dog","Bird","Bird","Cat")
Count <- c(1,2,2,1,2,1,2,3,2,1,2,3,2,1,5)
DF <- data.frame(ID1, ID2, Sp, Count)

After casting the data to wide format, I would like the output to look like this:

ID1    ID2    Bird  Cat  Dog
A      R       1     2    0
A      R       2     0    0 # 2 Birds in the A/ R combination so need second row (don't want to add them together)
A      L       1     0    2
B      R       1     0    1
B      R       0     0    2
B      L       0     3    0
B      L       0     2    0
C      R       1     2    0
C      R       0     5    0
C      L       2     0    3

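If a unique ID1/ID2 combination has no repeats, the cast proceeds as normal. If there are repeats, a second (or third, or fourth) row is created.

You can create an auxiliary ID column within each ID1/ID2/Sp group, then use ID1, ID2 and AUXID together as the id columns: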

library(dplyr)
DF = DF %>% group_by(ID1, ID2, Sp) %>% mutate(AUXID = row_number()) %>% as.data.frame()
reshape(DF, idvar = c("ID1", "ID2", "AUXID"), timevar = "Sp", direction = "wide")

#    ID1 ID2 AUXID Count.Bird Count.Cat Count.Dog
# 1    A   R     1          1         2        NA
# 3    A   R     2          2        NA        NA
# 4    A   L     1          1        NA         2
# 6    B   R     1          1        NA         1
# 7    B   R     2         NA        NA         2
# 8    B   L     1         NA         3        NA
# 9    B   L     2         NA         2        NA
# 11   C   R     1          1         2        NA
# 12   C   L     1          2        NA         3
# 15   C   R     2         NA         5        NA

You can then drop the AUXID column and replace the NAs with 0.
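That cleanup step could look something like the sketch below. It is written in base R only, so the helper ID is built with ave() rather than dplyr's row_number(); that substitution, the variable name wide, and the removal of the "Count." prefix are assumptions made for illustration, not part of the answer above.

```r
# Rebuild the example data from the question
DF <- data.frame(
  ID1   = rep(c("A", "B", "C"), each = 5),
  ID2   = c("R","R","R","L","L","R","R","L","L","R","R","L","L","R","R"),
  Sp    = c("Bird","Cat","Bird","Bird","Dog","Dog","Dog","Cat","Cat","Bird",
            "Cat","Dog","Bird","Bird","Cat"),
  Count = c(1,2,2,1,2,1,2,3,2,1,2,3,2,1,5)
)

# Number the repeats within each ID1/ID2/Sp group (base R stand-in for row_number())
DF$AUXID <- ave(seq_len(nrow(DF)), DF$ID1, DF$ID2, DF$Sp, FUN = seq_along)

# Cast to wide; AUXID keeps repeated species on separate rows
wide <- reshape(DF, idvar = c("ID1", "ID2", "AUXID"),
                timevar = "Sp", direction = "wide")

wide$AUXID <- NULL                                 # drop the helper column
wide[is.na(wide)] <- 0                             # missing species become 0
names(wide) <- sub("^Count\\.", "", names(wide))   # Count.Bird -> Bird, etc.
rownames(wide) <- NULL                             # tidy up the row names

wide
```

The row order follows the first appearance of each ID1/ID2/AUXID combination in the long data, so it may differ from the order in the desired output.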

Here is a data.table version using dcast(), which provides a fill parameter for filling in the NA values:

library(data.table)
(dcast(setDT(DF)[, AUXID := 1:.N, .(ID1, ID2, Sp)], 
      ID1 + ID2 + AUXID ~ Sp, value.var = "Count", fill = 0)
      [, AUXID := NULL][])
#    ID1 ID2 Bird Cat Dog
# 1:   A   L    1   0   2
# 2:   A   R    1   2   0
# 3:   A   R    2   0   0
# 4:   B   L    0   3   0
# 5:   B   L    0   2   0
# 6:   B   R    1   0   1
# 7:   B   R    0   0   2
# 8:   C   L    2   0   3
# 9:   C   R    1   2   0
#10:   C   R    0   5   0
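For comparison, the same helper-ID idea can probably be expressed with tidyr's pivot_wider(), whose values_fill argument plays the role of fill in dcast(). This is only a sketch and not part of either answer above: it assumes dplyr and a reasonably recent tidyr (1.1.0 or later, where values_fill accepts a single value), and the name wide is illustrative.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question
DF <- data.frame(
  ID1   = rep(c("A", "B", "C"), each = 5),
  ID2   = c("R","R","R","L","L","R","R","L","L","R","R","L","L","R","R"),
  Sp    = c("Bird","Cat","Bird","Bird","Dog","Dog","Dog","Cat","Cat","Bird",
            "Cat","Dog","Bird","Bird","Cat"),
  Count = c(1,2,2,1,2,1,2,3,2,1,2,3,2,1,5)
)

wide <- DF %>%
  group_by(ID1, ID2, Sp) %>%
  mutate(AUXID = row_number()) %>%   # number repeats within each group
  ungroup() %>%
  # ID1, ID2 and AUXID are the remaining columns, so they identify each output row
  pivot_wider(names_from = Sp, values_from = Count, values_fill = 0) %>%
  select(-AUXID)                     # drop the helper column

wide
```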