How to specify ID variables in dcast?

Here is a sample of my data:

datahave
# A tibble: 6 x 6
   YEAR SCHOOL_NAME              CONTENT_AREA       BELOW_BASIC_PCT    BASIC_PCT          ADVANCED_PCT
  <dbl> <chr>                    <chr>              <chr>              <chr>              <chr>       
1  2015 5TH AND 6TH GRADE CTR.   Eng. Language Arts 38.1               28.3               10.1        
2  2015 5TH AND 6TH GRADE CTR.   Mathematics        39                 30.3               14.6        
3  2015 5TH AND 6TH GRADE CTR.   Science            25.4               41.7               12.3        
4  2015 6TH GRADE CENTER         Eng. Language Arts 7.6                27.8               21.8        
5  2015 6TH GRADE CENTER         Mathematics        19.100000000000001 37.700000000000003 17.5        
6  2015 7th and 8th Grade Center Eng. Language Arts 52.1               27.4               1.7     

Here is a reproducible example with a similar structure:

school<-c("A","A",'A','B','B','B')
content_area<-c('english','math','science','english','math','science')
below_basic<-c(20,30,40,10,15,20)
advanced<-c(2,5,3,1,2.5,1.5)


df<-data.frame(school,content_area,below_basic,advanced)
df

I run the following code on it:

library(reshape2)
dcast(melt(df), school ~ content_area + variable)

This gives me the output I want, because melt reports "Using school, content_area as id variables".

However, when I run the same code on the original dataset

dcast(melt(datahave), SCHOOL_NAME ~ CONTENT_AREA + variable)

melt instead reports "Using SCHOOL_NAME, CONTENT_AREA, BELOW_BASIC_PCT, BASIC_PCT, ADVANCED_PCT as id variables".

How can I specify which columns are used as ID variables, so that I get output similar to the reproducible example?

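We can specify id.var in melt. Otherwise melt picks the id variables automatically by column type: character and factor columns are treated as ids and everything else as measured. That is why the <chr> percentage columns in datahave ended up as id variables.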

library(reshape2)
dcast(melt(datahave, id.var = c("YEAR", "SCHOOL_NAME", "CONTENT_AREA")),
      SCHOOL_NAME ~ CONTENT_AREA + variable)
#           SCHOOL_NAME Eng. Language Arts_BELOW_BASIC_PCT Eng. Language Arts_BASIC_PCT
#1   5TH AND 6TH GRADE CTR.                               38.1                         28.3
#2         6TH GRADE CENTER                                7.6                         27.8
#3 7th and 8th Grade Center                               52.1                         27.4
#  Eng. Language Arts_ADVANCED_PCT Mathematics_BELOW_BASIC_PCT Mathematics_BASIC_PCT Mathematics_ADVANCED_PCT
#1                            10.1                        39.0                  30.3                     14.6
#2                            21.8                        19.1                  37.7                     17.5
#3                             1.7                          NA                    NA                       NA
#  Science_BELOW_BASIC_PCT Science_BASIC_PCT Science_ADVANCED_PCT
#1                    25.4              41.7                 12.3
#2                      NA                NA                   NA
#3                      NA                NA                   NA
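
Because the tibble printed in the question stores the percentages as <chr>, it may help to convert them to numeric before melting, so the cast columns come out as numbers rather than strings. A minimal sketch, assuming a hypothetical toy data frame chr_df (not the asker's data) whose measure columns are character:

```r
library(reshape2)

# Hypothetical toy data: percentages stored as character, as in the question's tibble
chr_df <- data.frame(
  school       = c("A", "A", "B", "B"),
  content_area = c("english", "math", "english", "math"),
  below_basic  = c("20", "30", "10", "15"),
  advanced     = c("2", "5", "1", "2.5"),
  stringsAsFactors = FALSE
)

# Convert the measure columns to numeric so they are no longer character
num_cols <- c("below_basic", "advanced")
chr_df[num_cols] <- lapply(chr_df[num_cols], as.numeric)

# Name the id variables explicitly, then cast to wide format
long <- melt(chr_df, id.vars = c("school", "content_area"))
wide <- dcast(long, school ~ content_area + variable)
wide
```

With the id variables named explicitly, the result no longer depends on how each column happens to be typed.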

recast, a wrapper around melt and dcast, can also be used:

recast(datahave, id.var = c("YEAR", "SCHOOL_NAME", "CONTENT_AREA"), 
       SCHOOL_NAME ~ CONTENT_AREA + variable)
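
The same result can be approached from the other direction: melt also takes measure.vars, and when only one of id.vars and measure.vars is supplied, the remaining columns are assigned to the other role. A small sketch on the toy data from the question's reproducible example (long and wide are just illustrative names):

```r
library(reshape2)

# Toy data matching the reproducible example in the question
df <- data.frame(
  school       = c("A", "A", "A", "B", "B", "B"),
  content_area = c("english", "math", "science", "english", "math", "science"),
  below_basic  = c(20, 30, 40, 10, 15, 20),
  advanced     = c(2, 5, 3, 1, 2.5, 1.5)
)

# Name the measured columns; the remaining columns become id variables
long <- melt(df, measure.vars = c("below_basic", "advanced"))

# One row per school, one column per content_area/variable pair
wide <- dcast(long, school ~ content_area + variable)
wide
```

This can be convenient when the measured columns are fewer or easier to list than the id columns.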

Data

datahave <- structure(list(YEAR = c(2015L, 2015L, 2015L, 2015L, 2015L, 2015L
), SCHOOL_NAME = c("5TH AND 6TH GRADE CTR.", "5TH AND 6TH GRADE CTR.", 
"5TH AND 6TH GRADE CTR.", "6TH GRADE CENTER", "6TH GRADE CENTER", 
"7th and 8th Grade Center"), CONTENT_AREA = c("Eng. Language Arts", 
"Mathematics", "Science", "Eng. Language Arts", "Mathematics", 
"Eng. Language Arts"), BELOW_BASIC_PCT = c(38.1, 39, 25.4, 7.6, 
19.1, 52.1), BASIC_PCT = c(28.3, 30.3, 41.7, 27.8, 37.7, 27.4
), ADVANCED_PCT = c(10.1, 14.6, 12.3, 21.8, 17.5, 1.7)), 
class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6"))