How to specify ID variables in dcast?
Here is a sample of my data:
datahave
# A tibble: 6 x 6
YEAR SCHOOL_NAME CONTENT_AREA BELOW_BASIC_PCT BASIC_PCT ADVANCED_PCT
<dbl> <chr> <chr> <chr> <chr> <chr>
1 2015 5TH AND 6TH GRADE CTR. Eng. Language Arts 38.1 28.3 10.1
2 2015 5TH AND 6TH GRADE CTR. Mathematics 39 30.3 14.6
3 2015 5TH AND 6TH GRADE CTR. Science 25.4 41.7 12.3
4 2015 6TH GRADE CENTER Eng. Language Arts 7.6 27.8 21.8
5 2015 6TH GRADE CENTER Mathematics 19.100000000000001 37.700000000000003 17.5
6 2015 7th and 8th Grade Center Eng. Language Arts 52.1 27.4 1.7
Here is a reproducible example with a similar structure:
school<-c("A","A",'A','B','B','B')
content_area<-c('english','math','science','english','math','science')
below_basic<-c(20,30,40,10,15,20)
advanced<-c(2,5,3,1,2.5,1.5)
df<-data.frame(school,content_area,below_basic,advanced)
df
Running the following code on it:
library(reshape2)
dcast(melt(df), school ~ content_area + variable)
gives me the output I want, because it reports: Using school, content_area as id variables
However, when I run the same code on the original dataset:
dcast(melt(datahave), SCHOOL_NAME ~ CONTENT_AREA + variable)
it actually reports: Using SCHOOL_NAME, CONTENT_AREA, BELOW_BASIC_PCT, BASIC_PCT, ADVANCED_PCT as id variables
How can I specify which columns are used as ID variables, so that I get output similar to the reproducible example?
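We can specify id.var in melt; otherwise, it picks the id variables automatically based on column type. In the original tibble the percentage columns are character (<chr>), so melt treats them as id variables rather than as measured values.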
library(reshape2)
dcast(melt(datahave, id.var = c("YEAR", "SCHOOL_NAME", "CONTENT_AREA")),
SCHOOL_NAME ~ CONTENT_AREA + variable)
# SCHOOL_NAME Eng. Language Arts_BELOW_BASIC_PCT Eng. Language Arts_BASIC_PCT
#1 5TH AND 6TH GRADE CTR. 38.1 28.3
#2 6TH GRADE CENTER 7.6 27.8
#3 7th and 8th Grade Center 52.1 27.4
# Eng. Language Arts_ADVANCED_PCT Mathematics_BELOW_BASIC_PCT Mathematics_BASIC_PCT Mathematics_ADVANCED_PCT
#1 10.1 39.0 30.3 14.6
#2 21.8 19.1 37.7 17.5
#3 1.7 NA NA NA
# Science_BELOW_BASIC_PCT Science_BASIC_PCT Science_ADVANCED_PCT
#1 25.4 41.7 12.3
#2 NA NA NA
#3 NA NA NA
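As a small sketch of why the tibble in the question behaves differently from the reproducible example: its percentage columns are stored as character, and melt only guesses numeric columns as measures. The toy data frame and the names toy and pct_cols below are made up for illustration; converting with as.numeric is one possible approach, not necessarily what the original data needs.

```r
library(reshape2)

# Hypothetical toy data mirroring the question: percentages stored as character
toy <- data.frame(
  SCHOOL_NAME     = c("A", "A", "B"),
  CONTENT_AREA    = c("english", "math", "english"),
  BELOW_BASIC_PCT = c("38.1", "39", "7.6"),
  ADVANCED_PCT    = c("10.1", "14.6", "21.8"),
  stringsAsFactors = FALSE
)

# Inspect the column types; every column here is character
sapply(toy, class)

# Convert the percentage columns to numeric so the cast values are numbers
pct_cols <- c("BELOW_BASIC_PCT", "ADVANCED_PCT")
toy[pct_cols] <- lapply(toy[pct_cols], as.numeric)

# Name the id columns explicitly instead of relying on melt's guess
long <- melt(toy, id.vars = c("SCHOOL_NAME", "CONTENT_AREA"))
wide <- dcast(long, SCHOOL_NAME ~ CONTENT_AREA + variable)
wide
```

With the ids named explicitly, melt no longer prints the "Using ... as id variables" message, and school/subject combinations that are missing from the data show up as NA in the wide result.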
The melt/dcast wrapper recast can also be used:
recast(datahave, id.var = c("YEAR", "SCHOOL_NAME", "CONTENT_AREA"),
SCHOOL_NAME ~ CONTENT_AREA + variable)
Data
datahave <- structure(list(YEAR = c(2015L, 2015L, 2015L, 2015L, 2015L, 2015L
), SCHOOL_NAME = c("5TH AND 6TH GRADE CTR.", "5TH AND 6TH GRADE CTR.",
"5TH AND 6TH GRADE CTR.", "6TH GRADE CENTER", "6TH GRADE CENTER",
"7th and 8th Grade Center"), CONTENT_AREA = c("Eng. Language Arts",
"Mathematics", "Science", "Eng. Language Arts", "Mathematics",
"Eng. Language Arts"), BELOW_BASIC_PCT = c(38.1, 39, 25.4, 7.6,
19.1, 52.1), BASIC_PCT = c(28.3, 30.3, 41.7, 27.8, 37.7, 27.4
), ADVANCED_PCT = c(10.1, 14.6, 12.3, 21.8, 17.5, 1.7)),
class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))