从嵌套的摘要列表(aov())中提取值到数据框中
Extract values from nested list of summary(aov()) into a dataframe
我正在 运行 在单个数据框中跨多个组进行简单的单向方差分析。
此处提供数据框:https://www.dropbox.com/s/6nsjk4l1pgiwal3/cut1.csv?dl=0
>download.file('https://www.dropbox.com/s/6nsjk4l1pgiwal3/cut1.csv?raw=1', destfile = "cut1.csv", method = "auto")
> data <- read.csv("cut1.csv")
> cut1 <- data %>% mutate(Plot = as.factor(Plot), Block = as.factor(Block), Cut = as.factor(Cut))
> str(cut1)
'data.frame': 160 obs. of 6 variables:
$ Plot : Factor w/ 16 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
$ Block : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 2 2 2 2 3 3 ...
$ Treatment : Factor w/ 4 levels "AN","C","IU",..: 4 2 3 1 1 3 4 2 3 1 ...
$ Cut : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
$ Measurement: Factor w/ 10 levels "ADF","Ash","Crude_Protein",..: 5 5 5 5 5 5 5 5 5 5 ...
$ Value : num 956 965 961 963 955 ...
我使用了 this SO question 中的一些代码来使 aov 函数应用于 Measurement
因子的每个级别:
anova_1<- sapply(unique(as.character(cut1$Measurement)),
function(meas)aov(Value~Treatment+Block,cut1,subset=(Measurement==meas)),
simplify=FALSE,USE.NAMES=TRUE)
summary_1 <- lapply(anova_1, summary)
我可以通过 summary_1
手动查看,但理想情况下我想做的是将 Measurement
因子的每个级别的 p 值提取到一个数据框中,然后我可以对其进行过滤,以便我只看哪些<0.5。然后我会 运行 TukeyHSD
这些。
summary_1
看起来像这样(仅显示前 2 个列表):
> str(summary_1)
List of 10
$ Dry_matter :List of 1
..$ :Classes ‘anova’ and 'data.frame': 3 obs. of 5 variables:
.. ..$ Df : num [1:3] 3 3 9
.. ..$ Sum Sq : num [1:3] 359 167 612
.. ..$ Mean Sq: num [1:3] 119.8 55.5 68
.. ..$ F value: num [1:3] 1.761 0.816 NA
.. ..$ Pr(>F) : num [1:3] 0.224 0.517 NA
..- attr(*, "class")= chr [1:2] "summary.aov" "listof"
$ Crude_Protein:List of 1
..$ :Classes ‘anova’ and 'data.frame': 3 obs. of 5 variables:
.. ..$ Df : num [1:3] 3 3 9
.. ..$ Sum Sq : num [1:3] 306 721 1606
.. ..$ Mean Sq: num [1:3] 102 240 178
.. ..$ F value: num [1:3] 0.572 1.347 NA
.. ..$ Pr(>F) : num [1:3] 0.647 0.319 NA
..- attr(*, "class")= chr [1:2] "summary.aov" "listof"
我可以像这样从 summary_1
中的一个列表中提取 p 值:
> summary_1$OAH[[1]][,5][1]
[1] 0.4734992
但是,我不知道如何从所有嵌套列表中提取并放入数据框中。
非常感谢您的帮助。
您可以将包 broom
与 dplyr
结合使用,通过 Measurement
应用 Anova
,并将输出分配给 data.frame
整齐的格式。
library(broom)
library(dplyr)
summaries <- cut1 %>% group_by(Measurement) %>%
do(tidy(aov(Value ~ Treatment + Block, data = .)))
head(summaries)
# Measurement term df sumsq meansq statistic p.value
# (fctr) (chr) (dbl) (dbl) (dbl) (dbl) (dbl)
#1 ADF Treatment 3 41.416875 13.805625 3.097871 0.07138437
#2 ADF Block 1 8.001125 8.001125 1.795388 0.20729351
#3 ADF Residuals 11 49.021375 4.456489 NA NA
#4 Ash Treatment 3 38.511875 12.837292 1.051787 0.40840601
#5 Ash Block 1 34.980125 34.980125 2.865998 0.11856463
#6 Ash Residuals 11 134.257375 12.205216 NA NA
这是香草 R 中的解决方案:
# you can shorten your example -- download.file not necessary
cut1 <- read.csv('https://www.dropbox.com/s/6nsjk4l1pgiwal3/cut1.csv?raw=1') %>%
mutate(Plot = as.factor(Plot), Block = as.factor(Block), Cut = as.factor(Cut))
# split-apply-combine strategy
do.call(rbind, lapply(split(cut1,cut1$Measurement),
function(x) with(x, summary(aov(Value ~ Treatment + Block)))[[1]]
)
)
returns:
Df Sum Sq Mean Sq F value Pr(>F)
ADF.Treatment 3 41.42 13.81 6.7088 0.01133 *
ADF.Block 3 38.50 12.83 6.2366 0.01405 *
ADF.Residuals 9 18.52 2.06
Ash.Treatment 3 38.51 12.84 0.9162 0.47115
Ash.Block 3 43.13 14.38 1.0261 0.42602
Ash.Residuals 9 126.11 14.01
Crude_Protein.Treatment 3 306.42 102.14 0.5723 0.64733
Crude_Protein.Block 3 721.42 240.47 1.3473 0.31946
Crude_Protein.Residuals 9 1606.39 178.49
D.Treatment 3 9.47 3.16 4.5530 0.03331 *
D.Block 3 7.57 2.52 3.6383 0.05751 .
D.Residuals 9 6.24 0.69
Dry_matter.Treatment 3 359.39 119.80 1.7609 0.22432
Dry_matter.Block 3 166.62 55.54 0.8164 0.51656
Dry_matter.Residuals 9 612.27 68.03
ME.Treatment 3 0.24 0.08 4.5530 0.03331 *
ME.Block 3 0.19 0.06 3.6383 0.05751 .
ME.Residuals 9 0.16 0.02
NCGD.Treatment 3 2777.57 925.86 4.5530 0.03331 *
NCGD.Block 3 2219.55 739.85 3.6383 0.05751 .
NCGD.Residuals 9 1830.17 203.35
NDF.Treatment 3 355.91 118.64 6.8809 0.01050 *
NDF.Block 3 336.70 112.23 6.5095 0.01239 *
NDF.Residuals 9 155.17 17.24
OAH.Treatment 3 1.41 0.47 0.9108 0.47350
OAH.Block 3 1.37 0.46 0.8850 0.48488
OAH.Residuals 9 4.65 0.52
Sugar.Treatment 3 86.18 28.73 5.0212 0.02577 *
Sugar.Block 3 51.64 17.21 3.0085 0.08720 .
Sugar.Residuals 9 51.49 5.72
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
我正在 运行 在单个数据框中跨多个组进行简单的单向方差分析。
此处提供数据框:https://www.dropbox.com/s/6nsjk4l1pgiwal3/cut1.csv?dl=0
>download.file('https://www.dropbox.com/s/6nsjk4l1pgiwal3/cut1.csv?raw=1', destfile = "cut1.csv", method = "auto")
> data <- read.csv("cut1.csv")
> cut1 <- data %>% mutate(Plot = as.factor(Plot), Block = as.factor(Block), Cut = as.factor(Cut))
> str(cut1)
'data.frame': 160 obs. of 6 variables:
$ Plot : Factor w/ 16 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
$ Block : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 2 2 2 2 3 3 ...
$ Treatment : Factor w/ 4 levels "AN","C","IU",..: 4 2 3 1 1 3 4 2 3 1 ...
$ Cut : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
$ Measurement: Factor w/ 10 levels "ADF","Ash","Crude_Protein",..: 5 5 5 5 5 5 5 5 5 5 ...
$ Value : num 956 965 961 963 955 ...
我使用了 this SO question 中的一些代码来使 aov 函数应用于 Measurement
因子的每个级别:
anova_1<- sapply(unique(as.character(cut1$Measurement)),
function(meas)aov(Value~Treatment+Block,cut1,subset=(Measurement==meas)),
simplify=FALSE,USE.NAMES=TRUE)
summary_1 <- lapply(anova_1, summary)
我可以通过 summary_1
手动查看,但理想情况下我想做的是将 Measurement
因子的每个级别的 p 值提取到一个数据框中,然后我可以对其进行过滤,以便我只看哪些<0.5。然后我会 运行 TukeyHSD
这些。
summary_1
看起来像这样(仅显示前 2 个列表):
> str(summary_1)
List of 10
$ Dry_matter :List of 1
..$ :Classes ‘anova’ and 'data.frame': 3 obs. of 5 variables:
.. ..$ Df : num [1:3] 3 3 9
.. ..$ Sum Sq : num [1:3] 359 167 612
.. ..$ Mean Sq: num [1:3] 119.8 55.5 68
.. ..$ F value: num [1:3] 1.761 0.816 NA
.. ..$ Pr(>F) : num [1:3] 0.224 0.517 NA
..- attr(*, "class")= chr [1:2] "summary.aov" "listof"
$ Crude_Protein:List of 1
..$ :Classes ‘anova’ and 'data.frame': 3 obs. of 5 variables:
.. ..$ Df : num [1:3] 3 3 9
.. ..$ Sum Sq : num [1:3] 306 721 1606
.. ..$ Mean Sq: num [1:3] 102 240 178
.. ..$ F value: num [1:3] 0.572 1.347 NA
.. ..$ Pr(>F) : num [1:3] 0.647 0.319 NA
..- attr(*, "class")= chr [1:2] "summary.aov" "listof"
我可以像这样从 summary_1
中的一个列表中提取 p 值:
> summary_1$OAH[[1]][,5][1]
[1] 0.4734992
但是,我不知道如何从所有嵌套列表中提取并放入数据框中。
非常感谢您的帮助。
您可以将包 broom
与 dplyr
结合使用,通过 Measurement
应用 Anova
,并将输出分配给 data.frame
整齐的格式。
library(broom)
library(dplyr)
summaries <- cut1 %>% group_by(Measurement) %>%
do(tidy(aov(Value ~ Treatment + Block, data = .)))
head(summaries)
# Measurement term df sumsq meansq statistic p.value
# (fctr) (chr) (dbl) (dbl) (dbl) (dbl) (dbl)
#1 ADF Treatment 3 41.416875 13.805625 3.097871 0.07138437
#2 ADF Block 1 8.001125 8.001125 1.795388 0.20729351
#3 ADF Residuals 11 49.021375 4.456489 NA NA
#4 Ash Treatment 3 38.511875 12.837292 1.051787 0.40840601
#5 Ash Block 1 34.980125 34.980125 2.865998 0.11856463
#6 Ash Residuals 11 134.257375 12.205216 NA NA
这是香草 R 中的解决方案:
# you can shorten your example -- download.file not necessary
cut1 <- read.csv('https://www.dropbox.com/s/6nsjk4l1pgiwal3/cut1.csv?raw=1') %>%
mutate(Plot = as.factor(Plot), Block = as.factor(Block), Cut = as.factor(Cut))
# split-apply-combine strategy
do.call(rbind, lapply(split(cut1,cut1$Measurement),
function(x) with(x, summary(aov(Value ~ Treatment + Block)))[[1]]
)
)
returns:
Df Sum Sq Mean Sq F value Pr(>F)
ADF.Treatment 3 41.42 13.81 6.7088 0.01133 *
ADF.Block 3 38.50 12.83 6.2366 0.01405 *
ADF.Residuals 9 18.52 2.06
Ash.Treatment 3 38.51 12.84 0.9162 0.47115
Ash.Block 3 43.13 14.38 1.0261 0.42602
Ash.Residuals 9 126.11 14.01
Crude_Protein.Treatment 3 306.42 102.14 0.5723 0.64733
Crude_Protein.Block 3 721.42 240.47 1.3473 0.31946
Crude_Protein.Residuals 9 1606.39 178.49
D.Treatment 3 9.47 3.16 4.5530 0.03331 *
D.Block 3 7.57 2.52 3.6383 0.05751 .
D.Residuals 9 6.24 0.69
Dry_matter.Treatment 3 359.39 119.80 1.7609 0.22432
Dry_matter.Block 3 166.62 55.54 0.8164 0.51656
Dry_matter.Residuals 9 612.27 68.03
ME.Treatment 3 0.24 0.08 4.5530 0.03331 *
ME.Block 3 0.19 0.06 3.6383 0.05751 .
ME.Residuals 9 0.16 0.02
NCGD.Treatment 3 2777.57 925.86 4.5530 0.03331 *
NCGD.Block 3 2219.55 739.85 3.6383 0.05751 .
NCGD.Residuals 9 1830.17 203.35
NDF.Treatment 3 355.91 118.64 6.8809 0.01050 *
NDF.Block 3 336.70 112.23 6.5095 0.01239 *
NDF.Residuals 9 155.17 17.24
OAH.Treatment 3 1.41 0.47 0.9108 0.47350
OAH.Block 3 1.37 0.46 0.8850 0.48488
OAH.Residuals 9 4.65 0.52
Sugar.Treatment 3 86.18 28.73 5.0212 0.02577 *
Sugar.Block 3 51.64 17.21 3.0085 0.08720 .
Sugar.Residuals 9 51.49 5.72
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1