R,如何在数据帧循环中做方差分析?
R, how to do anova in a loop over dataframe?
我想做 2ways anova,并存储 p 值,而不是 tukey hsd,但我对初始 table 有疑问。我并不总是拥有完整的数据,因此并非总是可以执行方差分析,我不知道如何执行此操作以便我的脚本运行,而不是跳过不完整的数据并进一步运行。我的数据如下所示:
https://filebin.net/w5cfuwztae7yk747
在link中有两个种质的例子,但在实际数据中有 3013 个种质,其中一些不具备所有光照条件或所有基因型
67822 AT2G41680 f HL_f_Dejan58 1.240108e+06 HL AT2G41680 f
70136 AT2G41680 f HL_f_Dejan_61 3.384010e+06 HL AT2G41680 f
72450 AT2G41680 ntrc HL_ntrc_ Dejan_62 1.410768e+05 HL AT2G41680 ntrc
74764 AT2G41680 ntrc HL_ntrc_Dejan_66 5.642197e+00 HL AT2G41680 ntrc
77078 AT2G41680 ntrc HL_ntrc_Dejan65 3.921952e+05 HL AT2G41680 ntrc
78997 AT2G41680 WT LL_WT_Dejan_41 1.016001e+07 LL AT2G41680 WT
81433 AT2G41680 WT LL_WT_Dejan_43 9.320892e+06 LL AT2G41680 WT
83869 AT2G41680 WT LL_WT_Dejan_49 8.560308e+06 LL AT2G41680 WT
有 4 种基因型和四种光照条件
我正在尝试做这样的事情:
AOV<- data.frame()
IDs<- unique(Dejan_all_new_norm$Accession)
for (i in 1 : length(IDs)){
temp<-Dejan_all_new_norm[(Dejan_all_new_norm$Accession)==IDs[i],]
aov2<-aov(value ~ genotype + Light + genotype:Light, data = temp)
AOV <- rbind(as.character(unique(IDs[i])),aov2,AOV)
}
所以我想对每个基因(加入)进行子集化,而不是做方差分析,但在此之后我想做 tukey 有这样的东西:
$`genotype:Light`
diff lwr upr p adj
m:FL-f:FL -7324259.81 -16715470 2066950.5 0.3486778
ntrc:FL-f:FL 1662873.54 -7728337 11054083.9 0.9999998
WT:FL-f:FL -5219263.59 -13913835 3475307.7 0.7927417
f:HL-f:FL -4936680.12 -13871535 3998174.3 0.8796738
m:HL-f:FL -7389937.49 -16324792 1544916.9 0.2496858
ntrc:HL-f:FL -7122962.46 -16057817 1811891.9 0.3102106
我想在这个简单的循环上工作,这是我的示例,因为这似乎是最简单的方法。
我将不胜感激任何帮助!
这是您要找的吗:
library(tidyverse)
library(broom)
read_csv(file = "https://filebin.net/w5cfuwztae7yk747/two.csv") %>%
group_by(Accession) %>%
do(broom::tidy(TukeyHSD(aov(value ~ genotype + Light + genotype:Light, data = .)))) %>%
ungroup
输出:
# A tibble: 264 x 7
Accession term comparison estimate conf.low conf.high adj.p.value
<chr> <fctr> <chr> <dbl> <dbl> <dbl> <dbl>
1 AT2G41680 genotype m-f -1586182.59 -3616647.7 444282.5 1.708496e-01
2 AT2G41680 genotype ntrc-f -5705550.95 -7694992.3 -3716109.6 2.609223e-08
3 AT2G41680 genotype WT-f -1568375.95 -3557817.3 421065.4 1.647950e-01
4 AT2G41680 genotype ntrc-m -4119368.37 -6149833.5 -2088903.3 2.214399e-05
5 AT2G41680 genotype WT-m 17806.64 -2012658.5 2048271.8 9.999951e-01
6 AT2G41680 genotype WT-ntrc 4137175.00 2147733.6 6126616.4 1.464605e-05
7 AT2G41680 Light HL-FL -3854435.85 -5849789.4 -1859082.3 4.872013e-05
8 AT2G41680 Light LL-FL 1528123.46 -467230.1 3523477.0 1.844033e-01
9 AT2G41680 Light ML-FL -2821752.94 -4775345.6 -868160.3 2.283331e-03
10 AT2G41680 Light LL-HL 5382559.31 3311883.1 7453235.6 2.176770e-07
# ... with 254 more rows
我想做 2ways anova,并存储 p 值,而不是 tukey hsd,但我对初始 table 有疑问。我并不总是拥有完整的数据,因此并非总是可以执行方差分析,我不知道如何执行此操作以便我的脚本运行,而不是跳过不完整的数据并进一步运行。我的数据如下所示:
https://filebin.net/w5cfuwztae7yk747
在link中有两个种质的例子,但在实际数据中有 3013 个种质,其中一些不具备所有光照条件或所有基因型
67822 AT2G41680 f HL_f_Dejan58 1.240108e+06 HL AT2G41680 f
70136 AT2G41680 f HL_f_Dejan_61 3.384010e+06 HL AT2G41680 f
72450 AT2G41680 ntrc HL_ntrc_ Dejan_62 1.410768e+05 HL AT2G41680 ntrc
74764 AT2G41680 ntrc HL_ntrc_Dejan_66 5.642197e+00 HL AT2G41680 ntrc
77078 AT2G41680 ntrc HL_ntrc_Dejan65 3.921952e+05 HL AT2G41680 ntrc
78997 AT2G41680 WT LL_WT_Dejan_41 1.016001e+07 LL AT2G41680 WT
81433 AT2G41680 WT LL_WT_Dejan_43 9.320892e+06 LL AT2G41680 WT
83869 AT2G41680 WT LL_WT_Dejan_49 8.560308e+06 LL AT2G41680 WT
有 4 种基因型和四种光照条件 我正在尝试做这样的事情:
AOV<- data.frame()
IDs<- unique(Dejan_all_new_norm$Accession)
for (i in 1 : length(IDs)){
temp<-Dejan_all_new_norm[(Dejan_all_new_norm$Accession)==IDs[i],]
aov2<-aov(value ~ genotype + Light + genotype:Light, data = temp)
AOV <- rbind(as.character(unique(IDs[i])),aov2,AOV)
}
所以我想对每个基因(加入)进行子集化,而不是做方差分析,但在此之后我想做 tukey 有这样的东西:
$`genotype:Light`
diff lwr upr p adj
m:FL-f:FL -7324259.81 -16715470 2066950.5 0.3486778
ntrc:FL-f:FL 1662873.54 -7728337 11054083.9 0.9999998
WT:FL-f:FL -5219263.59 -13913835 3475307.7 0.7927417
f:HL-f:FL -4936680.12 -13871535 3998174.3 0.8796738
m:HL-f:FL -7389937.49 -16324792 1544916.9 0.2496858
ntrc:HL-f:FL -7122962.46 -16057817 1811891.9 0.3102106
我想在这个简单的循环上工作,这是我的示例,因为这似乎是最简单的方法。 我将不胜感激任何帮助!
这是您要找的吗:
library(tidyverse)
library(broom)
read_csv(file = "https://filebin.net/w5cfuwztae7yk747/two.csv") %>%
group_by(Accession) %>%
do(broom::tidy(TukeyHSD(aov(value ~ genotype + Light + genotype:Light, data = .)))) %>%
ungroup
输出:
# A tibble: 264 x 7
Accession term comparison estimate conf.low conf.high adj.p.value
<chr> <fctr> <chr> <dbl> <dbl> <dbl> <dbl>
1 AT2G41680 genotype m-f -1586182.59 -3616647.7 444282.5 1.708496e-01
2 AT2G41680 genotype ntrc-f -5705550.95 -7694992.3 -3716109.6 2.609223e-08
3 AT2G41680 genotype WT-f -1568375.95 -3557817.3 421065.4 1.647950e-01
4 AT2G41680 genotype ntrc-m -4119368.37 -6149833.5 -2088903.3 2.214399e-05
5 AT2G41680 genotype WT-m 17806.64 -2012658.5 2048271.8 9.999951e-01
6 AT2G41680 genotype WT-ntrc 4137175.00 2147733.6 6126616.4 1.464605e-05
7 AT2G41680 Light HL-FL -3854435.85 -5849789.4 -1859082.3 4.872013e-05
8 AT2G41680 Light LL-FL 1528123.46 -467230.1 3523477.0 1.844033e-01
9 AT2G41680 Light ML-FL -2821752.94 -4775345.6 -868160.3 2.283331e-03
10 AT2G41680 Light LL-HL 5382559.31 3311883.1 7453235.6 2.176770e-07
# ... with 254 more rows