How would I use ANOVA to find the differences between the means of three groups in a dataset?
I am trying to use ANOVA to find the differences between the means of 'group1', 'group2' and 'group5' in the following dataset.
tab_csv <- read.csv("data.csv", sep = "\t", header = TRUE)
tab_csv
label number
1 group1 120
2 group1 105
3 group1 105
4 group1 84
5 group1 32
6 group2 820
7 group2 922
8 group2 823
9 group2 945
10 group2 849
11 group3 1990
12 group3 29
13 group3 40
14 group3 21
15 group3 900
16 group4 220
17 group4 70
18 group4 109
19 group4 19
20 group4 18
21 group5 55
22 group5 40
23 group5 35
24 group5 30
25 group5 20
levels(tab_csv$label)
[1] "group1" "group2" "group3" "group4" "group5"
I have started trying this, but I'm not sure how to continue...
tab_csv$number[tab_csv$label == "group1"]
tab_csv$number[tab_csv$label == "group2"]
tab_csv$number[tab_csv$label == "group5"]
Can anyone help?
You can do it like this:
groups<-c("group1","group2","group5")
# Match on the label column itself, not on its levels
new.df<-tab_csv[which(tab_csv$label%in%groups),]
m1<-aov(new.df$number~new.df$label)
summary(m1)
Or you can extract the groups directly from the original data.frame and run the following:
m2<-aov(tab_csv$number[which(tab_csv$label%in%groups)]~tab_csv$label[which(tab_csv$label%in%groups)])
summary(m2)
There is probably a prettier way to do this...
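One possibly tidier variant, offered as a sketch rather than the definitive approach: subset the rows once, drop the factor levels that are no longer used, and pass the result through the data argument of aov(). The data frame is rebuilt inline here only so the example runs on its own; the name m3 is made up for illustration, and the label column is assumed to be a factor, as the levels() output in the question suggests.

```r
# Rebuild the question's data so the sketch is self-contained
# (in practice tab_csv would come from read.csv()).
tab_csv <- data.frame(
  label  = factor(rep(paste0("group", 1:5), each = 5)),
  number = c(120, 105, 105, 84, 32,
             820, 922, 823, 945, 849,
             1990, 29, 40, 21, 900,
             220, 70, 109, 19, 18,
             55, 40, 35, 30, 20)
)

groups <- c("group1", "group2", "group5")

# Keep only the rows for the three groups, then drop the unused
# factor levels ("group3", "group4") left behind by the subsetting.
new.df <- droplevels(tab_csv[tab_csv$label %in% groups, ])

# Using data = keeps the formula short and the term name readable.
m3 <- aov(number ~ label, data = new.df)
summary(m3)
```

Without droplevels(), levels(new.df$label) would still list all five groups, which tends to show up as empty categories in plots and tables built from the subset.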
Is this what you are looking for?
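library(dplyr)
# Keep only the three groups of interest from the question's data frame
newdf <- tab_csv %>%
  filter(label %in% c("group1","group2","group5"))
myaov <- aov(number ~ label, data = newdf)
summary(myaov)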
You can also use subset to select the groups from your data frame:
# 1st - generating your dataframe
group = c(rep("group1",5),rep("group2",5),rep("group3",5),rep("group4",5),rep("group5",5))
value = c(120,105,105,84,32,820,922,823,945,849,1990,29,40,21,900,220,70,109,19,18,55,40,35,30,20)
df = data.frame(group = group,value = value)
# performing anova
> summary(aov(value ~ group, data = subset(df, group == "group1" | group =="group2" | group == "group5")))
            Df  Sum Sq Mean Sq F value  Pr(>F)
group        2 2189758 1094879   695.9 3.9e-13 ***
Residuals   12   18880    1573
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
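The F test above only indicates that at least one of the three means differs from the others. If the goal is to see which pairs differ, one common follow-up is Tukey's HSD, available in base R as TukeyHSD(). This is a minimal sketch under the same data; sub_df, fit and tukey are names introduced here for illustration, and whether Tukey's procedure suits the real data (for example, its equal-variance assumption) is left open.

```r
# Same data as in the answer above
df <- data.frame(
  group = rep(paste0("group", 1:5), each = 5),
  value = c(120, 105, 105, 84, 32,
            820, 922, 823, 945, 849,
            1990, 29, 40, 21, 900,
            220, 70, 109, 19, 18,
            55, 40, 35, 30, 20)
)

# Keep the three groups of interest and make the grouping column a factor,
# which TukeyHSD() needs in order to form the pairwise comparisons.
sub_df <- subset(df, group %in% c("group1", "group2", "group5"))
sub_df$group <- factor(sub_df$group)

fit <- aov(value ~ group, data = sub_df)

# One row per pair of groups: difference in means, confidence interval
# bounds, and a p-value adjusted for multiple comparisons.
tukey <- TukeyHSD(fit)
tukey
```

The diff column holds the pairwise differences in group means, which for this data are 782.6 for group2-group1, -53.2 for group5-group1 and -835.8 for group5-group2; the lwr, upr and p adj columns indicate how much weight each difference can bear.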