How to write a functional ANOVA for different datasets in R?
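I want to write code that runs an ANOVA without my having to edit it every time I change the dataset or the variables used in the analysis. My current code is rigid, so it only works for one type of dataset. For example, for condition A the model is response ~ Factor1 + Factor2, but that breaks down if I switch to condition B, C, and so on. I would like to parameterize things up front to avoid hard-coding. How can I make my code more functional and less rigid?
Example: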
input <- data.frame(
  order = gl(2,50, label = c(paste('area', LETTERS[1:2]))),
  f1 = gl(5,10, label = c(paste('conc', LETTERS[1:5]))),
  f2 = gl(5,2, label = c(paste('plot', LETTERS[1:5]))),
  f3 = factor(rep(paste("cond", 1:2, sep =""), 5)),
  values = abs(rnorm(100))
)
model <- by(input,input$order, function(x){
  f1 = levels(factor(x$f1))
  f2 = levels(factor(x$f2))
  f3 = levels(factor(x$f3))
  order = levels(factor(x$order))
  for( i in (1: length(f1))){
    for(j in (1: length(order))){
      di <- x[x$f1 == f1[i] & x$order == order[j] ,]
      write(paste('\nf1:', "f1", f1[i],order[j],'\n'), stderr())
      anova.1 <- aov(values ~ f2 * f3, di)
      print(summary(anova.1))
    }
  }
  write("Analyse Finished! \n", stderr())
})
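For any dynamic element in your processing, simply store it as a data point. So consider saving the different formulas directly in the corresponding data frame: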
input <- data.frame(
  order = gl(2,50, label = c(paste('area', LETTERS[1:2]))),
  f1 = gl(5,10, label = c(paste('conc', LETTERS[1:5]))),
  f2 = gl(5,2, label = c(paste('plot', LETTERS[1:5]))),
  f3 = factor(rep(paste0("cond", 1:2), 5)),     # USE paste0 WHEN sep=""
  values = abs(rnorm(100)),
  aov_formula = "values ~ f2 * f3",             # NEW SCALAR COLUMN
  stringsAsFactors = FALSE                      # AVOID STRING TO FACTORS
)
head(input)
#    order     f1     f2    f3    values      aov_formula
# 1 area A conc A plot A cond1 0.3673828 values ~ f2 * f3
# 2 area A conc A plot A cond2 1.5170873 values ~ f2 * f3
# 3 area A conc A plot B cond1 1.0677476 values ~ f2 * f3
# 4 area A conc A plot B cond2 1.7239528 values ~ f2 * f3
# 5 area A conc A plot C cond1 0.3247375 values ~ f2 * f3
# 6 area A conc A plot C cond2 0.7483484 values ~ f2 * f3
model <- by(input,input$order, function(x){
  ...
  anova.1 <- aov(as.formula(x$aov_formula[1]), di)   # STRING-FORMULA CONVERSION
  ...
})
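To make the idea concrete, here is a minimal, self-contained sketch of the same approach, with a different formula stored for each area. It is only an illustration, not the code elided above: the helper name run_aov_by_group and its arguments group_col and formula_col are hypothetical, the two per-area formulas are made up for the example, set.seed() is added only for reproducibility, and the inner loop over f1 from the question is left out for brevity.

```r
# Hypothetical helper (not part of the original answer): fits one ANOVA per
# level of `group_col`, reading the model formula from `formula_col`.
run_aov_by_group <- function(data, group_col = "order",
                             formula_col = "aov_formula") {
  groups <- split(data, data[[group_col]], drop = TRUE)
  lapply(groups, function(d) {
    # The formula is constant within a group, so the first row is enough.
    fml <- as.formula(d[[formula_col]][1])
    aov(fml, data = d)
  })
}

set.seed(1)  # assumption: added only so the simulated values are reproducible
input <- data.frame(
  order  = gl(2, 50, labels = paste("area", LETTERS[1:2])),
  f1     = gl(5, 10, labels = paste("conc", LETTERS[1:5])),
  f2     = gl(5, 2,  labels = paste("plot", LETTERS[1:5])),
  f3     = factor(rep(paste0("cond", 1:2), 5)),
  values = abs(rnorm(100)),
  stringsAsFactors = FALSE
)

# Illustrative only: a different formula string for each area.
input$aov_formula <- ifelse(input$order == "area A",
                            "values ~ f2 * f3",
                            "values ~ f1 + f3")

models <- run_aov_by_group(input)
invisible(lapply(models, function(m) print(summary(m))))
```

Because the formula travels with the data, switching to another dataset or another set of factors only means changing the aov_formula column (and, if needed, the group_col argument); the fitting code itself stays untouched.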