Using column names of a data.frame

Question

I have the following (simplified) text file, named datafile.txt:

Height Color Sales     
short blue 24    
short blue 25   
short red 31   
short red 28   
short black 35   
short black 32   
tall blue 31   
tall blue 32   
tall red 36   
tall red 32   
tall black 41   
tall black 36

From this text file, I created the data.frame data:

data <- read.table("datafile.txt", header = TRUE)

With the following line, I can perform a two-way ANOVA:

anova(lm(Sales ~ Height*Color, data))

However, the following code, which I expected to perform the same two-way ANOVA, does not work:

columnNames <- names(data)    
anova(lm(columnNames[3] ~ columnNames[1]*columnNames[2], data))

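I would like to run the analysis using the column names extracted from the data.frame, rather than typing Sales, Height and Color directly. Any help is much appreciated.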

Answer 1

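lm() does not substitute the values of columnNames[...] into the formula, so we need to build the formula as a string with paste and convert it with formula: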

anova(lm(formula(paste(columnNames[3], "~",  columnNames[1], "*", columnNames[2])), data))

The explicit formula call is not even required, because lm also accepts the pasted string directly:

anova(lm(paste(columnNames[3], "~",  columnNames[1], "*", columnNames[2]), data))
#Analysis of Variance Table

#Response: Sales
#             Df Sum Sq Mean Sq F value   Pr(>F)   
#Height        1  90.75  90.750 17.8525 0.005529 **
#Color         2 128.17  64.083 12.6066 0.007103 **
#Height:Color  2   3.50   1.750  0.3443 0.721876   
#Residuals     6  30.50   5.083                    
#---
#Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
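
As a possible alternative to pasting the whole formula by hand, the same model could be built with reformulate() from the stats package. This is only a sketch: the data is rebuilt inline with read.table(text = ...) so it does not depend on datafile.txt, and the names f and tab are introduced here for illustration and are not part of the original question or answer.

```r
# Rebuild the example data inline (same values as datafile.txt in the question).
data <- read.table(header = TRUE, text = "
Height Color Sales
short blue 24
short blue 25
short red 31
short red 28
short black 35
short black 32
tall blue 31
tall blue 32
tall red 36
tall red 32
tall black 41
tall black 36
")

columnNames <- names(data)

# reformulate() builds `response ~ terms` from character strings.
# The interaction term is still pasted, but the response is passed separately.
f <- reformulate(
  termlabels = paste(columnNames[1], "*", columnNames[2]),
  response   = columnNames[3]
)
print(f)

# Fit the model and produce the ANOVA table from the constructed formula.
tab <- anova(lm(f, data = data))
print(tab)
```

Because f is a real formula object, it can be printed and checked before it is passed to lm, which may make mistakes in the column selection easier to spot.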
