Running R t.test in nested loops
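I'm new to R Studio. For a class, I have pulled the US Census 2016 election data set and would like to run a series of t-tests on it. Some details about the data set: first, the data is coded 1 to 4 for citizenship status. I'd like to see whether various factors affect the likelihood of voting (1 = yes, 2 = no).
The code is as follows: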
factor <- c("Age", "Fathers_country_of_birth", "Mothers_country_of_birth",
            "Highest_level_of_School_completed", "Country_of_birth")
citizen <- c("NATIVE, BORN IN THE UNITED STATES",
             "NATIVE, BORN IN PUERTO RICO OR OTHER U.S. ISLAND AREAS",
             "NATIVE, BORN ABROAD OF AMERICAN PARENT OR PARENTS",
             "FOREIGN BORN, U.S. CITIZEN BY NATURALIZATION")
for (f in factor) {
  print(f)
  for (i in 1:4) {
    print(paste("Citizenship is", citizen[i]))
    query <- paste("select * from result2 where Citizenship = ", i)
    sample <- sqldf(query)
    print(t.test(f ~ Vote_in_Election, data = sample, var.equal = FALSE))
  }
}
It throws a 'variable lengths' error:
> [1] "Age"
> [1] "Citizenship is NATIVE, BORN IN THE UNITED STATES"
> Error in model.frame.default(formula = f ~ Vote_in_Election, data = sample) :
>   variable lengths differ (found for 'Vote_in_Election')
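If I remove the outer loop it runs fine, though of course I then have to type in the values in 'factor' one by one.
Running R Studio version 1.1.463, with R 3.5.2 on Windows 10.
Since each iteration of i yields a different number of rows, I tried setting paired = FALSE, but it still yells at me.
I have looked through SO but did not find a solution. What am I missing?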
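In f ~ Vote_in_Election, R looks for a variable literally named f. There is no such column in sample, so it finds the loop variable instead, which is a length-one character string, hence the 'variable lengths differ' error. To build the formula dynamically, you need to convert a string version of the formula with as.formula: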
t.test(as.formula(paste(f, "~ Vote_in_Election")), data=sample, var.equal = FALSE)
Or use reformulate:
t.test(reformulate("Vote_in_Election", response=f), data=sample, var.equal = FALSE)
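For a version that can be run without the census data, here is a minimal sketch of the same nested loop using reformulate. The result2 data frame below is synthetic and its values are made up purely for illustration. Base R subsetting stands in for the sqldf query so that no extra package is needed, and the names factors, subset_i and results are hypothetical helpers that do not appear in the question.

```r
# Synthetic stand-in for the asker's result2 table (all values are made up).
set.seed(42)
n <- 400
result2 <- data.frame(
  Age = round(runif(n, 18, 90)),
  Highest_level_of_School_completed = sample(31:46, n, replace = TRUE),
  Citizenship = sample(1:4, n, replace = TRUE),
  Vote_in_Election = sample(1:2, n, replace = TRUE)
)

# Column names to test, held as strings exactly as in the question.
factors <- c("Age", "Highest_level_of_School_completed")
results <- list()

for (f in factors) {
  for (i in 1:4) {
    # Base R subsetting used here in place of the sqldf query (an assumption).
    subset_i <- result2[result2$Citizenship == i, ]
    # reformulate() turns the strings into a formula, e.g. Age ~ Vote_in_Election.
    fml <- reformulate("Vote_in_Election", response = f)
    # Welch two-sample t-test of the chosen column between the two vote groups.
    results[[paste(f, i, sep = "_")]] <- t.test(fml, data = subset_i, var.equal = FALSE)
  }
}

# Inspect one of the stored tests.
print(results[["Age_1"]])
```

Storing each htest object in a list rather than only printing it makes it easy to pull out p-values afterwards, for example with sapply(results, function(r) r$p.value).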