Extracting elements from a chisq output in R

library(survey)

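I am using the survey package to compute the chi-squared statistic and p-value between two categorical variables. I would like to run chi-squared tests on many variables at once and extract the results into a data frame.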

I have data like this:

df <- data.frame(sex = c('F', 'M', NA, 'M', 'M', 'M', 'F', 'F'),
                     happy = c('Y', 'Y','Y','Y','N','N','N','N'),
                     married = c(1,1,1,1,0,0,1,1),
                     pens = c(0, 1, 1, NA, 1, 1, 0, 0),
                     weight = c(1.12, 0.55, 1.1, 0.6, 0.23, 0.23, 0.66, 0.67))

I run the following code to create the survey design:

design <- svydesign(ids=~1, data=df, weights=~weight)

To get the chi-squared for sex and pens:

svychisq(~sex+pens, design, statistic = "Chisq")

    Pearson's X^2: Rao & Scott adjustment

data:  svychisq(~sex + pens, design, statistic = "Chisq")
X-squared = 8, df = 1, p-value = 1.319e-08

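My actual dataset is very large, and I want to compute the chi-squared for many variables (in this example, sex and happy) and collect the output in a tidy data frame like this: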

Question  Group    Chisq  Pval
sex       pens     78     0.001
sex       married  45     0.100
happy     pens     34     0.3
happy     married  87     2.0

This is what I have so far:

vector_vars <- c("sex", "happy") 
myfun <- function(x){
  form <- reformulate(sprintf('interaction(%s)', x))
  all <- as.data.frame(svychisq(form + pens, design, statistic = "Chisq"))
  stat <- all$statistic # get the chi sq val
  p <- all$p.value  # get the p val
  cbind(as.data.frame(stat,p))
}


out_df <- do.call(rbind, lapply(vector_vars, myfun))

I get this error:

  Error in terms(formula) : object 'pens' not found  

I think I am not extracting the elements correctly. Any suggestions are appreciated.

Assign the svychisq result to an object. Then inspect names() and use test$p.value to get the p-value, or pick whichever element you need by name, in your case test$statistic.

test <- svychisq(~sex+pens, design, statistic = "Chisq")
names(test)
test$p.value
test$statistic

# Output:
> test <- svychisq(~sex+pens, design, statistic = "Chisq")
> names(test)
[1] "statistic" "parameter" "p.value"   "method"    "data.name" "observed"  "expected"  "residuals" "stdres"   
> test$p.value
   X-squared 
1.319262e-08 

> test$statistic
X-squared 
        8 

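The reformulate call in the function can be changed by specifying termlabels as a vector containing 'pens' and the loop variable name. Pass that formula to svychisq, use tidy to convert the output to a tibble, and row-bind the list of tibbles into a single tibble.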

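library(dplyr)  # provides %>% and mutate()

myfun <- function(x){
  # build the formula ~pens + x from the variable name
  form <- reformulate(termlabels = c('pens', x))
  # run the test and tidy the htest result into a one-row tibble
  broom::tidy(svychisq(form, design, statistic = "Chisq")) %>% 
    dplyr::mutate(var_name = x, .before = 1)
}

purrr::map_dfr(vector_vars, myfun)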
# A tibble: 2 x 5
#  var_name statistic      p.value parameter method                               
#  <chr>        <dbl>        <dbl>     <int> <chr>                                
#1 sex          8.    0.0000000132         1 Pearson's X^2: Rao & Scott adjustment
#2 happy        0.880 0.383                1 Pearson's X^2: Rao & Scott adjustment
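
To see why the original attempt fails, it may help to look at what reformulate returns on its own. The sketch below uses only base R; the variable names come from the question, and `res` is a hypothetical name used here for illustration. It assumes no object called `pens` exists in the workspace.

```r
# reformulate() builds a one-sided formula from a character vector of terms
x <- "sex"
form <- reformulate(termlabels = c("pens", x))
print(form)       # a formula referring to pens and sex
all.vars(form)    # the variable names used in the formula

# By contrast, `form + pens` is ordinary R arithmetic on an object named `pens`.
# `pens` only exists as a column of df, not as a standalone object, so R cannot
# find it. tryCatch() captures the error message instead of stopping.
res <- tryCatch(form + pens, error = function(e) conditionMessage(e))
print(res)
```

Putting every variable name inside termlabels keeps the whole expression a formula, so the names are only looked up later, inside the survey design.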

Using base R, you can do:

out_df <- do.call(rbind, lapply(vector_vars, 
    function(x){with(svychisq(reformulate(termlabels = c('pens', x)),
        design, statistic = "Chisq"), 
        data.frame(stat=statistic, p=p.value, row.names = x))}))
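
The desired table in the question crosses several question variables with several grouping variables. A minimal sketch of one way to extend the base R approach to every pair is shown below. The `groups` vector, the helper `one_test`, and the output column names are assumptions made for illustration, chosen to match the layout the question asks for; the data and design are repeated from the question so the block runs on its own.

```r
library(survey)

df <- data.frame(sex = c('F', 'M', NA, 'M', 'M', 'M', 'F', 'F'),
                 happy = c('Y', 'Y', 'Y', 'Y', 'N', 'N', 'N', 'N'),
                 married = c(1, 1, 1, 1, 0, 0, 1, 1),
                 pens = c(0, 1, 1, NA, 1, 1, 0, 0),
                 weight = c(1.12, 0.55, 1.1, 0.6, 0.23, 0.23, 0.66, 0.67))
design <- svydesign(ids = ~1, data = df, weights = ~weight)

questions <- c("sex", "happy")
groups <- c("pens", "married")   # assumed grouping variables

# every Question x Group combination, one row per pair
pairs <- expand.grid(Question = questions, Group = groups,
                     stringsAsFactors = FALSE)

# hypothetical helper: run one test and return a one-row data frame
one_test <- function(q, g) {
  res <- svychisq(reformulate(c(q, g)), design, statistic = "Chisq")
  data.frame(Question = q, Group = g,
             Chisq = unname(res$statistic),
             Pval = unname(res$p.value))
}

out_df <- do.call(rbind, Map(one_test, pairs$Question, pairs$Group))
rownames(out_df) <- NULL
out_df
```

On a real dataset with many variables, only `questions` and `groups` should need to change.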