Extracting outputs from for loops with dplyr pipes into dataframe in R

Question

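I can't figure out how to run a series of t-tests in a for loop, capture the output of each test as it completes, and append the results to a data frame. The goal is to run several t-tests at once and produce a single data frame containing all the results.

Here is the slow, manual way of doing it on the mtcars dataset: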

library(tidyverse)
library(rstatix)


# T-test to determine if there is a significant difference between mpg of 
# automatic vs manual transmissions (automatic=0, manual=1)
t1 <- mtcars %>% 
  t_test(mpg ~ am) %>% 
  mutate(var = "am") # add label to merge by

# Calculate mean mpg of both groups
t1.1 <- mtcars %>% 
  group_by(am) %>% 
  summarize(Mean = mean(mpg, na.rm=TRUE)) %>% 
  pivot_wider(names_from = am, values_from = Mean) %>% # Bring to wide format to add to df
  mutate(var = "am") # add label to merge by

# T-test for vs (v-shape=0, straight line=1)
t2 <- mtcars %>% 
  t_test(mpg ~ vs) %>% 
  mutate(var = "vs") # add label to merge by
# Calculate mean mpg of both groups
t2.1 <- mtcars %>% 
  group_by(vs) %>% 
  summarize(Mean = mean(mpg, na.rm=TRUE)) %>% 
  pivot_wider(names_from = vs, values_from = Mean) %>% # Bring to wide format to add to df
  mutate(var = "vs") # add label to merge by

# Merge dfs and rename
t_bind <- rbind(t1, t2)
t.1_bind <- rbind(t1.1, t2.1)
t.1_bind <- t.1_bind %>% rename("mean_0" = "0", "mean_1" = "1")
t_merge <- merge(t_bind, t.1_bind, by = "var")

But when I try to turn this into a loop, I get lost. It seems like it should be fairly simple, but I just can't figure it out.

t_vars <- c("am", "vs")  # etc.

for (i in t_vars) {
  x1 <- mtcars %>% 
    t_test(mpg ~ i) %>% 
    mutate(var = colnames(mpg[[i]]))
  df <- append(x1)
}

# Error: Can't extract columns that don't exist.
# x Column `i` doesn't exist.

Thanks for your help!!

Answer 1

Something like this?

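# Run one t-test per grouping variable and stack the results into one tibble
bind_rows(lapply(c("am", "vs"), function(i) {
  mtcars %>% 
    # build the formula from the variable name, e.g. mpg ~ am
    t_test(formula(paste0("mpg ~ ", i)), detailed = TRUE) %>% 
    mutate(var = i) # label each row with the variable it was tested on
}))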

Output:

# A tibble: 2 × 16
  estimate estimate1 estimate2 .y.   group1 group2    n1    n2 statistic       p    df conf.low conf.high method alternative var  
     <dbl>     <dbl>     <dbl> <chr> <chr>  <chr>  <int> <int>     <dbl>   <dbl> <dbl>    <dbl>     <dbl> <chr>  <chr>       <chr>
1    -7.24      17.1      24.4 mpg   0      1         19    13     -3.77 0.00137  18.3    -11.3     -3.21 T-test two.sided   am   
2    -7.94      16.6      24.6 mpg   0      1         18    14     -4.67 0.00011  22.7    -11.5     -4.42 T-test two.sided   vs
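If a for loop is preferred, the attempt in the question can probably be repaired along the same lines. The error appears because `mpg ~ i` is taken literally, so `t_test()` looks for a column named `i`. The sketch below builds the formula from the string with base R's `reformulate()`, stores each result in a list, and binds the list once at the end. The names `results` and `t_all` are illustrative and not part of the original post.

```r
library(dplyr)
library(rstatix)

t_vars <- c("am", "vs")  # etc.

# Pre-allocate a named list with one slot per grouping variable
results <- vector("list", length(t_vars))
names(results) <- t_vars

for (i in t_vars) {
  # reformulate("am", response = "mpg") gives the formula mpg ~ am
  f <- reformulate(i, response = "mpg")
  results[[i]] <- mtcars %>%
    t_test(f, detailed = TRUE) %>%
    mutate(var = i)  # label each row with the variable it was tested on
}

# Stack the per-variable results into a single tibble
t_all <- bind_rows(results)
```

Collecting into a list and calling `bind_rows()` once avoids growing a data frame inside the loop, which is what the `append()` call in the question was trying to do.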

Answer 2

Here is an alternative that uses tidyverse's nest_by after pivoting the data to long format:

library(tidyverse)
library(rstatix)

mtcars %>%
  pivot_longer(cols = c(am, vs)) %>%
  nest_by(name) %>%
  transmute(model = list(t_test(data = data, formula = mpg ~ value, detailed = TRUE))) %>%
  unnest(model)

Output:

  name  estimate estimate1 estimate2 .y.   group1 group2    n1    n2 statistic       p    df conf.low conf.high method alternative
  <chr>    <dbl>     <dbl>     <dbl> <chr> <chr>  <chr>  <int> <int>     <dbl>   <dbl> <dbl>    <dbl>     <dbl> <chr>  <chr>      
1 am       -7.24      17.1      24.4 mpg   0      1         19    13     -3.77 0.00137  18.3    -11.3     -3.21 T-test two.sided  
2 vs       -7.94      16.6      24.6 mpg   0      1         18    14     -4.67 0.00011  22.7    -11.5     -4.42 T-test two.sided
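Judging by the output above, `estimate1` and `estimate2` in the detailed result are the mean mpg of group 0 and group 1, so the separate `group_by()`/`summarize()` step from the question may not be needed. A minimal sketch, assuming the same pipeline, that renames those columns to the `mean_0` and `mean_1` labels used in the question (the name `t_means` and the choice of columns kept are illustrative):

```r
library(tidyverse)
library(rstatix)

t_means <- mtcars %>%
  pivot_longer(cols = c(am, vs)) %>%
  nest_by(name) %>%
  transmute(model = list(t_test(data = data, formula = mpg ~ value, detailed = TRUE))) %>%
  unnest(model) %>%
  ungroup() %>%  # drop the grouping left over from nest_by()
  # keep the test summary and relabel the group means as in the question
  select(var = name, mean_0 = estimate1, mean_1 = estimate2, statistic, df, p)
```

This gives one row per tested variable with the group means alongside the test statistic, similar in layout to `t_merge` in the question.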
