One-sample t-test in R within groups
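I have been running one-sample t-tests in R for a while, but today I hit a real challenge. My data are grouped by one variable, and I want to run a one-sample t-test for each group. I can do this easily in SPSS, but it is giving me a headache in R. Does anyone know how to do this? Please assist.
Sample scenario: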
Location=rep(c("Area_A","Area_B"),4)
temp=rnorm(length(Location),34,5)
sample_data=data.frame(Location,temp)
sample_data
Location temp
1 Area_A 32.73782
2 Area_B 26.29996
3 Area_A 40.75101
4 Area_B 26.68309
5 Area_A 33.94259
6 Area_B 26.48326
7 Area_A 37.92506
8 Area_B 29.22532
Suppose the hypothesized mean in the example above is 35. The one-sample t-test would then be:
t.test(sample_data$temp,mu=35)
which gives me
One Sample t-test
data: sample_data$temp
t = -1.6578, df = 7, p-value = 0.1413
alternative hypothesis: true mean is not equal to 35
95 percent confidence interval:
27.12898 36.38304
sample estimates:
mean of x
31.75601
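But this pools all groups together. I can do the per-group version in SPSS. Is there a way to do it in R with one line of code, or, if one line is not possible, could someone show me how? Thanks in advance.
One solution is to save the t.test results for each group as a list: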
# reproducible results
set.seed(8)
# example data
Location=rep(c("Area_A","Area_B"),4)
temp=rnorm(length(Location),34,5)
sample_data=data.frame(Location,temp)
library(dplyr)
dt_res = sample_data %>%
  group_by(Location) %>%                       # for each group
  summarise(res = list(t.test(temp, mu=35)))   # run t.test and save results as a list
# see the list of results
dt_res$res
# [[1]]
#
# One Sample t-test
#
# data: temp
# t = -0.76098, df = 3, p-value = 0.502
# alternative hypothesis: true mean is not equal to 35
# 95 percent confidence interval:
# 29.93251 38.11170
# sample estimates:
# mean of x
# 34.0221
#
#
# [[2]]
#
# One Sample t-test
#
# data: temp
# t = -1.045, df = 3, p-value = 0.3728
# alternative hypothesis: true mean is not equal to 35
# 95 percent confidence interval:
# 26.37007 39.36331
# sample estimates:
# mean of x
# 32.86669
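Each element of that list is an ordinary "htest" object, so individual fields such as the p-value can be pulled out by name. The sketch below rebuilds the same example so that it runs on its own; the name p_values is only illustrative and not part of the original answer.

```r
library(dplyr)

# same example data as above
set.seed(8)
Location = rep(c("Area_A", "Area_B"), 4)
temp = rnorm(length(Location), 34, 5)
sample_data = data.frame(Location, temp)

# one t.test result per group, stored in a list column
dt_res = sample_data %>%
  group_by(Location) %>%
  summarise(res = list(t.test(temp, mu = 35)))

# extract the p-value from each htest object and label it with its group
p_values = sapply(dt_res$res, function(x) x$p.value)
names(p_values) = as.character(dt_res$Location)
p_values
```

The same pattern works for other fields of the result, for example x$statistic, x$conf.int or x$estimate.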
Another solution is to save the t.test results for each group as a data frame:
library(dplyr)
library(tidyr)
library(broom)
sample_data %>%
  group_by(Location) %>%
  summarise(res = list(tidy(t.test(temp, mu=35)))) %>%
  unnest(res)   # name the list column explicitly; a bare unnest() is deprecated in recent tidyr
# # A tibble: 2 x 9
# Location estimate statistic p.value parameter conf.low conf.high method alternative
# <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
# 1 Area_A 34.0 -0.761 0.502 3 29.9 38.1 One Sample t-test two.sided
# 2 Area_B 32.9 -1.05 0.373 3 26.4 39.4 One Sample t-test two.sided
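The idea behind both approaches is the same: you group by Location and run t.test for each group. Which one to use depends entirely on the kind of output you prefer.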
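If you would rather avoid extra packages, the same "split by group, test each piece" idea can be sketched in base R. This is an alternative to the dplyr code above, not part of it; the names res_list and res_table are only illustrative.

```r
# same example data as above
set.seed(8)
Location = rep(c("Area_A", "Area_B"), 4)
temp = rnorm(length(Location), 34, 5)
sample_data = data.frame(Location, temp)

# split temp by Location, then run a one-sample t-test on each piece
res_list = lapply(split(sample_data$temp, sample_data$Location), t.test, mu = 35)

# collect the main fields of each htest object into one data frame
res_table = do.call(rbind, lapply(names(res_list), function(g) {
  r = res_list[[g]]
  data.frame(Location  = g,
             estimate  = unname(r$estimate),
             statistic = unname(r$statistic),
             df        = unname(r$parameter),
             p.value   = r$p.value,
             conf.low  = r$conf.int[1],
             conf.high = r$conf.int[2])
}))
res_table
```

The lapply(split(...)) line is the closest thing to the one-liner asked for in the question; the rest only reshapes the results into a table.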