Having issues running a group-wise Shapiro-Wilk test for RMANOVA
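I am currently using the "weightloss" dataset from the datarium package to get started with running an RMANOVA. Here is the dput() output for the first rows: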
dput(head(weightloss))
structure(list(id = structure(1:6, .Label = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12"), class = "factor"),
diet = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("no",
"yes"), class = "factor"), exercises = structure(c(1L, 1L,
1L, 1L, 1L, 1L), .Label = c("no", "yes"), class = "factor"),
t1 = c(10.43, 11.59, 11.35, 11.12, 9.5, 9.5), t2 = c(13.21,
10.66, 11.12, 9.5, 9.73, 12.74), t3 = c(11.59, 13.21, 11.35,
11.12, 12.28, 10.43)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
This is the script I have come up with so far:
library(tidyverse)  # dplyr, tidyr, ggplot2
library(rstatix)    # identify_outliers(), shapiro_test()
library(ggpubr)     # ggqqplot()
library(datarium)   # weightloss dataset
# Create Data Frame for Dataset:
weight <- weightloss
weight
# Pivot Longer Data to Create Factors and Scores:
weight <- weight %>%
  pivot_longer(names_to = 'trial',   # creates factor (x)
               values_to = 'value',  # creates value (y)
               cols = t1:t3)         # finds which cols to factor
# Plot Means in Boxplot:
ggplot(weight,
       aes(x = trial, y = value)) +
  geom_boxplot() +
  labs(title = "Trial Means")  # As can be predicted, inc w/time
This gives me a boxplot that looks fairly normal:
Now it is time to identify outliers and test for normality.
# Identify Outliers (Should be None Given Boxplot):
outlier <- weight %>%
  group_by(trial) %>%
  identify_outliers(value)
outlier_frame <- data.frame(outlier)
outlier_frame  # none found :)
# Normality (Shapiro-Wilk and QQPlot):
model <- lm(value ~ trial,
            data = weight)       # creates model
shapiro_test(residuals(model))   # measures Shapiro
ggqqplot(residuals(model)) +
  labs(title = "QQ Plot of Residuals")  # creates QQ
This again gives me a very normal-looking QQ plot:
I then faceted the data by trial:
ggqqplot(weight, "value", ggtheme = theme_bw()) +
  facet_wrap(~trial) +
  labs(title = "QQPlot of Each Trial")  # looks normal
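As far as I can tell, these look fine as well:
However, when I try to run the Shapiro-Wilk test by group, I keep running into a problem with this code:
shapiro_group <- weight %>%
  group_by(trial) %>%
  shapiro_test(value)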
It gives me this error:
Error: Problem with `mutate()` column `data`.
i `data = map(.data$data, .f, ...)`.
x Must group by variables found in `.data`.
* Column `variable` is not found.
I also tried this:
shapiro_test(weight, trial$value)
and got this error:
Error: Can't subset columns that don't exist.
x Column `trial$value` doesn't exist.
If anyone knows why, I would greatly appreciate the help!
The reason you are getting the error from shapiro_test is this line in its implementation:
shapiro_test
function (data, ..., vars = NULL)
{
    ....
    ....
    data <- data %>% gather(key = "variable", value = "value") %>%
        filter(!is.na(value))
    ....
    ....
}
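It uses gather to reshape the data into long format, writing the results into new columns named variable and value. Because your data already has a column named value, this does not work.
If you rename the value column to something else, it works.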
library(dplyr)
library(rstatix)
weight %>%
  rename(value1 = value) %>%
  group_by(trial) %>%
  shapiro_test(value1)
#  trial variable statistic     p
#  <chr> <chr>        <dbl> <dbl>
#1 t1    value1       0.869 0.222
#2 t2    value1       0.910 0.440
#3 t3    value1       0.971 0.897
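If you would rather not depend on how shapiro_test reshapes data internally, one possible alternative is to call base R's shapiro.test inside summarise. The sketch below is only an illustration: it rebuilds just the six rows shown in the question's dput() output (so its numbers will differ from the full 12-row result above), and the names weight_wide, weight_long, score and shapiro_by_trial are made up for this example.

```r
library(dplyr)
library(tidyr)

# Only the first six rows of weightloss, copied from the dput() output
# in the question (diet and exercises columns omitted for brevity).
weight_wide <- data.frame(
  id = factor(1:6),
  t1 = c(10.43, 11.59, 11.35, 11.12, 9.5, 9.5),
  t2 = c(13.21, 10.66, 11.12, 9.5, 9.73, 12.74),
  t3 = c(11.59, 13.21, 11.35, 11.12, 12.28, 10.43)
)

# Reshape to long format; the value column is called "score" (hypothetical
# name) rather than "value".
weight_long <- weight_wide %>%
  pivot_longer(cols = t1:t3, names_to = "trial", values_to = "score")

# One Shapiro-Wilk test per trial using stats::shapiro.test directly.
shapiro_by_trial <- weight_long %>%
  group_by(trial) %>%
  summarise(
    statistic = unname(shapiro.test(score)$statistic),
    p = shapiro.test(score)$p.value,
    .groups = "drop"
  )

shapiro_by_trial
```

Naming the pivoted column something other than value in pivot_longer should also avoid the clash when calling shapiro_test itself, which saves the extra rename step.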