How do I write a function that runs a Wilcoxon test on each column of a data frame in R?
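I'm trying to get this function to work, but it fails.
What I need is a function that reads the column names from a data frame and runs a Wilcoxon test on each of those columns. "result" would be the main final product: a table with one row per genus, giving the genus name and its p-value. I have also added a plotting step to visualize the values of each column between the two groups, saving each plot under the name of the corresponding genus.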
library("dplyr")
library("ggpubr")
library(PairedData)
library(tidyr)
process <- function(data, genus){
  group_by(data,group) %>%summarise(
    count = n(),
    median = median(genus, na.rm = TRUE),
    IQR = IQR(genus, na.rm = TRUE)
  )
  # Subset data before and after treatment
  T0 <- subset(data, group == "T0", genus,drop = TRUE)
  T2 <- subset(data, group == "T2", genus,drop = TRUE)
  #Wilcoxon test for paired data, I want a table of names and corresponding p-values
  res <- wilcox.test(T0, T2, paired = TRUE)
  res$p.value
  result <- spread(genus,res$p.value)
  # Plot paired data, with title depending on the data and its p-value (this last one could be optional)
  pd <- paired(T0, T2)
  tiff(genus".tiff", width = 600, height = 400)
  plot(pd, type = "profile") + labs(title=print(data[,genus]", paired p-value="res[,p.value]) +theme_bw()
  dev.off()
}
l <- length(my_data)
glist <- list(colnames(my_data[3:l])) #bacteria start at col 3
wilcoxon <- process(data = my_data, genus = glist)
A reproducible dataset could be:
my_data
Patient group Subdoligranulum Agathobacter
pt_10T0 T0 0.02 0.00
pt_10T2 T2 10.71 19.89
pt_15T0 T0 29.97 0.28
pt_15T2 T2 16.10 7.70
pt_20T0 T0 2.39 0.44
pt_20T2 T2 20.48 3.35
pt_32T0 T0 12.23 0.17
pt_32T2 T2 37.11 1.87
pt_36T0 T0 0.64 0.03
pt_36T2 T2 0.02 0.08
pt_39T0 T0 0.04 0.01
pt_39T2 T2 0.36 0.05
pt_3t0 T0 13.23 1.34
pt_3T2 T2 19.22 1.51
pt_9T0 T0 11.69 0.57
pt_9T2 T2 34.56 3.52
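I'm not very familiar with functions and haven't found a good tutorial on how to write them for data frames, so this is my best attempt. I hope some of you can get it working.
Thanks for your help!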
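Simply return() the needed value at the end of the function. The plotting step below is untested (it uses a package I don't know) and is left commented out, but it has been adjusted to valid R syntax: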
proc_wilcox <- function(data, genus){
  # Subset data before and after treatment
  T0 <- data[[genus]][data$group == "T0"]
  T2 <- data[[genus]][data$group == "T2"]

  # Wilcoxon test for paired data
  res <- wilcox.test(T0, T2, paired = TRUE)

  # Plot paired data, with title depending on the data and its p-value
  # pd <- paired(T0, T2)
  # tiff(paste0(genus, ".tiff"), width = 600, height = 400)
  # plot(pd, type = "profile") +
  #   labs(title=paste0(genus, " paired p-value= ", res$p.value)) +
  #   theme_bw()
  # dev.off()

  return(res$p.value)
}
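If PairedData is not available, the profile plot can be approximated with base graphics. The following is only a sketch under assumptions: draw_paired_profile is a hypothetical helper name (not from any package), and it assumes the T0 and T2 vectors list the patients in the same order. It draws on whatever graphics device is currently open, so the caller decides the file type.

```r
# Hypothetical helper (not part of any package): draws one line per patient
# from its T0 value to its T2 value on the currently open graphics device.
draw_paired_profile <- function(t0, t2, main = "") {
  # Paired data must have one T2 value for every T0 value.
  stopifnot(length(t0) == length(t2))

  # rbind() gives a 2 x n matrix; matplot() draws one line per column,
  # so each patient becomes a single T0 -> T2 segment.
  matplot(
    x = 1:2, y = rbind(t0, t2),
    type = "b", pch = 19, lty = 1, col = "grey30",
    xaxt = "n", xlab = "group", ylab = "value", main = main
  )
  axis(1, at = 1:2, labels = c("T0", "T2"))
  invisible(NULL)
}

# Possible use inside proc_wilcox, after res has been computed:
# tiff(paste0(genus, ".tiff"), width = 600, height = 400)
# draw_paired_profile(T0, T2,
#   main = paste0(genus, ", paired p-value = ", signif(res$p.value, 3)))
# dev.off()
```

Keeping the device handling (tiff() and dev.off()) outside the helper makes it easy to switch to png() or pdf() without touching the drawing code.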
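Then call the function with an apply-family function such as sapply, or the slightly faster vapply, which is designed to iterate over its input and return a result of the same length, checking each value against a declared template (here numeric(1)).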
# VECTOR OF RESULTS (USING sapply)
wilcoxon_results <- sapply(
  names(my_data)[3:ncol(my_data)],
  function(col) proc_wilcox(my_data, col)
)

# VECTOR OF RESULTS (USING vapply)
wilcoxon_results <- vapply(
  names(my_data)[3:ncol(my_data)],
  function(col) proc_wilcox(my_data, col),
  numeric(1)
)

wilcoxon_results
# Subdoligranulum    Agathobacter
#       0.1484375       0.0078125

wilcoxon_df <- data.frame(wilcoxon_results)
wilcoxon_df
#                 wilcoxon_results
# Subdoligranulum        0.1484375
# Agathobacter           0.0078125
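The question asked for a table with the genus name and its p-value in each row, whereas data.frame(wilcoxon_results) keeps the genus only as a row name. A minimal self-contained sketch of one way to get explicit columns follows. The column names genus and p_value and the object name wilcoxon_table are my own choices, proc_wilcox is repeated in shortened form so the block runs by itself, and the pairing still relies on the T0 and T2 rows appearing in the same patient order.

```r
# Example data copied from the question.
my_data <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Patient group Subdoligranulum Agathobacter
pt_10T0 T0 0.02 0.00
pt_10T2 T2 10.71 19.89
pt_15T0 T0 29.97 0.28
pt_15T2 T2 16.10 7.70
pt_20T0 T0 2.39 0.44
pt_20T2 T2 20.48 3.35
pt_32T0 T0 12.23 0.17
pt_32T2 T2 37.11 1.87
pt_36T0 T0 0.64 0.03
pt_36T2 T2 0.02 0.08
pt_39T0 T0 0.04 0.01
pt_39T2 T2 0.36 0.05
pt_3t0 T0 13.23 1.34
pt_3T2 T2 19.22 1.51
pt_9T0 T0 11.69 0.57
pt_9T2 T2 34.56 3.52
")

# Same paired test as proc_wilcox above, without the plotting comments.
proc_wilcox <- function(data, genus) {
  T0 <- data[[genus]][data$group == "T0"]
  T2 <- data[[genus]][data$group == "T2"]
  wilcox.test(T0, T2, paired = TRUE)$p.value
}

# Genus columns start at column 3, as in the question.
genera <- names(my_data)[3:ncol(my_data)]
p_values <- vapply(genera, function(col) proc_wilcox(my_data, col), numeric(1))

# One row per genus, with the name as a regular column.
wilcoxon_table <- data.frame(
  genus = genera,
  p_value = unname(p_values),
  row.names = NULL
)
wilcoxon_table
#             genus   p_value
# 1 Subdoligranulum 0.1484375
# 2    Agathobacter 0.0078125
```

Having the genus as a column makes the table easier to sort, filter, or write out with write.csv(wilcoxon_table, row.names = FALSE).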