Filtering by p-value on the results of an lapply call applied to the function coxph
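I am running a survival analysis on the expression level of each of 566 genes. I do this by combining the function coxph() with the function lapply, and it works well. Now, given the large number of genes involved, I have been trying to work out how to filter by p-value so that I keep only the genes with a significant association with survival, i.e. P < 0.05.
Here is some dummy data: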
df1 = structure(list(ERLIN2 = structure(c(`TCGA-A1-A0SE-01` = 1L, `TCGA-A1-A0SH-01` = 1L,
`TCGA-A1-A0SJ-01` = 1L), .Label = c("down", "up"), class = "factor"),
BRF2 = structure(c(`TCGA-A1-A0SE-01` = 2L, `TCGA-A1-A0SH-01` = 1L,
`TCGA-A1-A0SJ-01` = 2L), .Label = c("down", "up"), class = "factor"),
ZNF703 = structure(c(`TCGA-A1-A0SE-01` = 2L, `TCGA-A1-A0SH-01` = 1L,
`TCGA-A1-A0SJ-01` = 2L), .Label = c("down", "up"), class = "factor"),
time = c(43.4, 47.21, 13.67), event = c(0, 0, 0)), row.names = c("TCGA-A1-A0SE-01",
"TCGA-A1-A0SH-01", "TCGA-A1-A0SJ-01"), class = "data.frame")
After that, run the following lines to get the results:
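# Load the survival package, installing it first if it is missing
if (!require(survival)) install.packages('survival')
library('survival')
# Run the survival analysis: one Cox model per gene, keeping each summary
df2 <- lapply(c("ERLIN2", "BRF2", "ZNF703"),
  function(x) {
    # x is already a column name, so it can be pasted into the formula directly
    formula <- as.formula(paste('Surv(time, event) ~', x))
    coxFit <- coxph(formula, data = df1)
    summary(coxFit)
  })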
From here, I tried to filter by p-value as follows:
for (i in 3){
df2 = df2 %>% subset(df2[[i]]$logtest[3] < 0.05)
}
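But it is very inefficient! Any help would be appreciated!
If you want to subset a list by any variable (in your case, the pvalue element of logtest), I would suggest the rlist package: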
library(rlist)
df3 <- list.filter(df2, logtest[["pvalue"]] < 0.05)
This filters the list by the specified condition. Conditions can also be nested.
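If you would rather not add a dependency, the same filter can be sketched in base R. This is only an illustration, not the method from the answer above: it assumes every list element is a summary.coxph object whose logtest component has a pvalue entry. It also uses the lung data set that ships with survival, with a few of its covariates standing in for the gene columns, because the dummy data in the question contains no events.

```r
library(survival)

# Illustration only: covariates of the bundled `lung` data set
# stand in for the gene columns of the real data.
covariates <- c("age", "sex", "ph.ecog", "meal.cal")

# Fit one Cox model per covariate; naming the input vector makes
# lapply() return a list named by covariate.
fits <- lapply(setNames(covariates, covariates), function(x) {
  f <- as.formula(paste("Surv(time, status) ~", x))
  summary(coxph(f, data = lung))
})

# Extract the likelihood ratio test p-value from each summary
pvals <- vapply(fits, function(s) unname(s$logtest["pvalue"]), numeric(1))

# Keep only the summaries with p < 0.05; NA p-values are dropped
keep <- !is.na(pvals) & pvals < 0.05
significant <- fits[keep]

# Names of the covariates that passed the filter
names(significant)
```

Because the list is named, the surviving elements still tell you which gene each summary belongs to, which the unnamed df2 list in the question does not.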