How to filter a list of models based on Pr(>F)

I am fitting a set of models, essentially like this:

   models <- mclapply(frms, function(x) anova(lm(x, data = mrna.pcs)))

Now I would like to filter the models based on Pr(>F).

The class of models is list.

Here is the str of models:

str(models)
List of 248
 $ PC1 ~ Sex                            :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 1 186
  ..$ Sum Sq : num [1:2] 205562 63770
  ..$ Mean Sq: num [1:2] 205562 343
  ..$ F value: num [1:2] 600 NA
  ..$ Pr(>F) : num [1:2] 4.34e-60 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC1"
 $ PC2 ~ Sex                            :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 1 186
  ..$ Sum Sq : num [1:2] 1098 185549
  ..$ Mean Sq: num [1:2] 1098 998
  ..$ F value: num [1:2] 1.1 NA
  ..$ Pr(>F) : num [1:2] 0.296 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC2"
 $ PC3 ~ Sex                            :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 1 186
  ..$ Sum Sq : num [1:2] 6023 56650
  ..$ Mean Sq: num [1:2] 6023 305
  ..$ F value: num [1:2] 19.8 NA
  ..$ Pr(>F) : num [1:2] 1.5e-05 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC3"
 $ PC4 ~ Sex                            :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 1 186
  ..$ Sum Sq : num [1:2] 88.1 48006.7
  ..$ Mean Sq: num [1:2] 88.1 258.1
  ..$ F value: num [1:2] 0.341 NA
  ..$ Pr(>F) : num [1:2] 0.56 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC4"
 $ PC5 ~ Sex                            :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 1 186
  ..$ Sum Sq : num [1:2] 390 31192
  ..$ Mean Sq: num [1:2] 390 168
  ..$ F value: num [1:2] 2.33 NA
  ..$ Pr(>F) : num [1:2] 0.129 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC5"
 $ PC6 ~ Sex                            :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 1 186
  ..$ Sum Sq : num [1:2] 58.3 24470
  ..$ Mean Sq: num [1:2] 58.3 131.6
  ..$ F value: num [1:2] 0.443 NA
  ..$ Pr(>F) : num [1:2] 0.506 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC6"
 $ PC7 ~ Sex                            :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 1 186
  ..$ Sum Sq : num [1:2] 21.9 19772.5
  ..$ Mean Sq: num [1:2] 21.9 106.3
  ..$ F value: num [1:2] 0.206 NA
  ..$ Pr(>F) : num [1:2] 0.65 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC7"
 $ PC8 ~ Sex                            :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 1 186
  ..$ Sum Sq : num [1:2] 7.39 17396.15
  ..$ Mean Sq: num [1:2] 7.39 93.53
  ..$ F value: num [1:2] 0.0791 NA
  ..$ Pr(>F) : num [1:2] 0.779 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC8"
 $ PC1 ~ fAge                           :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 2 185
  ..$ Sum Sq : num [1:2] 717 268616
  ..$ Mean Sq: num [1:2] 358 1452
  ..$ F value: num [1:2] 0.247 NA
  ..$ Pr(>F) : num [1:2] 0.782 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC1"
 $ PC2 ~ fAge                           :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 2 185
  ..$ Sum Sq : num [1:2] 238 186409
  ..$ Mean Sq: num [1:2] 119 1008
  ..$ F value: num [1:2] 0.118 NA
  ..$ Pr(>F) : num [1:2] 0.889 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC2"
 $ PC3 ~ fAge                           :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 2 185
  ..$ Sum Sq : num [1:2] 5461 57211
  ..$ Mean Sq: num [1:2] 2731 309
  ..$ F value: num [1:2] 8.83 NA
  ..$ Pr(>F) : num [1:2] 0.000218 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC3"
 $ PC4 ~ fAge                           :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 2 185
  ..$ Sum Sq : num [1:2] 3845 44250
  ..$ Mean Sq: num [1:2] 1922 239
  ..$ F value: num [1:2] 8.04 NA
  ..$ Pr(>F) : num [1:2] 0.00045 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC4"
 $ PC5 ~ fAge                           :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 2 185
  ..$ Sum Sq : num [1:2] 1804 29778
  ..$ Mean Sq: num [1:2] 902 161
  ..$ F value: num [1:2] 5.61 NA
  ..$ Pr(>F) : num [1:2] 0.00433 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC5"
 $ PC6 ~ fAge                           :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 2 185
  ..$ Sum Sq : num [1:2] 7.65 24520.66
  ..$ Mean Sq: num [1:2] 3.82 132.54
  ..$ F value: num [1:2] 0.0288 NA
  ..$ Pr(>F) : num [1:2] 0.972 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC6"
 $ PC7 ~ fAge                           :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 2 185
  ..$ Sum Sq : num [1:2] 378 19416
  ..$ Mean Sq: num [1:2] 189 105
  ..$ F value: num [1:2] 1.8 NA
  ..$ Pr(>F) : num [1:2] 0.168 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC7"
 $ PC8 ~ fAge                           :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 2 185
  ..$ Sum Sq : num [1:2] 239 17165
  ..$ Mean Sq: num [1:2] 119.4 92.8
  ..$ F value: num [1:2] 1.29 NA
  ..$ Pr(>F) : num [1:2] 0.279 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC8"
 $ PC1 ~ Index                          :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 23 164
  ..$ Sum Sq : num [1:2] 30056 239277
  ..$ Mean Sq: num [1:2] 1307 1459
  ..$ F value: num [1:2] 0.896 NA
  ..$ Pr(>F) : num [1:2] 0.604 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC1"
 $ PC2 ~ Index                          :Classes ‘anova’ and 'data.frame':  2 obs. of  5 variables:
  ..$ Df     : int [1:2] 23 164
  ..$ Sum Sq : num [1:2] 8402 178245
  ..$ Mean Sq: num [1:2] 365 1087
  ..$ F value: num [1:2] 0.336 NA
  ..$ Pr(>F) : num [1:2] 0.998 NA
  ..- attr(*, "heading")= chr [1:2] "Analysis of Variance Table\n" "Response: PC2"

This is how I tried to filter the models based on the p-value:

myp2<-lapply(models,function(x)x$"Pr(>F)")

myp2<-lapply(models,function(x)x$"Pr(>F)" < 0.05,)

This gives me the following error:

Error in FUN(X[[i]], ...) : unused argument (alist())

I know the code is not right.

My question is how to apply the p-value threshold so that only the significant models are kept.

Any suggestion or help would be appreciated.

A subset of my sample data:

structure(list(Mouse.ID = c("DO.0661", "DO.0669", "DO.0670", 
"DO.0673", "DO.0674", "DO.0676", "DO.0677", "DO.0682", "DO.0683", 
"DO.0685", "DO.0686", "DO.0692", "DO.0693", "DO.0696", "DO.0698", 
"DO.0701", "DO.0704", "DO.0709", "DO.0710", "DO.0711"), Sex = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), .Label = c("F", "M"), class = "factor"), fAge = structure(c(2L, 
3L, 2L, 3L, 2L, 2L, 2L, 2L, 3L, 3L, 2L, 3L, 3L, 2L, 2L, 3L, 2L, 
3L, 3L, 2L), .Label = c("6", "12", "18"), class = "factor"), 
    Index = structure(c(21L, 24L, 11L, 20L, 12L, 19L, 20L, 7L, 
    1L, 7L, 6L, 15L, 19L, 23L, 14L, 17L, 8L, 22L, 13L, 12L), .Label = c("AR001", 
    "AR002", "AR003", "AR004", "AR005", "AR006", "AR007", "AR008", 
    "AR009", "AR010", "AR011", "AR012", "AR013", "AR014", "AR015", 
    "AR016", "AR018", "AR019", "AR020", "AR021", "AR022", "AR023", 
    "AR025", "AR027"), class = "factor"), Lane = structure(c(6L, 
    2L, 4L, 5L, 5L, 4L, 8L, 8L, 8L, 4L, 2L, 2L, 1L, 1L, 2L, 3L, 
    7L, 4L, 8L, 1L), .Label = c("1", "2", "3", "4", "5", "6", 
    "7", "8"), class = "factor"), Gen = structure(c(1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L), .Label = c("8", "9", "10", "11", "12"), class = "factor"), 
    PC1 = c(-23.147618298858, -23.004329868562, -17.0024755772689, 
    -23.9178589007844, -56.7766982399411, -34.3969872418573, 
    -27.7082679050298, -34.32038042076, -6.54582754257061, -48.2738527700051, 
    -51.350816410461, -23.1430204310663, -44.8168212771171, -34.9912596308964, 
    -57.2869816005964, -35.9007859727558, -13.396023721849, -70.4151952469644, 
    -3.95389163762967, -35.2820334506896), PC2 = c(40.5243564641241, 
    2.99206119995141, -61.4176842149059, 7.10965422446634, 7.28461966315024, 
    -64.1955797075099, 9.48345862615554, -1.04318789593829, 29.0090598234213, 
    -72.8866334170873, -3.21615600827421, 0.792597778173725, 
    -5.14192513442733, -11.7269589504179, 6.55428703944617, -11.5180102658871, 
    33.3869522894233, -35.1229326772949, 15.996339264987, -11.8901043502155
    ), PC3 = c(-17.0598627155672, -22.1038475592448, -6.25238299099893, 
    23.500307567532, 53.4553992426852, -20.1077749520339, -11.8816581457792, 
    -5.73256447673161, -22.0636009501435, 0.688509203223446, 
    16.5309171320498, -19.983643792547, -9.04327584423542, -2.27657333476154, 
    37.6402580806145, 3.45415683648683, -32.247947130388, 64.7524458379641, 
    -22.9483534394309, -12.2002153235215), PC4 = c(-5.37605681469604, 
    28.8757760174757, 1.96723351126677, 10.1757811517044, 7.63553142427313, 
    -0.61083387825962, -2.14595568269526, 6.96007000414511, -5.55019443290321, 
    10.7590865244751, -10.6766589136731, 2.57313118560919, -3.80955622632714, 
    -3.66495004673328, 21.0056059162486, -6.43937479210278, -9.20567548365632, 
    16.1413805847049, 4.77454270484041, 2.14994000686116), PC5 = c(2.49156058897602, 
    -2.2801673669604, -5.45494631567109, -5.44682692111089, -7.21616736676726, 
    -11.0786655194642, 3.89806778409165, 6.1416402328447, -7.6800051817927, 
    -1.30037456136107, -3.73786692756896, -19.2389148951544, 
    9.07153121652293, -10.2899662479029, 0.579736383131339, -0.0725346819879087, 
    16.3956001897781, -12.6980354901866, 2.24690751602866, 26.4308764499693
    ), PC6 = c(-11.625850369587, 1.54093546690149, -4.87370378395642, 
    -22.0735137415442, -2.44337914021456, 0.619440592140127, 
    10.0537326783752, 4.27431733991133, 13.6314815937122, 4.15399959062463, 
    -10.1029165139482, 3.79816714568195, 11.054055138545, -8.56784129106846, 
    -16.5277734318821, -11.1264688073482, -10.4604427054892, 
    -9.80324924496993, -6.23395120489922, 11.8384546696797), 
    PC7 = c(7.20873385839409, -17.719801994905, -0.811301497692041, 
    7.55418040146638, -4.68437054723712, 1.1158744957288, -15.1982758555559, 
    5.25257260525755, -8.31670233486223, -3.86077542839162, -5.29923744674506, 
    -16.0223534779217, -18.0399629122521, -17.9420689996937, 
    -14.3059444168904, -14.3249976727842, 12.4030641816896, 0.629537064989641, 
    1.01109826318526, -5.35255467845748), PC8 = c(-7.19678837565302, 
    6.24827779166403, 0.224651092284126, 6.10960416152842, -14.6615234719377, 
    -0.410198021192528, 10.3006326038467, 7.37866876496142, -12.4177204278112, 
    -11.712973299024, -2.00299875954171, 3.19952937463445, 8.81158436770453, 
    11.7845383750873, -4.79906390420115, 9.7890992316383, -6.26723664234847, 
    -3.97353277391602, -7.12621186623398, -7.33366271961528)), row.names = c(NA, 
-20L), class = c("tbl_df", "tbl", "data.frame"))

Code to generate the models:

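library(parallel)  # provides mclapply

iv <- c("Sex", "fAge", "Index", "Lane", "Gen")  # independent variables
dv <- paste0('PC', 1:8)                         # dependent variables

# every combination of predictors, joined with ' * '
rhs <- unlist(sapply(1:length(iv), function(m) apply(combn(iv, m = m), 2, paste, collapse = ' * ')))
# cross each response with each right-hand side to build the formula strings
frms <- with(expand.grid(dv, rhs), paste(Var1, Var2, sep = ' ~ '))
frms
# fit each model and keep its ANOVA table
models <- mclapply(frms, function(x) anova(lm(x, data = mrna.pcs)))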

You could probably use:

pvals <- sapply(models,function(x)x[["Pr(>F)"]][1])
models[pvals < 0.05]
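The error in the attempt above most likely comes from the trailing comma after 0.05, which passes an empty extra argument on to the anonymous function. Below is a minimal sketch of the filtering step on a small made-up data set; toy, y1, y2 and g are hypothetical names standing in for mrna.pcs and its columns, not the original data. It wraps the comparison in which() so that models whose p-value is NA are dropped rather than returned as empty entries, and it shows one possible way to treat tables with several terms (such as Sex * fAge), where [1] only looks at the first term.

```r
# Toy data (hypothetical, stands in for mrna.pcs)
toy <- data.frame(
  g     = factor(rep(c("a", "b"), each = 4)),
  noise = rep(c(-1, 1), times = 4)
)
toy$y1 <- ifelse(toy$g == "b", 10, 0) + toy$noise  # clear group effect
toy$y2 <- toy$noise                                 # no group effect

frms <- c("y1 ~ g", "y2 ~ g")

# One ANOVA table per formula, named by its formula string
models <- lapply(frms, function(f) anova(lm(as.formula(f), data = toy)))
names(models) <- frms

# p-value of the first term in each table
pvals <- sapply(models, function(x) x[["Pr(>F)"]][1])

# which() ignores NA, so failed or degenerate fits are simply left out
significant <- models[which(pvals < 0.05)]
names(significant)

# Alternative for multi-term tables (an assumption about what "significant"
# should mean): smallest p-value over all terms, skipping the NA residual row
min_p <- sapply(models, function(x) min(x[["Pr(>F)"]], na.rm = TRUE))
significant_any <- models[which(min_p < 0.05)]
```

With 248 models tested, it may also be worth adjusting the p-values first, for example with p.adjust(pvals, method = "BH"), before applying the 0.05 cutoff.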