How do I change object types within an S4 object?

I am trying to run a miRNA correlation analysis with the R package anamiR (via RStudio). The script I am using is:

library(anamiR)

# Expression tables: read each CSV, then convert it to a matrix
mrna1 <- read.csv("D:\\file1.csv", row.names = 1, header = TRUE)
mrna <- as.matrix(mrna1)
rm(mrna1)
mirna1 <- read.csv("D:\\file2.csv", row.names = 1, header = TRUE)
mirna <- as.matrix(mirna1)
rm(mirna1)
# Phenotype tables (also converted to matrices here)
pheno.mirna1 <- read.csv("D:\\file3.csv", row.names = 1, header = TRUE)
pheno.mirna <- as.matrix(pheno.mirna1)
rm(pheno.mirna1)
pheno.mrna1 <- read.csv("D:\\file4.csv", row.names = 1, header = TRUE)
pheno.mrna <- as.matrix(pheno.mrna1)
rm(pheno.mrna1)

mrna_se <- SummarizedExperiment::SummarizedExperiment(
  assays = S4Vectors::SimpleList(counts=mrna),
  colData = pheno.mrna)

mirna_se <- SummarizedExperiment::SummarizedExperiment(
  assays = S4Vectors::SimpleList(counts=mirna),
  colData = pheno.mirna)

mrna_d <- differExp_discrete(se = mrna_se,
                             class = "ER", method = "DESeq",
                             t_test.var = FALSE, log2 = FALSE,
                             p_value.cutoff = 0.05,  logratio = 0.5
)

mirna_d <- differExp_discrete(se = mirna_se,
                              class = "ER", method = "DESeq",
                              t_test.var = FALSE, log2 = FALSE,
                              p_value.cutoff = 0.05,  logratio = 0.5
)

When I reach this point (this is the code that produces the error):

mrna_d <- differExp_discrete(se = mrna_se,
                             class = "ER", method = "DESeq",
                             t_test.var = FALSE, log2 = FALSE,
                             p_value.cutoff = 0.05,  logratio = 0.5
)

mirna_d <- differExp_discrete(se = mirna_se,
                              class = "ER", method = "DESeq",
                              t_test.var = FALSE, log2 = FALSE,
                              p_value.cutoff = 0.05,  logratio = 0.5
)

I get:

Error in model.matrix.formula(design(object), colData(object)) : 
  data must be a data.frame
In addition: Warning message:
In DESeq2::DESeqDataSet(se, design = tmp) :
  some variables in design formula are characters, converting to factors

My session info is:

> sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252    LC_MONETARY=English_Australia.1252
[4] LC_NUMERIC=C                       LC_TIME=English_Australia.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] anamiR_1.13.0

loaded via a namespace (and not attached):
  [1] backports_1.2.1             Hmisc_4.5-0                 BiocFileCache_1.10.2        plyr_1.8.6                 
  [5] splines_3.6.0               BiocParallel_1.20.1         AlgDesign_1.2.0             GenomeInfoDb_1.22.1        
  [9] ggplot2_3.3.3               digest_0.6.27               foreach_1.5.1               htmltools_0.5.1.1          
 [13] fansi_0.4.2                 magrittr_2.0.1              checkmate_2.0.0             memoise_2.0.0              
 [17] cluster_2.1.1               limma_3.42.2                readr_1.4.0                 Biostrings_2.54.0          
 [21] annotate_1.64.0             matrixStats_0.58.0          askpass_1.1                 siggenes_1.60.0            
 [25] prettyunits_1.1.1           jpeg_0.1-8.1                colorspace_2.0-0            rappdirs_0.3.3             
 [29] blob_1.2.1                  haven_2.3.1                 xfun_0.22                   dplyr_1.0.5                
 [33] crayon_1.4.1                RCurl_1.98-1.3              graph_1.64.0                genefilter_1.68.0          
 [37] GEOquery_2.54.1             survival_3.2-10             iterators_1.0.13            glue_1.4.2                 
 [41] gtable_0.3.0                lumi_2.38.0                 zlibbioc_1.32.0             XVector_0.26.0             
 [45] DelayedArray_0.12.3         questionr_0.7.4             Rhdf5lib_1.8.0              BiocGenerics_0.32.0        
 [49] HDF5Array_1.14.4            scales_1.1.1                rngtools_1.5                DBI_1.1.1                  
 [53] miniUI_0.1.1.1              Rcpp_1.0.6                  progress_1.2.2              xtable_1.8-4               
 [57] htmlTable_2.1.0             gage_2.36.0                 bumphunter_1.28.0           foreign_0.8-71             
 [61] bit_4.0.4                   mclust_5.4.7                preprocessCore_1.48.0       Formula_1.2-4              
 [65] stats4_3.6.0                htmlwidgets_1.5.3           httr_1.4.2                  gplots_3.1.1               
 [69] RColorBrewer_1.1-2          ellipsis_0.3.1              pkgconfig_2.0.3             reshape_0.8.8              
 [73] XML_3.99-0.3                dbplyr_2.1.0                nnet_7.3-15                 locfit_1.5-9.4             
 [77] utf8_1.2.1                  tidyselect_1.1.0            rlang_0.4.10                later_1.1.0.1              
 [81] AnnotationDbi_1.48.0        munsell_0.5.0               tools_3.6.0                 cachem_1.0.4               
 [85] generics_0.1.0              RSQLite_2.2.5               stringr_1.4.0               fastmap_1.1.0              
 [89] knitr_1.31                  bit64_4.0.5                 beanplot_1.2                caTools_1.18.2             
 [93] methylumi_2.32.0            scrime_1.3.5                purrr_0.3.4                 KEGGREST_1.26.1            
 [97] doRNG_1.8.2                 nlme_3.1-152                mime_0.10                   nor1mix_1.3-0              
[101] xml2_1.3.2                  biomaRt_2.42.1              compiler_3.6.0              rstudioapi_0.13            
[105] curl_4.3                    png_0.1-7                   affyio_1.56.0               klaR_0.6-15                
[109] tibble_3.1.0                geneplotter_1.64.0          stringi_1.5.3               highr_0.8                  
[113] GenomicFeatures_1.38.2      minfi_1.32.0                forcats_0.5.1               lattice_0.20-41            
[117] Matrix_1.3-2                multtest_2.42.0             vctrs_0.3.7                 pillar_1.5.1               
[121] lifecycle_1.0.0             BiocManager_1.30.12         combinat_0.0-8              data.table_1.14.0          
[125] bitops_1.0-6                rtracklayer_1.46.0          httpuv_1.5.5                agricolae_1.3-3            
[129] GenomicRanges_1.38.0        affy_1.64.0                 R6_2.5.0                    latticeExtra_0.6-29        
[133] RMySQL_0.10.21              promises_1.2.0.1            KernSmooth_2.23-18          gridExtra_2.3              
[137] nleqslv_3.3.2               IRanges_2.20.2              codetools_0.2-18            MASS_7.3-53.1              
[141] gtools_3.8.2                assertthat_0.2.1            rhdf5_2.30.1                Sum
[145] openssl_1.4.3               DESeq2_1.26.0               GenomicAlignments_1.22.1    Rsamtools_2.2.3            
[149] S4Vectors_0.24.4            GenomeInfoDbData_1.2.2      mgcv_1.8-34                 parallel_3.6.0             
[153] hms_1.0.0                   quadprog_1.5-8              grid_3.6.0                  rpart_4.1-15               
[157] labelled_2.8.0              tidyr_1.1.3                 base64_2.0                  DelayedMatrixStats_1.8.0   
[161] illuminaio_0.28.0           Biobase_2.46.0              shiny_1.6.0                 base64enc_0.1-3        

I can change the R version, but that really does not help. I have traced the problem to mrna_se@colData and mirna_se@colData, neither of which is a data frame:

> is.data.frame(mirna_se@colData)
[1] FALSE
> is.data.frame(mrna_se@colData)
[1] FALSE

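So how do I convert these objects inside the enclosing S4 object to data frames, so that DESeq2 can use them to generate the differential expression data? This is driving me crazy.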
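To make the question concrete, here is a minimal sketch of the kind of conversion I mean, using a made-up phenotype table. The sample names and the `case` level are hypothetical; only the column name `ER` and the level `control` come from my data. As far as I understand, a SummarizedExperiment stores colData as an S4Vectors DataFrame rather than a base data.frame, which would explain the FALSE results above. The S4Vectors part only runs if that package is installed.

```r
# Hypothetical phenotype table; only 'ER' and 'control' are from the real data.
pheno <- data.frame(
  ER = c("control", "control", "case", "case"),
  row.names = c("s1", "s2", "s3", "s4"),
  stringsAsFactors = FALSE
)

# as.matrix() turns the table into a character matrix,
# which is no longer a data.frame.
pheno_mat <- as.matrix(pheno)
is_df_matrix <- is.data.frame(pheno_mat)

# as.data.frame() converts the matrix back into a data.frame.
pheno_back <- as.data.frame(pheno_mat, stringsAsFactors = FALSE)
is_df_back <- is.data.frame(pheno_back)

print(c(matrix = is_df_matrix, converted_back = is_df_back))

# The same check on an S4Vectors DataFrame, the class used for colData.
if (requireNamespace("S4Vectors", quietly = TRUE)) {
  dframe <- S4Vectors::DataFrame(pheno)
  print(is.data.frame(dframe))
  # Conversion through the BiocGenerics generic that S4Vectors extends.
  print(is.data.frame(BiocGenerics::as.data.frame(dframe)))
}
```

This only shows the conversion on a standalone object. Whether differExp_discrete uses a converted colData internally is a separate matter that the sketch does not address.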

Also, before anyone asks:

> packageVersion("DESeq2")
[1] ‘1.26.0’

In response to a comment, I changed the code as follows, but got the error below.

mrna_se <- SummarizedExperiment::SummarizedExperiment(
  assays = S4Vectors::SimpleList(counts=mrna),
  colData = as.data.frame(pheno.mrna))
  it appears that the last variable in the design formula, 'ER',
  has a factor level, 'control', which is not the reference level. we recommend
  to use factor(...,levels=...) or relevel() to set this as the reference level
  before proceeding. for more information, please see the 'Note on factor levels'
  in vignette('DESeq2').
Error in model.matrix.formula(design(object), colData(object)) : 
  data must be a data.frame
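The DESeq2 message above recommends `factor(..., levels = ...)` or `relevel()`. A minimal sketch of both options follows, on a made-up phenotype table in which `case` is a hypothetical second level that sorts before `control`. It only demonstrates the level ordering; I do not know whether differExp_discrete keeps this ordering when it builds its own DESeq2 object.

```r
# Hypothetical phenotype table; 'case' is a made-up level name.
pheno <- data.frame(
  ER = c("case", "control", "case", "control"),
  row.names = c("s1", "s2", "s3", "s4"),
  stringsAsFactors = FALSE
)

# By default levels are sorted alphabetically, so 'case' would come first.
default_levels <- levels(factor(pheno$ER))

# Option 1: give the level order explicitly so that 'control'
# is the first (reference) level.
pheno$ER <- factor(pheno$ER, levels = c("control", "case"))

# Option 2: relevel a factor that already exists.
er_alpha <- factor(c("case", "control", "case", "control"))
er_fixed <- relevel(er_alpha, ref = "control")

print(default_levels)
print(levels(pheno$ER))
print(levels(er_fixed))
```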

Further edit:

If you just read the CSV files without converting them to matrices, you get:

> library(anamiR)
> 
> mrna = read.csv("D:\\file1.csv", row.names = 1, header= TRUE)
> mirna = read.csv("D:\\file2.csv", row.names = 1, header= TRUE)
> pheno.mirna = read.csv("D:\\file3.csv", row.names = 1, header= TRUE)
> pheno.mrna = read.csv("D:\\file4.csv", row.names = 1, header= TRUE)
> 
> mrna_se <- SummarizedExperiment::SummarizedExperiment(
+   assays = S4Vectors::SimpleList(counts=mrna),
+   colData = as.data.frame(pheno.mrna))
Error in all_dims[, 1L] : incorrect number of dimensions
> 
> mirna_se <- SummarizedExperiment::SummarizedExperiment(
+   assays = S4Vectors::SimpleList(counts=mirna),
+   colData = pheno.mirna)
Error in all_dims[, 1L] : incorrect number of dimensions
> 
> mrna_d <- differExp_discrete(se = mrna_se,
+                              class = "ER", method = "DESeq",
+                              t_test.var = FALSE, log2 = FALSE,
+                              p_value.cutoff = 0.05,  logratio = 0.5
+ )
Error in SummarizedExperiment::assays(se) : object 'mrna_se' not found
> 
> mirna_d <- differExp_discrete(se = mirna_se,
+                               class = "ER", method = "DESeq",
+                               t_test.var = FALSE, log2 = FALSE,
+                               p_value.cutoff = 0.05,  logratio = 0.5
+ )
Error in SummarizedExperiment::assays(se) : object 'mirna_se' not found
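The `all_dims` error appears to come from passing a data frame, rather than a matrix, as the assay. A minimal base R sketch of the difference, using a made-up counts table in place of the `read.csv()` output (gene and sample names are hypothetical):

```r
# Hypothetical counts table standing in for the read.csv() result.
counts_df <- data.frame(
  s1 = c(10L, 0L, 5L),
  s2 = c(12L, 3L, 7L),
  row.names = c("geneA", "geneB", "geneC")
)

# A data.frame is a list of columns, not a matrix.
df_is_matrix <- is.matrix(counts_df)

# as.matrix() gives a matrix with the same row and column names.
counts_mat <- as.matrix(counts_df)
mat_is_matrix <- is.matrix(counts_mat)

print(c(data_frame = df_is_matrix, converted = mat_is_matrix))
print(dim(counts_mat))
print(storage.mode(counts_mat))
```

This is why the assay tables go through `as.matrix()` while the phenotype tables are better left as data frames.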

The traceback for my original error is below (one file uses the suggested as.data.frame fix, the other uses my original matrix loading):

> library(anamiR)
> 
> mrna1 = read.csv("D:\\file1.csv.csv", row.names = 1, header= TRUE)
> mrna <- as.matrix(mrna1)
> rm(mrna1)
> mirna1 = read.csv("D:\\file2.csv.csv", row.names = 1, header= TRUE)
> mirna <- as.matrix(mirna1)
> rm(mirna1)
> pheno.mirna1 = read.csv("D:\\file3.csv.csv", row.names = 1, header= TRUE)
> pheno.mirna <- as.matrix(pheno.mirna1)
> rm(pheno.mirna1)
> pheno.mrna1 = read.csv("D:\\file4.csv.csv", row.names = 1, header= TRUE)
> pheno.mrna <- as.matrix(pheno.mrna1)
> rm(pheno.mrna1)
> 
> mrna_se <- SummarizedExperiment::SummarizedExperiment(
+   assays = S4Vectors::SimpleList(counts=mrna),
+   colData = as.data.frame(pheno.mrna))
> 
> mirna_se <- SummarizedExperiment::SummarizedExperiment(
+   assays = S4Vectors::SimpleList(counts=mirna),
+   colData = pheno.mirna)
> 
> mrna_d <- differExp_discrete(se = mrna_se,
+                              class = "ER", method = "DESeq",
+                              t_test.var = FALSE, log2 = FALSE,
+                              p_value.cutoff = 0.05,  logratio = 0.5
+ )
  it appears that the last variable in the design formula, 'ER',
  has a factor level, 'control', which is not the reference level. we recommend
  to use factor(...,levels=...) or relevel() to set this as the reference level
  before proceeding. for more information, please see the 'Note on factor levels'
  in vignette('DESeq2').
Error in model.matrix.formula(design(object), colData(object)) : 
  data must be a data.frame
> 
> traceback()
6: stop("data must be a data.frame")
5: model.matrix.formula(design(object), colData(object))
4: stats::model.matrix(design(object), colData(object))
3: designAndArgChecker(object, betaPrior)
2: DESeq2::DESeq(dds)
1: differExp_discrete(se = mrna_se, class = "ER", method = "DESeq", 
       t_test.var = FALSE, log2 = FALSE, p_value.cutoff = 0.05, 
       logratio = 0.5)
> 
> mirna_d <- differExp_discrete(se = mirna_se,
+                               class = "ER", method = "DESeq",
+                               t_test.var = FALSE, log2 = FALSE,
+                               p_value.cutoff = 0.05,  logratio = 0.5
+ )
Error in model.matrix.formula(design(object), colData(object)) : 
  data must be a data.frame
In addition: Warning message:
In DESeq2::DESeqDataSet(se, design = tmp) :
  some variables in design formula are characters, converting to factors
> 
> traceback()
6: stop("data must be a data.frame")
5: model.matrix.formula(design(object), colData(object))
4: stats::model.matrix(design(object), colData(object))
3: designAndArgChecker(object, betaPrior)
2: DESeq2::DESeq(dds)
1: differExp_discrete(se = mirna_se, class = "ER", method = "DESeq", 
       t_test.var = FALSE, log2 = FALSE, p_value.cutoff = 0.05, 
       logratio = 0.5)

The new traceback is:

> library(anamiR)
> 
> mrna = read.csv("D:\\file1.csv", row.names = 1, header= TRUE)
> mirna = read.csv("D:\\file2.csv", row.names = 1, header= TRUE)
> pheno.mirna = read.csv("D:\\file3.csv", row.names = 1, header= TRUE)
> pheno.mrna = read.csv("D:\\file4.csv", row.names = 1, header= TRUE)
> 
> mrna_se <- SummarizedExperiment::SummarizedExperiment(
+   assays = S4Vectors::SimpleList(counts=mrna),
+   colData = as.data.frame(pheno.mrna))
Error in all_dims[, 1L] : incorrect number of dimensions
> traceback()
7: method(object)
6: validityMethod(as(object, superClass))
5: isTRUE(x)
4: anyStrings(validityMethod(as(object, superClass)))
3: validObject(ans)
2: Assays(assays)
1: SummarizedExperiment::SummarizedExperiment(assays = S4Vectors::SimpleList(counts = mrna), 
       colData = as.data.frame(pheno.mrna))
> 
> mirna_se <- SummarizedExperiment::SummarizedExperiment(
+   assays = S4Vectors::SimpleList(counts=mirna),
+   colData = pheno.mirna)
Error in all_dims[, 1L] : incorrect number of dimensions
> traceback()
7: method(object)
6: validityMethod(as(object, superClass))
5: isTRUE(x)
4: anyStrings(validityMethod(as(object, superClass)))
3: validObject(ans)
2: Assays(assays)
1: SummarizedExperiment::SummarizedExperiment(assays = S4Vectors::SimpleList(counts = mirna), 
       colData = pheno.mirna)

Latest edit (12/4/21)

So, in response to the comments, I now load my data files as follows:

mrna <- as.matrix(read.csv("D:\\CorrelationDataProcessing\\TRAMP30w\\mrnaTRAMP_Mut30w_v_WT30w_normcounts.csv", row.names = 1, header= TRUE))
mirna <- as.matrix(read.csv("D:\\CorrelationDataProcessing\\TRAMP30w\\mirnaTRAMP_Mut30w_vs_WT30w_normcounts.csv", row.names = 1, header= TRUE))
pheno.mirna = read.csv("D:\\CorrelationDataProcessing\\TRAMP30w\\mirnapheno.csv", row.names = 1, header= TRUE)
pheno.mrna = read.csv("D:\\CorrelationDataProcessing\\TRAMP30w\\mrnapheno.csv", row.names = 1, header= TRUE)

This results in:

> mrna_d <- differExp_discrete(se = mrna_se,
+                              class = "ER", method = "DESeq",
+                              t_test.var = FALSE, log2 = FALSE,
+                              p_value.cutoff = 0.05,  logratio = 0.5
+ )
  it appears that the last variable in the design formula, 'ER',
  has a factor level, 'control', which is not the reference level. we recommend
  to use factor(...,levels=...) or relevel() to set this as the reference level
  before proceeding. for more information, please see the 'Note on factor levels'
  in vignette('DESeq2').
Error in model.matrix.formula(design(object), colData(object)) : 
  data must be a data.frame
> 
> mirna_d <- differExp_discrete(se = mirna_se,
+                               class = "ER", method = "DESeq",
+                               t_test.var = FALSE, log2 = FALSE,
+                               p_value.cutoff = 0.05,  logratio = 0.5
+ )
  it appears that the last variable in the design formula, 'ER',
  has a factor level, 'control', which is not the reference level. we recommend
  to use factor(...,levels=...) or relevel() to set this as the reference level
  before proceeding. for more information, please see the 'Note on factor levels'
  in vignette('DESeq2').
Error in model.matrix.formula(design(object), colData(object)) : 
  data must be a data.frame

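Unfortunately, this package appears to be broken, and it is deprecated anyway. At this point, I would advise anyone who wants to do miRNA correlation to skip both anamiR and miRComb.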