How do I count treated and untreated in R?
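I am trying to learn R again and trying to count the total number of genes that are "treated" and "untreated" with dex in the Bioconductor airway dataset (https://bioconductor.org/packages/release/data/experiment/html/airway.html).
I am trying: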
airway$dex=='trted'
#[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
It doesn't work.
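Use the sum() function to count the TRUE values. Note that the factor level is spelled "trt", not 'trted', so the comparison has to use "trt":
sum(airway$dex=='trt')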
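As a minimal sketch of why the comparison against 'trted' comes back all FALSE, the snippet below rebuilds the dex factor by hand from the levels and codes reported in the str() output in this thread. The vector `dex` is a stand-in, not the real object; with the package loaded and data(airway) run, `airway$dex` would take its place. The column has one entry per sample (8 in total), so these counts are of samples rather than genes.

```r
# Stand-in for airway$dex, rebuilt from what str() reports:
# levels "trt","untrt" and codes 2 1 2 1 2 1 2 1. Not the real object.
dex <- factor(c("untrt", "trt", "untrt", "trt", "untrt", "trt", "untrt", "trt"),
              levels = c("trt", "untrt"))

# Check how the levels are spelled before comparing against them
print(levels(dex))

# 'trted' is not a level, so every comparison is FALSE and the sum is 0
n_misspelled <- sum(dex == "trted")

# Comparing against the actual level counts the treated samples
n_treated <- sum(dex == "trt")

# table() tallies every level in a single call
dex_counts <- table(dex)
print(dex_counts)
```

Calling table() first is a cheap way to see both the spelling of the levels and their counts before writing any comparison.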
After installing the package, I did the following at my console (including all output):
> library(airway)
Loading required package: SummarizedExperiment
Loading required package: MatrixGenerics
Loading required package: matrixStats
Attaching package: ‘matrixStats’
The following object is masked from ‘package:dplyr’:
count
Attaching package: ‘MatrixGenerics’
The following objects are masked from ‘package:matrixStats’:
colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse, colCounts, colCummaxs, colCummins,
colCumprods, colCumsums, colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs, colMads, colMaxs,
colMeans2, colMedians, colMins, colOrderStats, colProds, colQuantiles, colRanges, colRanks, colSdDiffs,
colSds, colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads, colWeightedMeans,
colWeightedMedians, colWeightedSds, colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods, rowCumsums, rowDiffs, rowIQRDiffs,
rowIQRs, rowLogSumExps, rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins, rowOrderStats,
rowProds, rowQuantiles, rowRanges, rowRanks, rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs,
rowVars, rowWeightedMads, rowWeightedMeans, rowWeightedMedians, rowWeightedSds, rowWeightedVars
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply,
parCapply, parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from ‘package:bit64’:
match, order, rank
The following objects are masked from ‘package:dplyr’:
combine, intersect, setdiff, union
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, basename, cbind, colnames, dirname, do.call, duplicated, eval,
evalq, Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget, order,
paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
table, tapply, union, unique, unsplit, which.max, which.min
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following object is masked from ‘package:Matrix’:
expand
The following objects are masked from ‘package:data.table’:
first, second
The following objects are masked from ‘package:tidygraph’:
active, rename
The following object is masked from ‘package:tidyr’:
expand
The following objects are masked from ‘package:dplyr’:
first, rename
The following object is masked from ‘package:base’:
expand.grid
Loading required package: IRanges
Attaching package: ‘IRanges’
The following object is masked from ‘package:data.table’:
shift
The following object is masked from ‘package:nlme’:
collapse
The following object is masked from ‘package:tidygraph’:
slice
The following object is masked from ‘package:purrr’:
reduce
The following objects are masked from ‘package:dplyr’:
collapse, desc, slice
Loading required package: GenomeInfoDb
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Attaching package: ‘Biobase’
The following object is masked from ‘package:MatrixGenerics’:
rowMedians
The following objects are masked from ‘package:matrixStats’:
anyMissing, rowMedians
The following object is masked from ‘package:bit64’:
cache
Attaching package: ‘SummarizedExperiment’
The following object is masked from ‘package:SeuratObject’:
Assays
The following object is masked from ‘package:Seurat’:
Assays
I looked at the help page:
> help(pac=airway)
After reading it I thought the airway dataset might already be accessible, but it was not:
> str(airway)
Error in str(airway) : object 'airway' not found
So I tried loading it with the data function (no error was reported) and then looked at its structure:
> data(airway)
> str(airway)
Formal class 'RangedSummarizedExperiment' [package "SummarizedExperiment"] with 6 slots
..@ rowRanges :Formal class 'GRangesList' [package "GenomicRanges"] with 3 slots
.. .. ..@ elementMetadata:Formal class 'DataFrame' [package "IRanges"] with 6 slots
.. .. .. .. ..@ rownames : NULL
.. .. .. .. ..@ nrows : int 64102
.. .. .. .. ..@ listData : Named list()
.. .. .. .. ..@ elementType : chr "ANY"
.. .. .. .. ..@ elementMetadata: NULL
.. .. .. .. ..@ metadata : list()
.. .. ..@ elementType : chr "GRanges"
.. .. ..@ metadata :List of 1
.. .. .. ..$ genomeInfo:List of 20
.. .. .. .. ..$ Db type : chr "TranscriptDb"
.. .. .. .. ..$ Supporting package : chr "GenomicFeatures"
.. .. .. .. ..$ Data source : chr "BioMart"
.. .. .. .. ..$ Organism : chr "Homo sapiens"
.. .. .. .. ..$ Resource URL : chr "www.biomart.org:80"
.. .. .. .. ..$ BioMart database : chr "ensembl"
.. .. .. .. ..$ BioMart database version : chr "ENSEMBL GENES 75 (SANGER UK)"
.. .. .. .. ..$ BioMart dataset : chr "hsapiens_gene_ensembl"
.. .. .. .. ..$ BioMart dataset description : chr "Homo sapiens genes (GRCh37.p13)"
.. .. .. .. ..$ BioMart dataset version : chr "GRCh37.p13"
.. .. .. .. ..$ Full dataset : chr "yes"
.. .. .. .. ..$ miRBase build ID : chr NA
.. .. .. .. ..$ transcript_nrow : chr "215647"
.. .. .. .. ..$ exon_nrow : chr "745593"
.. .. .. .. ..$ cds_nrow : chr "537555"
.. .. .. .. ..$ Db created by : chr "GenomicFeatures package from Bioconductor"
.. .. .. .. ..$ Creation time : chr "2014-07-10 14:55:55 -0400 (Thu, 10 Jul 2014)"
.. .. .. .. ..$ GenomicFeatures version at creation time: chr "1.17.9"
.. .. .. .. ..$ RSQLite version at creation time : chr "0.11.4"
.. .. .. .. ..$ DBSCHEMAVERSION : chr "1.0"
..@ colData :Formal class 'DataFrame' [package "IRanges"] with 6 slots
.. .. ..@ rownames : chr [1:8] "SRR1039508" "SRR1039509" "SRR1039512" "SRR1039513" ...
.. .. ..@ nrows : int 8
.. .. ..@ listData :List of 9
.. .. .. ..$ SampleName: Factor w/ 8 levels "GSM1275862","GSM1275863",..: 1 2 3 4 5 6 7 8
.. .. .. ..$ cell : Factor w/ 4 levels "N052611","N061011",..: 4 4 1 1 3 3 2 2
.. .. .. ..$ dex : Factor w/ 2 levels "trt","untrt": 2 1 2 1 2 1 2 1
.. .. .. ..$ albut : Factor w/ 1 level "untrt": 1 1 1 1 1 1 1 1
.. .. .. ..$ Run : Factor w/ 8 levels "SRR1039508","SRR1039509",..: 1 2 3 4 5 6 7 8
.. .. .. ..$ avgLength : int [1:8] 126 126 126 87 120 126 101 98
.. .. .. ..$ Experiment: Factor w/ 8 levels "SRX384345","SRX384346",..: 1 2 3 4 5 6 7 8
.. .. .. ..$ Sample : Factor w/ 8 levels "SRS508567","SRS508568",..: 2 1 3 4 5 6 7 8
.. .. .. ..$ BioSample : Factor w/ 8 levels "SAMN02422669",..: 1 4 6 2 7 3 8 5
.. .. ..@ elementType : chr "ANY"
.. .. ..@ elementMetadata: NULL
.. .. ..@ metadata : list()
..@ assays :Reference class 'ShallowSimpleListAssays' [package "GenomicRanges"] with 1 field
.. ..$ data:Formal class 'SimpleList' [package "IRanges"] with 4 slots
.. .. .. ..@ listData :List of 1
.. .. .. .. ..$ counts: int [1:64102, 1:8] 679 0 467 260 60 0 3251 1433 519 394 ...
.. .. .. ..@ elementType : chr "ANY"
.. .. .. ..@ elementMetadata: NULL
.. .. .. ..@ metadata : list()
.. ..and 12 methods.
..@ NAMES : NULL
..@ elementMetadata:Formal class 'DataFrame' [package "S4Vectors"] with 6 slots
.. .. ..@ rownames : NULL
.. .. ..@ nrows : int 64102
.. .. ..@ listData : Named list()
.. .. ..@ elementType : chr "ANY"
.. .. ..@ elementMetadata: NULL
.. .. ..@ metadata : list()
..@ metadata :List of 1
.. ..$ :Formal class 'MIAME' [package "Biobase"] with 13 slots
.. .. .. ..@ name : chr "Himes BE"
.. .. .. ..@ lab : chr NA
.. .. .. ..@ contact : chr ""
.. .. .. ..@ title : chr "RNA-Seq transcriptome profiling identifies CRISPLD2 as a glucocorticoid responsive gene that modulates cytokine"| __truncated__
.. .. .. ..@ abstract : chr "Asthma is a chronic inflammatory respiratory disease that affects over 300 million people worldwide. Glucocorti"| __truncated__
.. .. .. ..@ url : chr "http://www.ncbi.nlm.nih.gov/pubmed/24926665"
.. .. .. ..@ pubMedIds : chr "24926665"
.. .. .. ..@ samples : list()
.. .. .. ..@ hybridizations : list()
.. .. .. ..@ normControls : list()
.. .. .. ..@ preprocessing : list()
.. .. .. ..@ other : list()
.. .. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
.. .. .. .. .. ..@ .Data:List of 2
.. .. .. .. .. .. ..$ : int [1:3] 1 0 0
.. .. .. .. .. .. ..$ : int [1:3] 1 1 0
Scanning the listing of the S4 structure, I saw this line:
.. .. .. ..$ dex : Factor w/ 2 levels "trt","untrt": 2 1 2 1 2 1 2 1
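So the dex item does have "trt" and "untrt" as values, but that "column" sits deeper inside the overall RangedSummarizedExperiment structure. There is probably a dedicated function, whose name I don't know, for extracting values from such a structure, but we now have enough information to answer (or hack) the question. Follow the names and operators in that nested listing back to the top-level object, using the S4 slot operator "@" wherever the listing shows "@" and $ wherever it shows "$":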
sum( airway@ colData @ listData $ dex == "trt")
#[1] 4
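The "@" versus "$" distinction can be illustrated with a small mock-up. The classes `Inner` and `Outer` below are hypothetical stand-ins that only mimic the nesting seen in the str() output; they are not the Bioconductor classes. For the real object, the SummarizedExperiment package provides the accessor colData(), so `colData(airway)$dex` should reach the same column without depending on slot names.

```r
library(methods)  # S4 machinery; attached by default in interactive sessions

# Hypothetical classes that mirror the nesting only: slot -> slot -> list element
setClass("Inner", representation(listData = "list"))
setClass("Outer", representation(colData = "Inner"))

# Stand-in for the dex column (levels and order taken from the str() output)
dex <- factor(rep(c("untrt", "trt"), times = 4), levels = c("trt", "untrt"))
obj <- new("Outer", colData = new("Inner", listData = list(dex = dex)))

# "@" extracts S4 slots; "$" extracts a named element of the ordinary list
n_treated   <- sum(obj@colData@listData$dex == "trt")
n_untreated <- sum(obj@colData@listData$dex == "untrt")
print(c(treated = n_treated, untreated = n_untreated))
```

Reaching into slots directly works, but it ties the code to the internal layout of the class, which is why an accessor is usually the safer choice when one exists.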