Is there any way to coerce a "list" to an S4 "List"?

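Is there a way to coerce a plain list-like object to an S4 "List" object? I need to vectorize some operations on my data. I use a nested lapply in my function, and I checked that its return type is "list". I want a "List"-like object instead. How can I do that? Thanks.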

Here is a reproducible example to clarify the problem:

Data

    library(GenomicRanges)  # provides GRanges and GRangesList; attaches IRanges and S4Vectors

    foo <- GRanges(
      seqnames = Rle(c("chr1", "chr2", "chr3", "chr4"), c(3, 2, 1, 2)),
      ranges = IRanges(seq(1, by = 9, length.out = 8), seq(7, by = 9, length.out = 8)),
      rangeName = letters[1:8], score = sample(1:20, 8, replace = FALSE))

    bar <- GRanges(
      seqnames = Rle(c("chr1", "chr2", "chr3", "chr4"), c(4, 3, 1, 1)),
      ranges = IRanges(seq(2, by = 5, length.out = 9), seq(4, by = 5, length.out = 9)),
      rangeName = letters[1:9], score = sample(1:20, 9, replace = FALSE))

    moo <- GRanges(
      seqnames = Rle(c("chr1", "chr2", "chr3", "chr4"), c(3, 4, 2, 1)),
      ranges = IRanges(seq(5, by = 7, length.out = 10), seq(8, by = 7, length.out = 10)),
      rangeName = letters[1:10], score = sample(1:20, 10, replace = FALSE))

Overlap hit indices

    grl <- GRangesList(bar, moo)
    # For each GRanges in grl, list the indices of its ranges that overlap each range in foo
    res <- lapply(grl, function(ele_) {
      as(findOverlaps(foo, ele_), "List")
    })

Interpretation of the overlapping regions (the first list element corresponds to bar):

    [[1]]
    IntegerList of length 8
    [[1]] 1 2    # 1st region from foo overlaps the 1st and 2nd regions from bar
    [[2]] 3
    [[3]] 4
    [[4]] 6 7    # 4th region from foo overlaps the 6th and 7th regions from bar

The objective is to keep only one hit per region (that is, to remove multiple intersecting regions), like this:

    [[1]]
    IntegerList of length 8
    [[1]] 2      # keep only the 2nd region from bar
    [[2]] 3
    [[3]] 4
    [[4]] 6      # keep only the 6th region from bar

Removing duplicate regions

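    # For each set of hits, keep the index of the highest-scoring overlapping region
    obj.ov <- lapply(res, function(ele_) {
      re <- lapply(grl, function(obj) {
        id0 <- as(which.max(extractList(obj$score, ele_)), "List")
        id0[!is.na(id0)]  # drop regions of foo that have no overlap
      })
      re[!duplicated(re)]
    })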

Further steps

    as.obj.ov <- as(obj.ov, "List")  # if this coercion is wrong, the result cannot be expanded the way obj.ov can

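So as.obj.ov must be expandable into hit-index vectors just as obj.ov is, and it must also be an S4 "List" object.

I need obj.ov as an S4 "List" object. Is this kind of coercion possible in R?

Thanks for any possible approach, solution, or idea.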
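On the literal coercion question, a minimal sketch may help. It assumes the nested-lapply result boils down to a base list of integer vectors; `hits` is a hypothetical stand-in for that result, not the obj.ov above. The `IntegerList()` and `SimpleList()` constructors from IRanges/S4Vectors both build objects that inherit from the S4 "List" class:

```r
library(IRanges)  # also attaches S4Vectors

# Hypothetical stand-in for a lapply() result: a base list of integer vectors
hits <- list(a = c(1L, 2L), b = 3L, c = integer(0))

il <- IntegerList(hits)  # typed S4 List whose elements are integer vectors
sl <- SimpleList(hits)   # generic S4 List container holding the same elements

is(il, "List")           # TRUE
is(sl, "List")           # TRUE

# Flatten back to a plain vector of hit indices
flat <- unlist(il, use.names = FALSE)
flat                     # 1 2 3
```

For a nested result such as a list of lists, the same constructors would presumably have to be applied level by level, for example inside an inner lapply.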

To keep a single overlapping region per range, we can use select = "first", which returns only the first match.

    lapply(grl, function(ele_) {
      ix <- findOverlaps(foo, ele_, select = "first")
      ele_[ix[!is.na(ix)]]
    })

    [[1]]
    GRanges object with 4 ranges and 2 metadata columns:
          seqnames    ranges strand |   rangeName     score
             <Rle> <IRanges>  <Rle> | <character> <integer>
      [1]     chr1  [ 2,  4]      * |           a        18
      [2]     chr1  [12, 14]      * |           c         2
      [3]     chr1  [17, 19]      * |           d        19
      [4]     chr2  [27, 29]      * |           f        15
      -------
      seqinfo: 4 sequences from an unspecified genome; no seqlengths

    [[2]]
    GRanges object with 6 ranges and 2 metadata columns:
          seqnames    ranges strand |   rangeName     score
             <Rle> <IRanges>  <Rle> | <character> <integer>
      [1]     chr1  [ 5,  8]      * |           a        11
      [2]     chr1  [12, 15]      * |           b        13
      [3]     chr1  [19, 22]      * |           c        14
      [4]     chr2  [26, 29]      * |           d        20
      [5]     chr2  [40, 43]      * |           f         8
      [6]     chr4  [68, 71]      * |           j         1
      -------
      seqinfo: 4 sequences from an unspecified genome; no seqlengths
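
To see what select = "first" returns on its own, here is a small sketch on plain IRanges objects. The `query` and `subject` ranges are made up for illustration and are not the foo/bar data above:

```r
library(IRanges)

# Hypothetical toy ranges (not the foo/bar data from the question)
query   <- IRanges(start = c(1, 10, 50), end = c(7, 16, 55))
subject <- IRanges(start = c(2, 7, 12),  end = c(4, 9, 14))

# One integer per query range: the index of the first overlapping subject
# range, or NA when the query range overlaps nothing
ix <- findOverlaps(query, subject, select = "first")
ix
# query 1 overlaps subjects 1 and 2 (the first is 1), query 2 overlaps
# subject 3, and query 3 has no overlap, so ix holds 1, 3, NA

# Drop the NAs before subsetting, as in the lapply() call above
kept <- subject[ix[!is.na(ix)]]
kept
```

Because the result is a plain integer vector rather than a list, there is nothing left to coerce; the trade-off is that "first" picks by position, not by score.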