How can I separate the information in one column of my table into 3 separate columns?

Question

For example, one column of my table looks like this:

HGVS.Consequence
    Lys10Arg
    Lys10Lys
    LeullLeu
    Phe12Ser
    Phe12Cys
    lle13Leu
    lle13Val
    lle13Phe
    Thr15Pro

I would like a table like this:

Mutation  Ref  Change Position
lle13Val  lle   Val      13
lle13Phe  lle   Phe      13
Thr15Pro  Thr   Pro      15

Answer 1

Code

Here is a base R approach using substr.

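sepfun <- function(x){
  # Assumes a fixed layout: 3 letters, 2 characters for the position, then the rest
  s1 <- substr(x, 1, 3)   # reference residue
  s2 <- substr(x, 4, 5)   # position
  s3 <- substring(x, 6)   # changed residue
  y <- do.call(cbind.data.frame, list(s1, s3, s2))
  names(y) <- c("Ref", "Change", "Position")
  cbind(Mutation = x, y)
}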

sepfun(df1$HGVS.Consequence)
#>   Mutation Ref Change Position
#> 1 Lys10Arg Lys    Arg       10
#> 2 Lys10Lys Lys    Lys       10
#> 3 LeullLeu Leu    Leu       ll
#> 4 Phe12Ser Phe    Ser       12
#> 5 Phe12Cys Phe    Cys       12
#> 6 lle13Leu lle    Leu       13
#> 7 lle13Val lle    Val       13
#> 8 lle13Phe lle    Phe       13
#> 9 Thr15Pro Thr    Pro       15

Created on 2022-02-13 by the reprex package (v2.0.1)
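
The substr approach above relies on every value having exactly 3 letters, 2 digits, and 3 letters. If positions can have one or three digits, a regex-based variant may be more robust. The following is only a sketch under that assumption: the helper name sepfun_regex and the values "Gly5Ala" and "Thr115Pro" are made up for illustration and do not come from the question.

```r
# Hypothetical helper (not part of the answer above): split on the run of
# digits instead of on fixed character positions.
sepfun_regex <- function(x) {
  pattern <- "^([A-Za-z]+)([0-9]+)([A-Za-z]+)$"
  ok <- grepl(pattern, x)
  # sub() returns its input unchanged when nothing matches,
  # so rows that do not fit the pattern are set to NA explicitly.
  pick <- function(i) ifelse(ok, sub(pattern, paste0("\\", i), x), NA_character_)
  data.frame(
    Mutation = x,
    Ref      = pick(1),
    Change   = pick(3),
    Position = as.integer(pick(2)),
    stringsAsFactors = FALSE
  )
}

# "Gly5Ala" and "Thr115Pro" are invented values with 1 and 3 digit positions.
x <- c("Lys10Arg", "Phe12Ser", "Gly5Ala", "Thr115Pro", "LeullLeu")
res <- sepfun_regex(x)
print(res)
```

With this variant a value such as LeullLeu, which contains no digits, ends up with NA in all three new columns instead of a Position of "ll".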

Data

HGVS.Consequence<-scan(text = '
Lys10Arg
Lys10Lys
LeullLeu
Phe12Ser
Phe12Cys
lle13Leu
lle13Val
lle13Phe
Thr15Pro
', sep = "\n", what = character())
df1 <- data.frame(HGVS.Consequence)

Answer 2

tidyr::extract(df, HGVS.Consequence, 
     c('Ref', 'Position', 'Change'), '(\\D+)(\\d+)(\\D+)', remove = FALSE)
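
For readers without tidyr, roughly the same capture-group idea can be sketched in base R with utils::strcapture (available since R 3.4.0). This is an illustration and not part of the answer; the three-element vector x is a subset of the question's data chosen for brevity.

```r
# Base R sketch of the same three capture groups.
x <- c("Lys10Arg", "LeullLeu", "Thr15Pro")

parts <- utils::strcapture(
  pattern = "(\\D+)(\\d+)(\\D+)",   # non-digits, digits, non-digits
  x       = x,
  # proto sets the names and types of the output columns
  proto   = data.frame(Ref = character(), Position = integer(),
                       Change = character(), stringsAsFactors = FALSE),
  perl    = TRUE
)

# Keep the original string next to the extracted pieces.
out <- data.frame(HGVS.Consequence = x, parts, stringsAsFactors = FALSE)
print(out)
```

Rows where the pattern does not match, such as LeullLeu (it has no digits), come back as NA in every captured column. tidyr::extract documents the same behavior for groups that do not match.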

Answer 3

Using tidyr::separate, with dplyr to tidy up the column order and names:

tidyr::separate(data   = df, 
                col    = HGVS.Consequence, 
                into   = c("Ref", "Position", "Change"), 
                sep    = c(3, 5),   # split after characters 3 and 5: one fewer position than columns
                remove = FALSE) |>
  dplyr::select(1, 2, 4, 3) |>
  dplyr::rename(mutation = HGVS.Consequence)

#>   mutation Ref Change Position
#> 1 Lys10Arg Lys    Arg       10
#> 2 Lys10Lys Lys    Lys       10
#> 3 LeullLeu Leu    Leu       ll
#> 4 Phe12Ser Phe    Ser       12
#> 5 Phe12Cys Phe    Cys       12
#> 6 lle13Leu lle    Leu       13
#> 7 lle13Val lle    Val       13
#> 8 lle13Phe lle    Phe       13
#> 9 Thr15Pro Thr    Pro       15
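
Because splitting by character position trusts the layout of every value, it may be worth checking that layout first. A minimal base R sketch, assuming the expected shape is 3 letters, 2 digits, 3 letters (an assumption read off the sample data, not something stated in the question):

```r
# Flag values that do not follow the fixed 3-letter / 2-digit / 3-letter layout.
x <- c("Lys10Arg", "LeullLeu", "Phe12Ser", "lle13Val")

fixed_width <- grepl("^[A-Za-z]{3}[0-9]{2}[A-Za-z]{3}$", x)
bad <- x[!fixed_width]
print(bad)
```

Here only LeullLeu is flagged, which is the row that ends up with a Position of "ll" in the output above.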
