R: translate strings of numbers into strings of letters following a relationships table

I have a vector mynumbers containing several strings of digits, for example:

mynumbers <- c("122212", "134134", "134134", "142123", "212141", "213243", "213422", "214231", "221233")

My goal is to translate these strings into strings of letters according to the following mapping:

1=A
2=C
3=G
4=T

I would like to wrap this in a function so that:

myletters <- translate_function(mynumbers)

myletters would then be:

myletters <- c("ACCCAC", "AGTAGT", "AGTAGT", "ATCACG", "CACATA", "CAGCTG", "CAGTCC", "CATCGA", "CCACGG")

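I was thinking of a function like the one below, which is obviously not correct... I start to get confused when dealing with strsplit and lists...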

translate_function <- function(numbers){
  map_df <- data.frame(num=1:4, nuc=c('A','C','G','T'))
  #strsplit numbers
  split_numbers <- strsplit(numbers, '')
  letters <- paste(sapply(split_numbers, function(x) map_df$nuc[which(map_df$num==x)]), collapse='')
  
  return(letters)
}

What would be the simplest and most elegant way to accomplish this? Thanks!

This is easy with chartr:

chartr("1234" , "ACGT", mynumbers)
[1] "ACCCAC" "AGTAGT" "AGTAGT" "ATCACG" "CACATA" "CAGCTG" "CAGTCC" "CATCGA"
[9] "CCACGG"
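As a small illustration of why chartr fits here: it replaces each character found in its first argument with the character at the same position in its second argument, and leaves everything else alone. The inputs "12x5" and "1122" and the names res_partial and res_swap are made up for this sketch.

```r
# Characters listed in `old` are replaced position by position;
# any other character ("x" and "5" here) is passed through unchanged.
res_partial <- chartr("1234", "ACGT", "12x5")
res_partial
# [1] "ACx5"

# The mapping is applied in a single pass, so swapping two characters is safe.
res_swap <- chartr("12", "21", "1122")
res_swap
# [1] "2211"
```

Because unmapped characters pass through silently, you may want to check the input first if anything other than the digits 1 to 4 would be an error in your data.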

Use it in a function like this:

translate_function <- function(numbers){
  map_df <- data.frame(num=1:4, nuc=c('A','C','G','T'))
  letters <- chartr(paste(map_df$num, collapse=''), paste(map_df$nuc, collapse=''), numbers)
  return(letters)
}
translate_function(mynumbers)

Output:

[1] "ACCCAC" "AGTAGT" "AGTAGT" "ATCACG" "CACATA" "CAGCTG" "CAGTCC" "CATCGA"
[9] "CCACGG"

But it is better without the data frame:

translate_function <- function(numbers){
  letters <- chartr("1234", "ACGT", numbers)
  return(letters)
}
translate_function(mynumbers)

Output:

[1] "ACCCAC" "AGTAGT" "AGTAGT" "ATCACG" "CACATA" "CAGCTG" "CAGTCC" "CATCGA"
[9] "CCACGG"
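If you would rather stay close to the strsplit attempt from the question, a minimal sketch could look like the following. The key points are that the lookup has to happen per digit and the paste has to happen per string, not once over the whole list. The names translate_strsplit and lookup are hypothetical, chosen only for this example.

```r
# Hypothetical helper, not from the original posts: the strsplit idea
# from the question, with the lookup and paste done per input string.
translate_strsplit <- function(numbers) {
  # Named vector: names are the digits, values are the letters
  lookup <- c("1" = "A", "2" = "C", "3" = "G", "4" = "T")
  # strsplit returns a list with one character vector per input string
  split_numbers <- strsplit(numbers, "")
  # Index the lookup by name for each digit, then paste each string back together
  vapply(split_numbers,
         function(x) paste(lookup[x], collapse = ""),
         character(1))
}

translate_strsplit(c("122212", "134134"))
# [1] "ACCCAC" "AGTAGT"
```

This is more code than chartr for the same result, so it is mainly useful for seeing how the list returned by strsplit can be handled.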

You can use stringr::str_replace_all, creating a named vector from map_df to supply the replacements.

map_df <- data.frame(num=1:4, nuc=c('A','C','G','T'))
stringr::str_replace_all(mynumbers, setNames(map_df$nuc, map_df$num))

#[1] "ACCCAC" "AGTAGT" "AGTAGT" "ATCACG" "CACATA" "CAGCTG" "CAGTCC" "CATCGA" "CCACGG"
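To see what is being passed as the replacement, it may help to build the named vector on its own in base R. In this sketch, stringsAsFactors = FALSE is an addition (an assumption, so that nuc is a character column on R versions before 4.0 as well), and repl is a name made up for illustration.

```r
# Same mapping table as above; stringsAsFactors = FALSE is added here
# so that `nuc` is character rather than factor on older R versions.
map_df <- data.frame(num = 1:4, nuc = c("A", "C", "G", "T"),
                     stringsAsFactors = FALSE)

# setNames attaches the digits as names (coerced to character),
# giving a "pattern = replacement" style named vector.
repl <- setNames(map_df$nuc, map_df$num)
repl
#   1   2   3   4 
# "A" "C" "G" "T" 
```

Each name is treated as a pattern and each value as its replacement.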

Using gsubfn:

library(gsubfn)
gsubfn("(\\d)", setNames(as.list(c("A", "C", "G", "T")), 1:4), mynumbers)
[1] "ACCCAC" "AGTAGT" "AGTAGT" "ATCACG" "CACATA" "CAGCTG" "CAGTCC" "CATCGA" "CCACGG"