R: translate strings of numbers into strings of letters following a relationship table
I have a vector mynumbers containing several strings of digits, for example:
mynumbers <- c("122212", "134134", "134134", "142123", "212141", "213243", "213422", "214231", "221233")
My goal is to convert these strings into strings of letters following this mapping:
1=A
2=C
3=G
4=T
I would like to wrap this in a function, so that:
myletters <- translate_function(mynumbers)
myletters
would be equivalent to:
myletters <- c("ACCCAC", "AGTAGT", "AGTAGT", "ATCACG", "CACATA", "CAGCTG", "CAGTCC", "CATCGA", "CCACGG")
I was thinking of a function like the one below, which is obviously not correct. I start to get confused when dealing with strsplit and lists.
translate_function <- function(numbers){
map_df <- data.frame(num=1:4, nuc=c('A','C','G','T'))
#strsplit numbers
split_numbers <- strsplit(numbers, '')
letters <- paste(sapply(split_numbers, function(x) map_df$nuc[which(map_df$num==x)]), collapse='')
return(letters)
}
What is the simplest and most elegant way to accomplish this? Thanks!
This is easy with chartr:
chartr("1234" , "ACGT", mynumbers)
[1] "ACCCAC" "AGTAGT" "AGTAGT" "ATCACG" "CACATA" "CAGCTG" "CAGTCC" "CATCGA"
[9] "CCACGG"
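As a small illustration of why this works: chartr replaces each character of its first argument with the character at the same position in its second argument, and it is vectorised over the input. The input values below are made up for the example and are not from the question.

```r
# chartr(old, new, x): every character of `old` found in x is replaced by
# the character at the same position in `new`.
codes <- c("4321", "1200")   # hypothetical inputs, for illustration only
result <- chartr("1234", "ACGT", codes)
print(result)
# Characters that do not appear in `old` (here "0") are left unchanged.
```

So any digit outside 1 to 4 passes through untranslated, which may be worth checking for if the input is not guaranteed to be clean.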
Use it in a function like this:
translate_function <- function(numbers){
  map_df <- data.frame(num=1:4, nuc=c('A','C','G','T'))
  letters <- chartr(paste(map_df$num, collapse=''), paste(map_df$nuc, collapse=''), numbers)
  return(letters)
}
translate_function(mynumbers)
Output:
[1] "ACCCAC" "AGTAGT" "AGTAGT" "ATCACG" "CACATA" "CAGCTG" "CAGTCC" "CATCGA"
[9] "CCACGG"
But it is better without the data frame:
translate_function <- function(numbers){
  letters <- chartr("1234", "ACGT", numbers)
  return(letters)
}
translate_function(mynumbers)
Output:
[1] "ACCCAC" "AGTAGT" "AGTAGT" "ATCACG" "CACATA" "CAGCTG" "CAGTCC" "CATCGA"
[9] "CCACGG"
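For comparison, here is a rough sketch of how the strsplit idea from the question could be made to work. In the original attempt, paste(..., collapse='') sits outside sapply, so it collapses everything into a single string, and map_df$num==x compares four values against six digits at once. The name translate_function_split and the named lookup vector are my own choices for illustration, not part of the question.

```r
mynumbers <- c("122212", "134134", "134134", "142123", "212141",
               "213243", "213422", "214231", "221233")

# Hypothetical alternative to the chartr version, keeping the strsplit approach.
translate_function_split <- function(numbers) {
  # Named vector: the names are the digits, the values are the letters
  lookup <- c("1" = "A", "2" = "C", "3" = "G", "4" = "T")
  # strsplit returns a list with one character vector of digits per string
  split_numbers <- strsplit(numbers, "")
  # Translate each digit by name, then collapse each string separately
  vapply(split_numbers,
         function(digits) paste(lookup[digits], collapse = ""),
         character(1))
}

result <- translate_function_split(mynumbers)
print(result)
```

This should give the same result as the chartr version, but chartr remains shorter and avoids the list handling entirely.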
You can use stringr::str_replace_all, with a named vector created from map_df as the replacement.
map_df <- data.frame(num=1:4, nuc=c('A','C','G','T'))
stringr::str_replace_all(mynumbers, setNames(map_df$nuc, map_df$num))
#[1] "ACCCAC" "AGTAGT" "AGTAGT" "ATCACG" "CACATA" "CAGCTG" "CAGTCC" "CATCGA" "CCACGG"
Using gsubfn:
library(gsubfn)
gsubfn("(\\d)", setNames(as.list(c("A", "C", "G", "T")), 1:4), mynumbers)
[1] "ACCCAC" "AGTAGT" "AGTAGT" "ATCACG" "CACATA" "CAGCTG" "CAGTCC" "CATCGA" "CCACGG"