Extract letters from a string at different positions in R
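I have two columns and want to extract letters from them at varying positions. The goal is to show which letter in Col2 replaced which letter in Col1. The letters are taken from Col1 and Col2 according to the Position column, where the letter "E" marks each position to extract.
Here is my attempt using the substr function: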
df <- data.frame ("Col1" = c("Stores","University","Street","Street Store"),
"Col2" = c("Ostues", "Unasersity", "Straeq","Straeq Stuwq"),
"Position" = c("EMMEMM","MMEEMMMMMM", "MMMEME","MMMEMEMMMEEE"),
"Desired Output" = c("S|O , r|u","i|a , v|s","e|a , t|q", "e|a , t|q , o|u , r|w , e|q"))
n <- which(strsplit(df$Position,"")[[1]]=="E")
#output for the first row:
# [1] 1 4
#then I used substr function:
substr(df$Col1, n, n)
#only the first character returned as below:
[1] "S"
#desired output for first row:
S|O , r|u
First I would write a helper function that extracts a single character at each given position
subchr <- function(x, pos) {
substring(x, pos, pos)
}
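A small sketch of why the helper uses substring() rather than substr(), which is also why the attempt in the question returned a single character. The variable names here are only for illustration:

```r
x <- "Stores"
pos <- c(1, 4)

# substr() returns one result per element of x, so only the first position is used
one <- substr(x, pos, pos)

# substring() recycles x across all start/stop values, giving one result per position
all_chars <- substring(x, pos, pos)

print(one)        # [1] "S"
print(all_chars)  # [1] "S" "r"
```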
Then you can find all the positions to extract
extract_at <- lapply(strsplit(as.character(df$Position), ""),
function(x) which(x=="E"))
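To see what this step produces, here is the same call on a standalone copy of the Position values from the question (the `position` vector is just a stand-in for `df$Position`):

```r
position <- c("EMMEMM", "MMEEMMMMMM", "MMMEME")

# Split each string into single characters, then record where "E" occurs
extract_at <- lapply(strsplit(position, ""), function(x) which(x == "E"))

# One integer vector of positions per row
str(extract_at)
```

The result is a list, because each row may have a different number of "E" positions.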
And put them together to get your desired output
mapply(function(e, a, b){
paste(subchr(a, e), subchr(b,e), sep="|", collapse=" , ")
}, extract_at, as.character(df$Col1), as.character(df$Col2))
# [1] "S|O , r|u" "i|a , v|s" "e|a , t|q"
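For convenience, the steps can be wrapped into one function and run on the four-row data from the question, including the row with five "E" positions. This is only a sketch: `compare_at` and `df4` are hypothetical names, and `stringsAsFactors = FALSE` is set so the columns are character vectors on older R versions too.

```r
df4 <- data.frame(
  Col1 = c("Stores", "University", "Street", "Street Store"),
  Col2 = c("Ostues", "Unasersity", "Straeq", "Straeq Stuwq"),
  Position = c("EMMEMM", "MMEEMMMMMM", "MMMEME", "MMMEMEMMMEEE"),
  stringsAsFactors = FALSE
)

# compare_at() is a hypothetical helper name, not part of the original answer
compare_at <- function(from, to, position) {
  # Positions of "E" for every row
  idx <- lapply(strsplit(position, "", fixed = TRUE), function(p) which(p == "E"))
  # Pair up the characters at those positions, row by row
  mapply(function(e, a, b) {
    paste(substring(a, e, e), substring(b, e, e), sep = "|", collapse = " , ")
  }, idx, from, to, USE.NAMES = FALSE)
}

result <- compare_at(df4$Col1, df4$Col2, df4$Position)
print(result)
```

Because the positions are kept as a list, the number of extracted pairs can differ from row to row.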
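Maybe something like this:
library(dplyr)
library(stringr)
# turn Position into a regex: "M" becomes ".", "E" becomes a capture group "(.)"
df %>% mutate(x=str_replace_all(chartr("M",".",Position),"E","\\(\\.\\)"),
output=paste0(str_replace(Col1,x,"\\1"),"|",str_replace(Col2,x,"\\1"),
" , ",str_replace(Col1,x,"\\2"),"|",str_replace(Col2,x,"\\2")))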
# Col1 Col2 Position Desired.Output x output
#1 Stores Ostues EMMEMM S|O , r|u (.)..(.).. S|O , r|u
#2 University Unasersity MMEEMMMMMM i|a , v|s ..(.)(.)...... i|a , v|s
#3 Street Straeq MMMEME e|a , t|q ...(.).(.) e|a , t|q
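Note that the regex version above refers to exactly two capture groups, so it fits rows with two "E" positions. For rows with more, such as the "Street Store" row in the question, one possible base R variation is to locate the positions with gregexpr(). This is a sketch with standalone vectors (`col1`, `col2`, `position` are illustrative names), not a rewrite of the answer above:

```r
col1 <- c("Street", "Street Store")
col2 <- c("Straeq", "Straeq Stuwq")
position <- c("MMMEME", "MMMEMEMMMEEE")

# gregexpr() returns the start of every "E"; -1 means no match, so keep positive values only
idx <- lapply(gregexpr("E", position, fixed = TRUE), function(m) m[m > 0])

# Build "old|new" pairs for each position and join them per row
pairs <- mapply(function(e, a, b) {
  paste0(substring(a, e, e), "|", substring(b, e, e), collapse = " , ")
}, idx, col1, col2, USE.NAMES = FALSE)

print(pairs)
```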
Data:
df <- data.frame ("Col1" = c("Stores","University","Street"),
"Col2" = c("Ostues", "Unasersity", "Straeq"),
"Position" = c("EMMEMM","MMEEMMMMMM", "MMMEME"),
"Desired Output" = c("S|O , r|u","i|a , v|s","e|a , t|q"))