R: reverse strings in a data frame
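I have a large data frame, and I want to reverse a string whenever it differs from the ref column only in character order. For example, I want to change GA to AG and leave everything else unchanged.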
structure(list(number = c("rs1", "rs2", "rs3", "rs4", "rs5",
"rs6"), ref = c("AG", "AG", "AG", "AG", "AC", "AC"), s1 = c("GA",
"AG", "GA", "AG", "CA", "AA"), s2 = c("AA", "GG", "GA", "AA",
"AA", "AC"), s3 = c("GG", "AG", "GG", "AA", "CC", "AC"), s4 = c("GA",
"GG", "GA", "AA", "AA", "CC"), s5 = c("AA", "GG", "GA", "GG",
"AA", "CC"), s6 = c("AA", "AG", "GG", "AG", "AA", "CC")), .Names =
c("number",
"ref", "s1", "s2", "s3", "s4", "s5", "s6"), class = "data.frame",
row.names = c(NA,
-6L))
Input:
number ref s1 s2 s3 s4 s5 s6 ...
rs1 AG GA AA GG GA AA AA ...
rs2 AG AG GG AG GG GG AG ...
rs3 AG GA GA GG GA GA GG ...
rs4 AG AG AA AA AA GG AG ...
rs5 AC CA AA CC AA AA AA ...
rs6 AC AA AC AC CC CC CC ...
Desired output:
number ref s1 s2 s3 s4 s5 s6 ...
rs1 AG AG AA GG AG AA AA ...
rs2 AG AG GG AG GG GG AG ...
rs3 AG AG AG GG AG AG GG ...
rs4 AG AG AA AA AA GG AG ...
rs5 AC AC AA CC AA AA AA ...
rs6 AC AA AC AC CC CC CC ...
I tried the stri_reverse function from library(stringi):
df.1 <- c(df[1:2],sapply(df[3:length(df)], function(x) stri_reverse[[x]]))
Error in stri_reverse[[x]] : object of type 'closure' is not subsettable
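The error comes from subsetting the function stri_reverse with [[ instead of calling it with ( ) (possibly a typo?). You also need to adjust the logic a little to get the result you want, so that a reversed string is kept only where it matches ref: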
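library(stringi)
# Process every column except the first two (number and ref).
df[-c(1, 2)] <- lapply(df[-c(1, 2)], function(col) {
  # Reverse each string in the column.
  rev_col <- stri_reverse(col)
  # Keep the reversed string only where it equals ref; otherwise keep the original.
  ifelse(rev_col == df$ref, rev_col, col)
})
df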
# number ref s1 s2 s3 s4 s5 s6
#1 rs1 AG AG AA GG AG AA AA
#2 rs2 AG AG GG AG GG GG AG
#3 rs3 AG AG AG GG AG AG GG
#4 rs4 AG AG AA AA AA GG AG
#5 rs5 AC AC AA CC AA AA AA
#6 rs6 AC AA AC AC CC CC CC
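If installing stringi is not an option, the same logic can be sketched in base R. This is only an illustration under a few assumptions: reverse_strings is a hypothetical helper name (not part of any package), the data frame is a shortened copy of the question's sample with just s1 and s2, and the columns are selected by name rather than by position.

```r
# Shortened copy of the question's sample data (only s1 and s2, for brevity).
df <- data.frame(
  number = c("rs1", "rs2", "rs3", "rs4", "rs5", "rs6"),
  ref    = c("AG", "AG", "AG", "AG", "AC", "AC"),
  s1     = c("GA", "AG", "GA", "AG", "CA", "AA"),
  s2     = c("AA", "GG", "GA", "AA", "AA", "AC"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: reverse every string in a character vector.
# strsplit(x, NULL) splits each string into single characters.
reverse_strings <- function(x) {
  vapply(strsplit(x, NULL),
         function(chars) paste(rev(chars), collapse = ""),
         character(1))
}

# Every column except the identifier and the reference column.
sample_cols <- setdiff(names(df), c("number", "ref"))

df[sample_cols] <- lapply(df[sample_cols], function(col) {
  rev_col <- reverse_strings(col)
  # Swap in the reversed string only where it matches ref.
  ifelse(rev_col == df$ref, rev_col, col)
})
df
```

Selecting columns with setdiff(names(df), ...) avoids depending on number and ref being the first two columns, which may matter if the real data frame is laid out differently.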