Remove from the n-th last separator to the end of a string

I have the following strings:

data_string = c("Aa_Bbbbb_0_ID1",
                "Aa_Bbbbb_0_ID2",
                "Aa_Bbbbb_0_ID3",
                "Ccccc_D_EEE_0_ID1")

I want to trim each string to get these results:

"Aa_Bbbbb"
"Aa_Bbbbb"
"Aa_Bbbbb"
"Ccccc_D_EEE"

So basically, I'm looking for a function that takes data_string, a separator, and the position to cut at:

remove_tail(data_string, sep = '_', del = 2)

That is, it should only remove the tail from the 2nd last separator to the end of the string (not split the whole string).

Use gsub. This works for the sample data because every tail starts with the literal _0_:

gsub("_0_.*","",data_string)

Try the following:

# split on "_" then paste back removing last 2
sapply(strsplit(data_string, "_", fixed = TRUE),
       function(i) paste(head(i, -2), collapse = "_"))

We can wrap this in our own function:

# custom function
remove_tail <- function(x, sep = "_", del = 2){
  sapply(strsplit(x, split = sep, fixed = TRUE),
         function(i) paste(head(i, -del), collapse = sep))
}

remove_tail(data_string, sep = '_', del = 2)
# [1] "Aa_Bbbbb"    "Aa_Bbbbb"    "Aa_Bbbbb"    "Ccccc_D_EEE"
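
As a side note, sapply() returns an empty list rather than an empty character vector when given zero-length input. A minimal sketch of the same idea using vapply(), which always returns a character vector, is below. The name remove_tail_v is hypothetical and used only for illustration; like the function above, it assumes sep is a literal (non-regex) separator.

```r
data_string <- c("Aa_Bbbbb_0_ID1",
                 "Aa_Bbbbb_0_ID2",
                 "Aa_Bbbbb_0_ID3",
                 "Ccccc_D_EEE_0_ID1")

# Hypothetical variant of remove_tail(): same split/drop/paste logic,
# but vapply() guarantees a character vector as the result type.
remove_tail_v <- function(x, sep = "_", del = 2) {
  vapply(strsplit(x, split = sep, fixed = TRUE),
         function(parts) paste(head(parts, -del), collapse = sep),
         character(1))
}

remove_tail_v(data_string)
# [1] "Aa_Bbbbb"    "Aa_Bbbbb"    "Aa_Bbbbb"    "Ccccc_D_EEE"

remove_tail_v(data_string, del = 1)
# [1] "Aa_Bbbbb_0"    "Aa_Bbbbb_0"    "Aa_Bbbbb_0"    "Ccccc_D_EEE_0"

remove_tail_v(character(0))
# character(0)
```

Note that a string with del or fewer pieces comes back as "" with this approach, since head() drops every piece.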

We can also use sub to match _ followed by one or more digits (\d+) and all remaining characters, and replace the match with an empty string (""):

# the backslash must be doubled inside an R string literal
sub("_\\d+.*", "", data_string)
#[1] "Aa_Bbbbb"    "Aa_Bbbbb"    "Aa_Bbbbb"    "Ccccc_D_EEE"
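
The pattern above depends on the tail starting with digits. If the goal is to cut at the n-th last separator regardless of content, and without splitting the string, one possible sketch is to build the regex from sep and del. The name remove_tail_regex is hypothetical, and the sketch assumes sep is a single character with no special meaning in a regex or inside a bracket expression (so "_" is fine, but ".", "^" or "]" would need escaping).

```r
data_string <- c("Aa_Bbbbb_0_ID1",
                 "Aa_Bbbbb_0_ID2",
                 "Aa_Bbbbb_0_ID3",
                 "Ccccc_D_EEE_0_ID1")

# Hypothetical helper: remove the last `del` groups of
# "separator followed by non-separator characters", anchored at the end.
# Assumes `sep` is a single, regex-safe character such as "_".
remove_tail_regex <- function(x, sep = "_", del = 2) {
  # for sep = "_" and del = 2 this builds "(_[^_]*){2}$"
  pattern <- paste0("(", sep, "[^", sep, "]*){", del, "}$")
  sub(pattern, "", x)
}

remove_tail_regex(data_string, sep = "_", del = 2)
# [1] "Aa_Bbbbb"    "Aa_Bbbbb"    "Aa_Bbbbb"    "Ccccc_D_EEE"

# Strings with fewer than `del` separators do not match and are left unchanged
remove_tail_regex("abc_1", sep = "_", del = 2)
# [1] "abc_1"
```

This differs from the strsplit approach on short inputs: there, a string with too few pieces becomes "", while here it is returned as is.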