Extracting contents from a factor or data frame

Here is an example of X, a factor (it is a column of a data frame):

[1] "[[1]]"              "J48"                "------------------" ""                   "MSTV"              
 [6] "|"                  "|"                  "|"                  "|"                  "|"                 
[11] "|"                  "|"                  "|"                  "MSTV"               "|"                 
[16] "|"                  "|"                  "|"                  "|"                  "|"                 
[21] "|"                  "|"                  "|"                  "|"                  "|"                 
[26] "|"                  "|"                  "|"                  "|"                  "|"                 
[31] "|"                  "|"                  "|"                  "|"                  ""                  
[36] "Number"             ""                   "Size"               ""                   "like"              
[41] ""                   "The"  

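I want to extract the word MSTV, which appears twice, and ignore all the other words and the | signs. MSTV has | symbols before and after it. I tried the command gsub("[A-Z][1-9]:", "", X) without success. How can I extract the word MSTV, given that it may appear anywhere among the | symbols?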

I think this is what you mean:

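library(stringr)
x <- c("|","MSTV","|","s","",":")
# Collapse the elements into one string, then match letters that sit between two "|".
# The backslash must be doubled inside an R string; stringr's default regex engine
# supports lookarounds, so the deprecated perl() wrapper is not needed.
str_extract(paste0(x, collapse=""), "(?<=\\|)[A-Za-z]+(?=\\|)")
#[1] "MSTV"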
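
Since the question says the word appears more than once, here is a minimal base R sketch that returns every match instead of only the first. The vector X below is a made-up stand-in for the real factor, and the sketch assumes each wanted word has a | directly on both sides once the elements are pasted together; in the printed data the first MSTV follows an empty string rather than a |, so the pattern might need loosening there.

```r
# Hypothetical stand-in for the factor column X from the question
X <- factor(c("J48", "", "|", "MSTV", "|", "|", "MSTV", "|", "", "Number"))

# Convert the factor to character explicitly, then collapse into one string
s <- paste0(as.character(X), collapse = "")

# perl = TRUE enables lookbehind/lookahead in base R regular expressions;
# gregexpr() finds all matches and regmatches() pulls them out
hits <- regmatches(s, gregexpr("(?<=\\|)[A-Za-z]+(?=\\|)", s, perl = TRUE))[[1]]
hits
#[1] "MSTV" "MSTV"
```

With stringr loaded, str_extract_all() with the same pattern should give the equivalent result as a list.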