R: simplify gsub() calls to make sample names from longer strings

I have a vector of sample names:

name <- c("GOM_13M_TB-01_S.HM (Q)30",
"GOM_13M_PS-06_S.HM (Q)30",
"GOM_13O_PS-06_3C_HM (Q)30",
"GOM_14O_GI-02_B3 (Q)30",
"GOM_14O_PS-03_A3 (Q)30",
"GOM_12J_GI-01_MS (Q)30")

which I need to simplify to:

13M_TB-01_MS  (MS for consistency)
13M_PS-06_MS
13O_PS-06_3C  (I am not too concerned about the last 2 digits order)
14O_GI-02_B3
14O_PS-03_A3
12J_GI-01_MS

I tried the following sequence of gsub() calls, but I would like a simpler solution.

x <- gsub("GOM_", "", name) 
x <- gsub("\\(Q\\)30", "", x)  # backslashes must be doubled inside an R string
x <- gsub("_S", "_MS", x)
x <- gsub(".HM", "", x)

Any suggestions?

Perhaps you could try the following:

gsub("GOM_(.*) .*", "\\1", gsub("S.HM", "MS", name))
# [1] "13M_TB-01_MS"    "13M_PS-06_MS"    "13O_PS-06_3C_HM" "14O_GI-02_B3"   
# [5] "14O_PS-03_A3"    "12J_GI-01_MS" 

Or, perhaps:

## I think this matches what you're expecting...
substr(gsub("S.HM", "MS", name), 5, 16)
# [1] "13M_TB-01_MS" "13M_PS-06_MS" "13O_PS-06_3C" "14O_GI-02_B3"
# [5] "14O_PS-03_A3" "12J_GI-01_MS"
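
If you would rather not rely on fixed character positions, one more possibility is to keep only the first three underscore-separated fields after "GOM_". This is only a sketch: it assumes every name follows the GOM_<field>_<field>_<field> layout seen in your sample, and clean_names is a hypothetical helper name, not an existing function.

```r
name <- c("GOM_13M_TB-01_S.HM (Q)30",
          "GOM_13M_PS-06_S.HM (Q)30",
          "GOM_13O_PS-06_3C_HM (Q)30",
          "GOM_14O_GI-02_B3 (Q)30",
          "GOM_14O_PS-03_A3 (Q)30",
          "GOM_12J_GI-01_MS (Q)30")

# Hypothetical helper: assumes each name is "GOM_" followed by at least
# three fields separated by underscores.
clean_names <- function(x) {
  # Normalise the literal "S.HM" suffix to "MS" (the dot is escaped here)
  x <- sub("S\\.HM", "MS", x)
  # Keep the first three fields after "GOM_"; drop everything after them
  sub("^GOM_([^_]+_[^_]+_[^_ ]+).*$", "\\1", x)
}

clean_names(name)
# [1] "13M_TB-01_MS" "13M_PS-06_MS" "13O_PS-06_3C" "14O_GI-02_B3"
# [5] "14O_PS-03_A3" "12J_GI-01_MS"
```

Unlike substr(), this should still work if a field happens to be longer or shorter than in the sample, as long as the fields themselves contain no underscores or spaces.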