Use str_split, but based on the position of a special character

Hi, can anyone help me solve this problem?

library(stringr)

C <- "NURUL AMANI [ID 26378] [IC 971035186514] SYED SAHARR [ID 61839] [IC 981627015412]"
str_split(C, "\\]")

The result looks like this:

[1]"NURUL AMANI [ID 26378" " [IC 971035186514" [3]" SYED SAHARR [ID 61839" " [IC 981627015412"

I want a result like this:

[1]"NURUL AMANI [ID 26378] [IC 971035186514]" [2]" SYED SAHARR [ID 61839] [IC 981627015412]"

stringr::str_split(C, "(?<=\\d\\])(?= \\w)")

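Use lookarounds (https://www.regular-expressions.info/lookaround.html) to find the boundary: a lookbehind for a digit followed by "]", then a lookahead for a space followed by a word character. Neither consumes any characters, so the split is zero-width and the space stays at the start of the second element.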
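To see where that boundary falls, here is a minimal base-R sketch. It assumes the same input string as the question. The variable names `m`, `boundary`, and `parts` are only illustrative, and cutting by hand with `substr` is just a way to inspect the match, not a replacement for the split itself.

```r
# Same input as in the question.
C <- "NURUL AMANI [ID 26378] [IC 971035186514] SYED SAHARR [ID 61839] [IC 981627015412]"

# Lookbehind (?<=\\d\\]) : the position must be preceded by a digit and "]".
# Lookahead  (?= \\w)    : the position must be followed by a space and a word character.
# Neither part consumes characters, so the match is zero-length.
m <- regexpr("(?<=\\d\\])(?= \\w)", C, perl = TRUE)

boundary <- as.integer(m)          # 1-based index of the character right after the boundary
width <- attr(m, "match.length")   # length of the match itself

# Cut the string at the boundary by hand to see the two records.
parts <- c(substr(C, 1, boundary - 1), substr(C, boundary, nchar(C)))
print(parts)
```

Note that the "]" closing the ID field is not a boundary, because it is followed by " [" and "[" is not a word character.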

Edit: fixed following @dc37's comment.

It looks like extracting the records with a regular expression may suit your problem better:

str_extract_all(C, "\\w+ \\w+ \\[ID [0-9]+\\] \\[IC [0-9]+\\]")
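For comparison, the same extraction can be sketched in base R with `gregexpr` and `regmatches`. This assumes, as the pattern above does, that every name consists of exactly two words; a record with a one-word or three-word name would not be captured in full. The variable names `pattern` and `records` are only illustrative.

```r
# Same input as in the question.
C <- "NURUL AMANI [ID 26378] [IC 971035186514] SYED SAHARR [ID 61839] [IC 981627015412]"

# Two words, then an [ID digits] block, then an [IC digits] block.
pattern <- "\\w+ \\w+ \\[ID [0-9]+\\] \\[IC [0-9]+\\]"

# gregexpr() locates every match; regmatches() pulls out the matched text.
records <- regmatches(C, gregexpr(pattern, C, perl = TRUE))[[1]]
print(records)
```

Because the pattern starts at a word character, the extracted records carry no leading space.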

Using base R, you can do:

strsplit(C, "(?<=\\])(?= \\w)", perl = TRUE)

This keeps the space before SYED:

> strsplit(C, "(?<=\\])(?= \\w)", perl = TRUE)
[[1]]
[1] "NURUL AMANI [ID 26378] [IC 971035186514]"  " SYED SAHARR [ID 61839] [IC 981627015412]"

If you don't want to keep that space, move it out of the lookahead so that the split consumes it:

> strsplit(C, "(?<=\\]) (?=\\w)", perl = TRUE)
[[1]]
[1] "NURUL AMANI [ID 26378] [IC 971035186514]" "SYED SAHARR [ID 61839] [IC 981627015412]"
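Another option, sketched here as an assumption rather than as part of the answer above, is to keep the zero-width split and strip the surrounding whitespace afterwards with base R's `trimws`. The variable name `parts` is only illustrative.

```r
# Same input as in the question.
C <- "NURUL AMANI [ID 26378] [IC 971035186514] SYED SAHARR [ID 61839] [IC 981627015412]"

# Zero-width split: the space stays attached to the second element.
parts <- strsplit(C, "(?<=\\])(?= \\w)", perl = TRUE)[[1]]

# Remove leading and trailing whitespace from every element.
parts <- trimws(parts)
print(parts)
```

This may be handy when the separator is not always a single space, since `trimws` removes any run of leading or trailing whitespace.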