Copy everything after a string into a different column in R

Question

I have some columns in a data frame that look like this:

df <- data.frame(act=c("DEC S/N, de 21/06/2006",
                        "DEC S/N, de 05/06/2006",
                        "DEC S/N, de 21/06/2006; MP 542, de 12/08/2011; LEI 12.678, de 25/06/2012"), adj=NA)

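I would like to copy everything after the first ; in the 'act' column (here: MP 542, de 12/08/2011; LEI 12.678, de 25/06/2012) into the 'adj' column. Ideally, the space left at the start of the cut-off string should be removed as well. All other cells, i.e. rows where the string in the 'act' column contains no ;, should keep NA in the 'adj' column.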

Answer 1

Using str_match from stringr:

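library(dplyr)
library(stringr)

df <- data.frame(act=c("DEC S/N, de 21/06/2006",
                       "DEC S/N, de 05/06/2006",
                       "DEC S/N, de 21/06/2006; MP 542, de 12/08/2011; LEI 12.678, de 25/06/2012"), adj=NA)
# Column 2 of the match matrix is the capture group; \\s* drops the space after the first ';'
df %>% mutate(adj = str_match(act, "[^;]*;\\s*(.*)")[, 2])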
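
The [, 2] index is there because str_match returns a character matrix: column 1 holds the full match and column 2 the first capture group, with NA in rows that do not match. A minimal sketch of that, assuming a shortened made-up vector x (not the original data) and a pattern with \\s* added so the space after the ; is not captured:

```r
library(stringr)

# Hypothetical shortened inputs, for illustration only
x <- c("DEC S/N, de 05/06/2006",
       "DEC S/N, de 21/06/2006; MP 542, de 12/08/2011")

# One row per input string, one column per group (column 1 = full match)
m <- str_match(x, "[^;]*;\\s*(.*)")

m[, 1]  # full match; NA where the string contains no ';'
m[, 2]  # first capture group: the text after the first ';'
```

Because non-matching rows come back as NA, no separate check for a missing ; is needed.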

Answer 2

Using stringr::str_extract with a lookbehind:

df$adj <- stringr::str_extract(df$act, '(?<=;\\s)(.*)')
df$adj
#[1] NA   NA    "MP 542, de 12/08/2011; LEI 12.678, de 25/06/2012"
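
If stringr is not available, roughly the same result can be had in base R. This is only a sketch under the assumption that a plain character vector act stands in for df$act; it uses grepl to detect the ; and sub to strip everything up to and including the first ; plus any whitespace after it:

```r
# Stand-in for df$act (assumption: same strings as in the question)
act <- c("DEC S/N, de 21/06/2006",
         "DEC S/N, de 05/06/2006",
         "DEC S/N, de 21/06/2006; MP 542, de 12/08/2011; LEI 12.678, de 25/06/2012")

# Where a ';' exists, remove the text up to the first ';' and the whitespace
# that follows it; otherwise keep NA
adj <- ifelse(grepl(";", act, fixed = TRUE),
              sub("^[^;]*;[[:space:]]*", "", act),
              NA_character_)
adj
```

Unlike the lookbehind version, this also works when the ; is not followed by a space.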

Answer 3

library(tidyr)  # provides extract(), separate() and re-exports %>%
df %>%
  extract(act, 'adj', '; (.*)', remove = FALSE)

Or even try:

df %>%
  separate(act, c('act1', 'adj'), '; ', 
           extra = 'merge', fill = 'right', remove = FALSE)

r

stringr