Split String In R Based On Character Location

I am trying to split these strings (column entries) in R into three separate columns:

João Moutinho Monaco, 30,  M(C) 
Clinton N'Jie Marseille, 23,  FW
Frederic Sammaritano Dijon, 30,  AM(LR)

into

Player                Team           Pos
João Moutinho         Monaco         30,  M(C) 
Clinton N'Jie         Marseille      23,  FW
Frederic Sammaritano  Dijon          30,  AM(LR)

I can find the positions of the characters using gregexpr and nchar, but I'm not sure how to use strsplit from there. Or would another package make this easier?

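After creating a delimiter with gsub, we can use read.csv to read the vector into a data.frame:

# Capture the two-word player name, the team (the word before the first comma),
# and the remainder, then join the three captures with ";"
read.csv(text=gsub("^(\\S+\\s+\\S+)\\s+(\\S+),\\s+(.*)", 
       "\\1;\\2;\\3", v1), sep=";", header=FALSE, 
       col.names = c("Player", "Team", "Pos"), stringsAsFactors=FALSE)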
#                Player      Team         Pos
#1        João Moutinho    Monaco   30,  M(C)
#2        Clinton N'Jie Marseille     23,  FW
#3 Frederic Sammaritano     Dijon 30,  AM(LR)
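
To see what read.csv actually receives, it may help to inspect the intermediate strings. The sketch below repeats the v1 vector from the Data section and stores the gsub result in a variable called delimited (a name chosen here purely for illustration). Note that the pattern assumes every player name is exactly two words; a string that does not fit it is returned unchanged.

```r
v1 <- c("João Moutinho Monaco, 30,  M(C)", "Clinton N'Jie Marseille, 23,  FW",
        "Frederic Sammaritano Dijon, 30,  AM(LR)")

# Group 1: the first two words (player)
# Group 2: the word directly before the first comma (team)
# Group 3: everything after that comma and the whitespace following it
delimited <- gsub("^(\\S+\\s+\\S+)\\s+(\\S+),\\s+(.*)", "\\1;\\2;\\3", v1)

print(delimited[2])
# [1] "Clinton N'Jie;Marseille;23,  FW"
```

Each element now contains exactly two ";" separators, which is why sep=";" yields three columns.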

Update

If there are more patterns and the "Team" name is always a single word (i.e. the word just before the first ','):

read.csv(text= sub("(\\s+[A-Za-z]+),(\\s+\\d+),(.*)", ";\\1;\\2\\3", v2), 
      header=FALSE, sep=";", col.names = c("Player", "Team", "Pos"), stringsAsFactors=FALSE)
#                Player       Team         Pos
#1        João Moutinho     Monaco    30  M(C)
#2        Clinton N'Jie  Marseille      23  FW
#3 Frederic Sammaritano      Dijon  30  AM(LR)
#4       Angel Di María        PSG   28 M(CLR)
#5    Jean Michael Seri       Nice     25 M(C)
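
Because the capture groups in this pattern include the whitespace in front of the team and the age, the Team and Pos values come back with a leading space. If that is unwanted, one option (a sketch, not part of the original answer) is to pass strip.white = TRUE to read.csv. The variable name players is hypothetical, and v2 is repeated from the Data section so the snippet runs on its own.

```r
v2 <- c("João Moutinho Monaco, 30,  M(C)", "Clinton N'Jie Marseille, 23,  FW",
        "Frederic Sammaritano Dijon, 30,  AM(LR)",
        "Angel Di María PSG, 28, M(CLR)", "Jean Michael Seri Nice, 25, M(C)")

# Same substitution as above; strip.white = TRUE trims leading and trailing
# whitespace from each unquoted character field while it is being read
players <- read.csv(text = sub("(\\s+[A-Za-z]+),(\\s+\\d+),(.*)", ";\\1;\\2\\3", v2),
                    header = FALSE, sep = ";", strip.white = TRUE,
                    col.names = c("Player", "Team", "Pos"),
                    stringsAsFactors = FALSE)

players$Team
```

Whitespace inside a field, such as the double space in the Pos values, is left as it is.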

Data

v1 <- c("João Moutinho Monaco, 30,  M(C)", "Clinton N'Jie Marseille, 23,  FW", 
                    "Frederic Sammaritano Dijon, 30,  AM(LR)")
v2 <- c(v1, "Angel Di María PSG, 28, M(CLR)","Jean Michael Seri Nice, 25, M(C)")

Using the word method from stringr:
library(stringr)
data.frame(Player = word(v1, 1, 2), 
             Team = sub(',','' ,word(v1, 3)), 
              Pos = word(v1, 4, 6), stringsAsFactors = FALSE)

#                Player      Team         Pos
#1        João Moutinho    Monaco   30,  M(C)
#2        Clinton N'Jie Marseille     23,  FW
#3 Frederic Sammaritano     Dijon 30,  AM(LR)
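
The question also asks about splitting by character position. A minimal base R sketch of that idea, assuming the team is always the single word directly before the first comma, could use regexpr to locate the positions and substr to cut the strings. The helper names comma, before, last_space and result are made up for this illustration.

```r
v1 <- c("João Moutinho Monaco, 30,  M(C)", "Clinton N'Jie Marseille, 23,  FW",
        "Frederic Sammaritano Dijon, 30,  AM(LR)")

# Position of the first comma in each string
comma <- as.integer(regexpr(",", v1, fixed = TRUE))

# Everything before that comma, e.g. "Clinton N'Jie Marseille"
before <- substr(v1, 1, comma - 1)

# Position of the last space in that part; the team starts right after it
last_space <- as.integer(regexpr(" [^ ]*$", before))

result <- data.frame(
  Player = substr(before, 1, last_space - 1),
  Team   = substr(before, last_space + 1, nchar(before)),
  Pos    = trimws(substr(v1, comma + 1, nchar(v1))),
  stringsAsFactors = FALSE
)
result
```

Since the cut points are computed per string, this sketch does not depend on the player name having a fixed number of words, so it should also handle the longer names in v2.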