Last name, First name to First name Last name
I have a set of names in "Last name, First name" format:
Name Pos Team Week.x Year.x GID.x h.a.x Oppt.x Week1Points DK.salary.x Week.y Year.y GID.y
1 Abdullah, Ameer RB det 1 2015 2995 a sdg 19.4 4000 2 2015 2995
2 Adams, Davante WR gnb 1 2015 5263 a chi 9.9 4400 2 2015 5263
3 Agholor, Nelson WR phi 1 2015 5378 a atl 1.5 5700 2 2015 5378
4 Aiken, Kamar WR bal 1 2015 5275 a den 0.9 3300 2 2015 5275
5 Ajirotutu, Seyi WR phi 1 2015 3877 a atl 0.0 3000 NA NA NA
6 Allen, Dwayne TE ind 1 2015 4551 a buf 10.7 3400 2 2015 4551
These are just the first 6 rows. I want to flip the names to "First name Last name". This is what I have tried.
> strsplit(DKPoints$Name, split = ",")
This splits the Name variable, but there are leading spaces, so I tried to trim them:
> str_trim(splitnames)
But the results are not right. This is what they look like:
[1] "c(\"Abdullah\", \" Ameer\")" "c(\"Adams\", \" Davante\")"
[3] "c(\"Agholor\", \" Nelson\")" "c(\"Aiken\", \" Kamar\")"
[5] "c(\"Ajirotutu\", \" Seyi\")" "c(\"Allen\", \" Dwayne\")"
Any suggestions? I would like the data frame column to look like
Ameer Abdullah
Davante Adams
Nelson Agholor
Kamar Aiken
Any advice would be greatly appreciated. Thanks.
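sub("(\\w+),\\s(\\w+)", "\\2 \\1", df$Name)
Here (\\w+) captures each name, ,\\s matches the ", " separator (comma and space), and \\2 \\1 returns the two captured groups in reverse order. Backslashes must be doubled inside R string literals.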
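One caveat with the pattern above is that \w does not match hyphens, apostrophes, or periods, so compound surnames may be flipped only partially. The following is a minimal sketch of a more permissive variant, assuming every name contains exactly one comma. The helper name flip_name and the sample vector are hypothetical and are not part of the original data.

```r
# Hypothetical helper: swap "Last, First" into "First Last".
# [^,]+ captures everything before the comma; (.+) captures the rest.
flip_name <- function(x) {
  sub("^([^,]+),\\s*(.+)$", "\\2 \\1", x)
}

# Illustrative input, not taken from the original data
players <- c("Abdullah, Ameer", "Smith-Schuster, JuJu")

# \\w-based pattern: only "Schuster, JuJu" is matched in the second name
sub("(\\w+),\\s(\\w+)", "\\2 \\1", players)
# [1] "Ameer Abdullah"      "Smith-JuJu Schuster"

# Permissive pattern: the whole surname is moved
flip_name(players)
# [1] "Ameer Abdullah"      "JuJu Smith-Schuster"
```

Strings without a comma do not match and are returned unchanged, which may or may not be what you want for your data.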
One way, using str_split_fixed:
library(stringr)
#split Name into two columns
splits <- str_split_fixed(df$Name, ", ", 2)
#now merge these two columns the other way round
df$Name <- paste(splits[,2], splits[,1], sep = ' ')
Output:
Name Pos Team Week.x Year.x GID.x h.a.x Oppt.x Week1Points DK.salary.x Week.y Year.y GID.y
1 Ameer Abdullah RB det 1 2015 2995 a sdg 19.4 4000 2 2015 2995
2 Davante Adams WR gnb 1 2015 5263 a chi 9.9 4400 2 2015 5263
3 Nelson Agholor WR phi 1 2015 5378 a atl 1.5 5700 2 2015 5378
4 Kamar Aiken WR bal 1 2015 5275 a den 0.9 3300 2 2015 5275
5 Seyi Ajirotutu WR phi 1 2015 3877 a atl 0.0 3000 NA NA NA
6 Dwayne Allen TE ind 1 2015 4551 a buf 10.7 3400 2 2015 4551
Assuming all names are in "Lastname, Firstname" format, you could do this:
names <- c("A, B","C, D","E, F")
newnames <- sapply(strsplit(names, split=", "),function(x)
{paste(rev(x),collapse=" ")})
> newnames
[1] "B A" "D C" "F E"
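It splits each name on ", " and then pastes the pieces back together in reverse order.
Edit: This may be fine for small datasets, but the other solutions provided are much faster. Microbenchmark results for 100,000 names: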
Unit: milliseconds
expr min lq mean median uq max neval cld
heroka 1103.0419 1242.6418 1276.7765 1274.6746 1311.1218 1557.8579 50 c
lyzander 149.4466 177.0036 206.4558 191.1249 218.1756 345.7960 50 b
johannes 142.7585 144.5943 151.0078 146.0602 147.1980 284.2589 50 a
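The code behind the benchmark was not posted, so the following is only a rough sketch of how a similar comparison could be set up in base R. It assumes synthetic names, uses system.time rather than the microbenchmark package, uses vapply in place of sapply, and covers only the two base-R approaches. Timings will differ by machine, so none are shown.

```r
# Synthetic input (an assumption; the original benchmark data is unknown)
n <- 1e5
x <- paste0("Last", seq_len(n), ", First", seq_len(n))

# Regex approach: one vectorised call to sub()
t_regex <- system.time(
  res_regex <- sub("(\\w+),\\s(\\w+)", "\\2 \\1", x)
)

# Split, reverse, and paste approach: one R function call per name
t_split <- system.time(
  res_split <- vapply(
    strsplit(x, ", ", fixed = TRUE),
    function(p) paste(rev(p), collapse = " "),
    character(1)
  )
)

# Both approaches should produce the same result before timings are compared
stopifnot(identical(res_regex, res_split))

# Elapsed seconds for each approach
print(c(regex = t_regex[["elapsed"]], split = t_split[["elapsed"]]))
```

A single system.time run is noisy. Repeating each expression many times, as microbenchmark does, gives a more reliable comparison.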
Try this:
df$Name2 <- paste(gsub("^.+,\\s*", "", df$Name), gsub(",.+$", "", df$Name), sep = " ")
where df is your data frame.