dplyr separate returning error
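I'm trying to split the Results column of df2 into two separate columns ("Winner" and "Loser") and strip the numbers from each new column using the code below, but I get the following error message. What needs to change to fix it?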
df2 <- data.frame(Year = c(2015:1903 ), Results = c("Winner", "Loser"))
df2 %>% separate(type, c("Winner", "Loser"), ",")
Error in if (!after) c(values, x) else if (after >= lengx) c(x, values) else c(x[1L:after], :
argument is of length zero
df2
Year Results MVP
1 2015 Royals 4, Mets 1 Salvador Perez
2 2014 Giants 4, Royals 3 Madison Bumgarner
3 2013 Red Sox 4, Cardinals 2 David Ortiz
4 2012 Giants 4, Tigers 0 Pablo Sandoval
5 2011 Cardinals 4, Rangers 3 David Freese
.
.
125 1906 Chicago White Sox 4, Chicago Cubs 2 --
126 1905 NY Giants 4, Philadelphia A's 1 --
128 1903 Boston Red Sox 5, Pittsburgh 3 --
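Here is one solution with separate(). Your code passes type to separate(), but the df2 you printed has no column with that name, so you may want to double-check that. Below, I created a sample data frame called df2 and did the following: first I removed the whitespace and digits from Results, then I used separate() to split the column.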
library(dplyr)
library(tidyr)
# Backslashes must be doubled inside R string literals
mutate(df2, Results = gsub(pattern = "\\s|\\d+", replacement = "", x = Results)) %>%
  separate(col = "Results", into = c("Winner", "Loser"), sep = ",")
# Year Winner Loser
#1 2000 Royals Mets
#2 2001 Giants Royals
#3 2002 RedSox Cardinals
#4 2003 Giants Tigers
#5 2004 Cardinals Rangers
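Note that the gsub() call above removes every whitespace character, so "Red Sox" becomes "RedSox". If the spaces inside team names should survive, one possible variation (a minimal sketch, not part of the original answer) is to split on the comma first and then strip only the trailing score from each new column. The name res is just an illustrative placeholder.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the answer
df2 <- data.frame(
  Year = 2000:2004,
  Results = c("Royals 4, Mets 1", "Giants 4, Royals 3",
              "Red Sox 4, Cardinals 2", "Giants 4, Tigers 0",
              "Cardinals 4, Rangers 3"),
  stringsAsFactors = FALSE
)

res <- df2 %>%
  # Split on the comma plus any whitespace that follows it
  separate(col = "Results", into = c("Winner", "Loser"), sep = ",\\s*") %>%
  # Drop only the trailing " <score>" so spaces inside names are kept
  mutate(Winner = sub("\\s+\\d+$", "", Winner),
         Loser  = sub("\\s+\\d+$", "", Loser))

print(res)
```

With this approach the third row should read "Red Sox" rather than "RedSox".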
Data
df2 <- structure(list(Year = 2000:2004, Results = c("Royals 4, Mets 1",
"Giants 4, Royals 3", "Red Sox 4, Cardinals 2", "Giants 4, Tigers 0",
"Cardinals 4, Rangers 3")), .Names = c("Year", "Results"), row.names = c(NA,
-5L), class = "data.frame")
# Year Results
#1 2000 Royals 4, Mets 1
#2 2001 Giants 4, Royals 3
#3 2002 Red Sox 4, Cardinals 2
#4 2003 Giants 4, Tigers 0
#5 2004 Cardinals 4, Rangers 3
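As for the type issue in the question, a quick way to confirm that a column name is wrong is to check it against names(df2) before calling separate(). This is only a small diagnostic sketch; the variable names requested and has_col are made up for illustration, and it assumes the error comes from the missing column.

```r
# Two rows of the sample data are enough for the check
df2 <- data.frame(
  Year = 2000:2001,
  Results = c("Royals 4, Mets 1", "Giants 4, Royals 3"),
  stringsAsFactors = FALSE
)

requested <- "type"                      # the column name used in the question
has_col <- requested %in% names(df2)     # FALSE: df2 has no column called "type"

if (!has_col) {
  message("Column '", requested, "' not found; available columns: ",
          paste(names(df2), collapse = ", "))
}
```

Passing "Results" instead of type is the change the answer above makes.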
You can also do this in a single call with tidyr::extract (using @jazzurro's data):
# Each capture group becomes one new column; backslashes are doubled for R strings
extract(df2,
        "Results", c("Winner", "Loser"),
        "([[:alpha:] ]+)\\s+\\d+,\\s+([[:alpha:] ]+)\\s+\\d+")
# Year Winner Loser
# 1 2000 Royals Mets
# 2 2001 Giants Royals
# 3 2002 Red Sox Cardinals
# 4 2003 Giants Tigers
# 5 2004 Cardinals Rangers
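One caveat: the [[:alpha:] ] class only matches letters and spaces, so a name such as "Philadelphia A's" from the question's full data would not match it. A more permissive pattern is sketched below, under the assumption that every entry has the form "name score, name score". The rows are copied from the printout in the question; the names old and out are placeholders.

```r
library(tidyr)

# Three rows taken from the df2 printout in the question
old <- data.frame(
  Year = c(1906, 1905, 1903),
  Results = c("Chicago White Sox 4, Chicago Cubs 2",
              "NY Giants 4, Philadelphia A's 1",
              "Boston Red Sox 5, Pittsburgh 3"),
  stringsAsFactors = FALSE
)

# Group 1: everything before the first score (no commas allowed)
# Group 2: everything between the comma and the final score
out <- tidyr::extract(old, "Results", c("Winner", "Loser"),
                      "^([^,]+)\\s+\\d+,\\s+(.+)\\s+\\d+$")

print(out)
```

Because the groups accept any character rather than only letters, apostrophes and other punctuation in team names are kept.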