Split a column in two in R
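My df looks like this:
Time
Week End 07-01-10
Week End 07-02-10
I want it to look like this:
Column     Time
Week End   07-01-10
Week End   07-02-10
I googled and found that the stringr package should be useful, but I can't get it to work properly because there are two spaces.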
You can use extract from the tidyr package, which lets you specify a regular expression to split the column:
library(tidyr)
# Backslashes must be doubled inside an R string literal.
extract(df, Time, into = c("Column", "Time"), "(.*)\\s(\\S+)")
#     Column     Time
# 1 Week End 07-01-10
# 2 Week End 07-02-10
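The pattern (.*)\s(\S+) captures two groups. The greedy (.*) takes everything up to the last whitespace character, \s matches that whitespace, and (\S+) takes the run of non-whitespace characters after it. Inside an R string each backslash is written twice, hence "\\s" and "\\S".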
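To see what the two groups capture without installing anything, here is a minimal base-R sketch using regexec and regmatches on the sample values from the question. The names x, m and parts are made up for illustration, and this is only one way to inspect the groups, not what extract does internally.

```r
# Sample values copied from the question.
x <- c("Week End 07-01-10", "Week End 07-02-10")

# regexec() locates the full match plus each capture group;
# regmatches() pulls them out as one character vector per input string.
m <- regmatches(x, regexec("(.*)\\s(\\S+)", x))

# Element 1 of each vector is the full match; elements 2 and 3 are the groups.
parts <- do.call(rbind, lapply(m, `[`, 2:3))
colnames(parts) <- c("Column", "Time")
parts
```

Because (.*) is greedy, the split happens at the last space, so the space inside "Week End" stays in the first group.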
If you want to use the stringr package, the str_match function works in a similar way:
# Column 1 of the result is the full match; columns 2 and 3 are the groups.
stringr::str_match(df$Time, "(.*)\\s(\\S+)")[, 2:3]
#      [,1]       [,2]
# [1,] "Week End" "07-01-10"
# [2,] "Week End" "07-02-10"
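strsplit also works if you specify that the space to split on is the one just before a digit. Here ?= denotes a lookahead, and \d is shorthand for a digit, equivalent to [0-9]: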
# perl = TRUE is required for the lookahead (?=...).
do.call(rbind, strsplit(df$Time, "\\s(?=\\d)", perl = TRUE))
#      [,1]       [,2]
# [1,] "Week End" "07-01-10"
# [2,] "Week End" "07-02-10"
Here is a base-R solution.
df <- data.frame(c("Week End 07-01-10", "Week End 07-02-10"),
                 stringsAsFactors = FALSE)
names(df) <- "Time"
# Assumes every value ends with a date in the same 8-character format,
# preceded by a single space.
n <- nchar(df$Time)
df$Column <- substring(df$Time, 1, n - 9)   # text before the separating space
df$Time   <- substring(df$Time, n - 7, n)   # the last 8 characters (the date)
df <- df[, c(2, 1)]; df                     # put Column first
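The substring approach only works if every value really ends in a fixed-width date. A minimal sketch of checking that before slicing is below; the pattern (a single space, then two digits, a hyphen, two digits, a hyphen, two digits) is an assumption about the format based on the two sample rows, and fixed_width and bad_rows are illustrative names.

```r
# Sample data copied from the question.
df <- data.frame(Time = c("Week End 07-01-10", "Week End 07-02-10"),
                 stringsAsFactors = FALSE)

# Assumed format: one space, then dd-dd-dd at the very end of the string.
fixed_width <- grepl(" [0-9]{2}-[0-9]{2}-[0-9]{2}$", df$Time)

# Rows that do not fit would be cut at the wrong position by substring().
bad_rows <- which(!fixed_width)
bad_rows
```

If bad_rows is not empty, one of the regex-based answers is probably the safer choice.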
We can use read.table from base R. No packages are needed.
# Replace the last run of whitespace with a comma, then parse as CSV.
read.table(text = sub("\\s+(\\S+)$", ",\\1", df1$Time), header = FALSE,
           col.names = c("Column", "Time"), stringsAsFactors = FALSE, sep = ",")
#    Column     Time
#1 Week End 07-01-10
#2 Week End 07-02-10
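It may help to look at the intermediate strings that read.table receives. The sketch below runs the same sub call on the question's sample values; x, csv_lines and out are names made up for illustration.

```r
# Sample values copied from the question.
x <- c("Week End 07-01-10", "Week End 07-02-10")

# Replace the whitespace before the final non-space run with a comma;
# \\1 puts the captured date back after the comma.
csv_lines <- sub("\\s+(\\S+)$", ",\\1", x)
csv_lines

# Parse the comma-separated strings into two character columns.
out <- read.table(text = csv_lines, sep = ",", header = FALSE,
                  col.names = c("Column", "Time"), stringsAsFactors = FALSE)
out
```

This relies on the original text containing no commas of its own; if it might, a separator that cannot occur in the data would be needed instead.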