How to write a loop or use lapply to create new variables in R
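I have a data frame in which I want to create some new variables and update the existing ones. With many variables this becomes tedious, and I don't know how to put it into a loop or use mapply or lapply.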
df <- data.frame(x=c("A","A","A,S"),
y=c("12","12,4","10"),
z=c("String,Text","Avoid","Use"))
> df
    x    y           z
1   A   12 String,Text
2   A 12,4       Avoid
3 A,S   10         Use
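I created the new variables like this:
# *_sub: everything after the first comma ("" when there is no comma)
# original column: everything before the first comma
df$x_sub <- substring(sub("^[^,]*", "", df$x), 2)
df$x <- sub(",.*", "", df$x)
df$y_sub <- substring(sub("^[^,]*", "", df$y), 2)
df$y <- sub(",.*", "", df$y)
df$z_sub <- substring(sub("^[^,]*", "", df$z), 2)
df$z <- sub(",.*", "", df$z)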
The output is correct, but if I had 10 variables, what could I do to save time?
  x  y      z x_sub y_sub z_sub
1 A 12 String              Text
2 A 12  Avoid           4
3 A 10    Use     S
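We can use str_extract from stringr. The first pattern keeps the part before the first comma, and the second keeps the piece that follows a comma (NA when there is none):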
library(stringr)
df1 <- df
df1[] <- lapply(df, function(x) type.convert(str_extract(x, "^[^,]+"), as.is = TRUE))
df1[paste0(names(df1), "_sub")] <- lapply(df, function(x)
type.convert(str_extract(x, "(?<=,)[^,]+"), as.is = TRUE))
df1
#  x  y      z x_sub y_sub z_sub
#1 A 12 String  <NA>    NA  Text
#2 A 12  Avoid  <NA>     4  <NA>
#3 A 10    Use     S    NA  <NA>
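If adding a package is not an option, the sub() calls from the question can also be wrapped in a plain loop over the column names. This is only a minimal base R sketch, assuming every column holds at most one comma-separated extra piece; the name df2 is introduced here for illustration. Unlike the str_extract version, missing pieces come out as empty strings rather than NA, which matches the output shown in the question.

```r
df <- data.frame(x = c("A", "A", "A,S"),
                 y = c("12", "12,4", "10"),
                 z = c("String,Text", "Avoid", "Use"))

df2 <- df  # work on a copy so the original stays untouched
for (col in names(df)) {
  # everything after the first comma ("" when there is no comma)
  df2[[paste0(col, "_sub")]] <- substring(sub("^[^,]*", "", df[[col]]), 2)
  # everything before the first comma
  df2[[col]] <- sub(",.*", "", df[[col]])
}
df2
```

Because the loop reads from df and writes to df2, the order in which the two assignments run does not matter, and the same code works unchanged for 10 or more columns.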
Another option is cSplit from splitstackshape:
library(splitstackshape)
cSplit(df, names(df), ",")
#   x_1 x_2 y_1 y_2    z_1  z_2
#1:   A  NA  12  NA String Text
#2:   A  NA  12   4  Avoid   NA
#3:   A   S  10  NA    Use   NA
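For comparison, a similar wide layout can be approximated in base R with strsplit. The helper below, split_cols, is a hypothetical name made up for this sketch and is not part of splitstackshape; it makes no claim about how cSplit works internally. It pads shorter rows with NA, so it also copes with cells that contain more than one comma, and it leaves every piece as character.

```r
df <- data.frame(x = c("A", "A", "A,S"),
                 y = c("12", "12,4", "10"),
                 z = c("String,Text", "Avoid", "Use"))

# split_cols: hypothetical helper, splits every column of `data` on `sep`
split_cols <- function(data, sep = ",") {
  pieces <- lapply(names(data), function(col) {
    parts <- strsplit(as.character(data[[col]]), sep, fixed = TRUE)
    n <- max(lengths(parts))
    # pad each row to n pieces with NA, then stack the rows into a matrix
    mat <- do.call(rbind, lapply(parts, `length<-`, n))
    out <- as.data.frame(mat, stringsAsFactors = FALSE)
    names(out) <- paste(col, seq_len(n), sep = "_")
    out
  })
  do.call(cbind, pieces)
}

res <- split_cols(df)
res
```

The result is an ordinary data frame with columns x_1, x_2, y_1, y_2, z_1 and z_2.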