Using a function and mapply in R to create new columns that sum other columns
Suppose I have a data frame df and I want to create a new column named "c" by adding two existing columns, "a" and "b". I would simply run the following code:
df$c <- df$a + df$b
But I also want to do this for many other columns. So why doesn't my code below work?
# dplyr provides %>% and select(), which are used further down:
library(dplyr)
# Reproducible data:
martial_arts <- data.frame(gym_branch=c("downtown_a", "downtown_b", "uptown", "island"),
day_boxing=c(5,30,25,10),day_muaythai=c(34,18,20,30),
day_bjj=c(0,0,0,0),day_judo=c(10,0,5,0),
evening_boxing=c(50,45,32,40), evening_muaythai=c(50,50,45,50),
evening_bjj=c(60,60,55,40), evening_judo=c(25,15,30,0))
# Creating a list of the new column names of the columns that need to be added to the martial_arts dataframe:
pattern<-c("_boxing","_muaythai","_bjj","_judo")
d<- expand.grid(paste0("martial_arts$total",pattern))
# Creating lists of the columns that will be added to each other:
e<- names(martial_arts %>% select(day_boxing:day_judo))
f<- names(martial_arts %>% select(evening_boxing:evening_judo))
# Writing a function and using mapply:
kick_him <- function(d,e,f){d <- rowSums(martial_arts[ , c(e, f)], na.rm=T)}
mapply(kick_him,d,e,f)
Now, mapply does produce the correct sums:
> mapply(kick_him,d,e,f)
     Var1 <NA> <NA> <NA>
[1,]   55   84   60   35
[2,]   75   68   60   15
[3,]   57   65   55   35
[4,]   50   80   40    0
But it does not add the new columns to the martial_arts data frame. In theory, the function should do the following:
martial_arts$total_boxing <- martial_arts$day_boxing + martial_arts$evening_boxing
...
...
martial_arts$total_judo <- martial_arts$day_judo + martial_arts$evening_judo
and so add four new total columns to martial_arts.
So what am I doing wrong?
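The assignment is what goes wrong here: "martial_arts$total_boxing" is only a character string, so d <- ... inside the function just rebinds a local variable and never touches the data frame. The name should be "total_boxing" alone, and it should be on the lhs of the Map/mapply call. Since the OP has already put the 'martial_arts$' prefix into the values of the 'd' data frame, we strip that prefix and then do the assignment: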
kick_him <- function(e,f){rowSums(martial_arts[ , c(e, f)], na.rm=TRUE)}
martial_arts[sub(".*\\$", "", d$Var1)] <- Map(kick_him, e, f)
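To see why the original version left the data frame unchanged, here is a minimal sketch on a hypothetical toy data frame (toy, assign_inside and add_cols are names made up for illustration, not part of the question). It contrasts assigning to a function argument with returning the value and assigning on the left-hand side:

```r
# Hypothetical toy data, not taken from the question
toy <- data.frame(a = 1:3, b = 4:6)

# Assigning to an argument only rebinds a local variable inside the function;
# the string "toy$c" is never interpreted as a column of 'toy'.
assign_inside <- function(target, x, y) {
  target <- toy[[x]] + toy[[y]]
}
assign_inside("toy$c", "a", "b")
names(toy)   # 'toy' still has only the columns a and b

# Returning the sum and assigning on the left-hand side does add the column
add_cols <- function(x, y) toy[[x]] + toy[[y]]
toy["c"] <- Map(add_cols, "a", "b")
names(toy)   # now a, b and c
```

The same pattern scales to several columns at once: Map returns one list element per pair of column names, and the character vector on the left-hand side supplies the new column names.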
Now check the dataset:
> martial_arts
  gym_branch day_boxing day_muaythai day_bjj day_judo evening_boxing evening_muaythai evening_bjj evening_judo total_boxing total_muaythai total_bjj total_judo
1 downtown_a          5           34       0       10             50               50          60           25           55             84        60         35
2 downtown_b         30           18       0        0             45               50          60           15           75             68        60         15
3     uptown         25           20       0        5             32               45          55           30           57             65        55         35
4     island         10           30       0        0             40               50          40            0           50             80        40          0
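As a possible alternative to Map, the totals can also be computed in a single vectorized assignment, because two data frames of the same shape add element-wise. This is only a sketch: the helper vectors sports, day_cols, evening_cols and total_cols are names introduced here for illustration, and the data is rebuilt so the block runs on its own.

```r
# Rebuild the question's data so this sketch is self-contained
martial_arts <- data.frame(
  gym_branch = c("downtown_a", "downtown_b", "uptown", "island"),
  day_boxing = c(5, 30, 25, 10), day_muaythai = c(34, 18, 20, 30),
  day_bjj = c(0, 0, 0, 0), day_judo = c(10, 0, 5, 0),
  evening_boxing = c(50, 45, 32, 40), evening_muaythai = c(50, 50, 45, 50),
  evening_bjj = c(60, 60, 55, 40), evening_judo = c(25, 15, 30, 0)
)

# Illustrative helper vectors: build the three sets of column names
sports       <- c("boxing", "muaythai", "bjj", "judo")
day_cols     <- paste0("day_", sports)
evening_cols <- paste0("evening_", sports)
total_cols   <- paste0("total_", sports)

# Element-wise sum of the two column blocks, assigned to the new names
martial_arts[total_cols] <- martial_arts[day_cols] + martial_arts[evening_cols]
martial_arts[total_cols]
```

One difference from the rowSums(..., na.rm=TRUE) version: + propagates NA, so a missing value in either column gives NA in the total instead of being treated as zero. The sample data contains no NA, so both approaches give the same totals here.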