dcast function taking arguments from two value variables
Suppose I have a sample data frame with the following structure:
cars=c("A","A","A","A", "B","B","B","B", "C","C","C","C","A","A","A","A", "B","B","B","B", "C","C","C","C")
vendor=c("d","e","f","g", "d","e","f","g", "d","e","f","g", "d","e","f","g", "d","e","f","g", "d","e","f","g")
state=c(1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2)
PS_mean=c(100, 110, 120, 130, 90, 95, 140, 180, 70, 80, 120, 150, 100, 110, 120, 130, 90, 95, 140, 180, 70, 80, 120, 150)
PS_stdv=c(10, 20, 30, 40, 10, 20, 30, 40, 10, 20, 30, 40, 10, 20, 30, 40, 10, 20, 30, 40, 10, 20, 30, 40)
mycars=data.frame(cars, vendor, state, PS_mean, PS_stdv)
I now want to reshape it with dcast, like this:
mycars_cov<-dcast(setDT(mycars[c('cars','state','PS_mean','PS_stdv')]), cars~state, value.var=c("PS_mean", "PS_stdv"), car_PS_var("PS_mean", "PS_stdv"))
As you can see, the function "car_PS_var" is user-defined and takes two inputs:
car_PS_var <- function(x, y) {
  x <- as.numeric(x)
  y <- as.numeric(y)
  z <- sd(x) * sd(y) / mean(x)
  return(z)
}
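I don't know how to apply a function that takes two "value.var" columns as arguments and returns a single value. Normally with dcast you can only apply a function to one variable at a time, which is why car_PS_var("PS_mean", "PS_stdv") does not work. In this form R throws errors, because the aggregation function in dcast cannot accept two inputs.
So how can I do this properly? Suggestions that use any other R approach to get the job done are also welcome.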
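I'm not sure I understand your goal, but as I read it, a quick and dirty way is to first group by cars and state, create the new column, and then dcast the new data.table: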
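library(data.table)
mycars <- as.data.table(mycars)
# compute the two-column statistic once per (cars, state) group
temp <- mycars[, .(z = car_PS_var(PS_mean, PS_stdv)), by = c("cars", "state")]
# then spread state into columns; naming value.var avoids the "guessing" message
dcast(temp, cars ~ state, value.var = "z")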
   cars        1        2
1:    A 1.449275 1.449275
2:    B 4.325825 4.325825
3:    C 4.545340 4.545340
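For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It assumes the data.table package is installed. The names per_column, long and wide are illustrative only and are not part of the original post. It also shows what dcast does when given two value.var columns: the aggregation function is applied to each column separately, which is why a two-input function cannot be passed directly.

```r
library(data.table)

# Same data as in the question, built with rep() for brevity
cars    <- rep(rep(c("A", "B", "C"), each = 4), times = 2)
vendor  <- rep(c("d", "e", "f", "g"), times = 6)
state   <- rep(c(1, 2), each = 12)
PS_mean <- rep(c(100, 110, 120, 130, 90, 95, 140, 180, 70, 80, 120, 150), times = 2)
PS_stdv <- rep(c(10, 20, 30, 40), times = 6)
mycars  <- data.table(cars, vendor, state, PS_mean, PS_stdv)

# User-defined statistic from the question: needs both columns at once
car_PS_var <- function(x, y) {
  x <- as.numeric(x)
  y <- as.numeric(y)
  sd(x) * sd(y) / mean(x)
}

# With several value.var columns, dcast aggregates each column on its own,
# so the function only ever sees one vector at a time
per_column <- dcast(mycars, cars ~ state,
                    value.var = c("PS_mean", "PS_stdv"),
                    fun.aggregate = sd)

# Two-step approach: compute the statistic per group in long form, then cast
long <- mycars[, .(z = car_PS_var(PS_mean, PS_stdv)), by = .(cars, state)]
wide <- dcast(long, cars ~ state, value.var = "z")
print(wide)
```

Here wide should reproduce the table printed above, while per_column holds one block of state columns per input variable rather than a combined value.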