How to get ksvm to predict non-scaled values after scaled training
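When I run an SVM with ksvm from the kernlab package, all of the outputs of the predict command on my final model are scaled. I understand this is because I set scaled = T, but I also know that scaling the data is preferred in SVM modeling. How can I easily tell ksvm to return non-scaled predictions? If that is not possible, is there a way to convert the scaled predictions back to raw values? Thanks, the code is below: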
library(kernlab)

svm1 <- ksvm(Y ~ 1 + X1 + X2,
             data = data_nn,
             scaled = TRUE,
             type = "eps-svr",
             kernel = "anovadot",
             epsilon = svm1_CV2$bestTune$epsilon,
             C = svm1_CV2$bestTune$C,
             kpar = list(sigma = svm1_CV2$bestTune$sigma,
                         degree = svm1_CV2$bestTune$degree))

# Analyze results
data_nn$svm_pred <- predict(svm1)
From the documentation:
argument scaled:
A logical vector indicating the variables to be scaled. If scaled is of length 1,
the value is recycled as many times as needed and all non-binary variables are scaled.
Per default, data are scaled internally (both x and y variables) to zero mean and
unit variance. The center and scale values are returned and used for later predictions.
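To make "zero mean and unit variance" concrete, here is a minimal base R sketch of that transformation and its inverse. It uses `scale()` from base R rather than anything inside kernlab, and the seed and the data are made up purely for illustration.

```r
# Standardise a numeric vector: subtract the mean, divide by the standard deviation.
set.seed(1)                                  # arbitrary seed, for reproducibility only
y <- runif(100, 100, 1000)                   # illustrative response values

y_scaled <- scale(y)                         # one-column matrix carrying two attributes
y_center <- attr(y_scaled, "scaled:center")  # equals mean(y)
y_scale  <- attr(y_scaled, "scaled:scale")   # equals sd(y)

# Invert the transformation to return to the original units.
y_back <- as.vector(y_scaled) * y_scale + y_center
```

Any value expressed in scaled units, including a prediction, can be mapped back the same way as long as the centre and scale of the training response are kept.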
Solution 1
Consider the following example:
#make random data set
y <- runif(100,100,1000) #the response variable takes values between 100 and 1000
x1 <- runif(100,100,500)
x2 <- runif(100,100,500)
df <- data.frame(y,x1,x2)
Running:
svm1 <- ksvm( y~1+x2+x2,data=df,scaled=T,type='eps-svr',kernel='anovadot')
> predict(svm1)
[,1]
[1,] 0.290848927
[2,] -0.206473246
[3,] -0.076651875
[4,] 0.088779924
[5,] 0.036257375
[6,] 0.206106048
[7,] -0.189082081
[8,] 0.245768175
[9,] 0.206742751
[10,] -0.238471569
[11,] 0.349902743
[12,] -0.199938921
This produces scaled predictions.
But if, following the documentation above, you change the call to the following:
svm1 <- ksvm( y~1+x2+x2,data=df,scaled=c(F,T,T,T),type='eps-svr',kernel='anovadot')
# A logical vector is used here so that the predictions come back as raw values.
# With the vector above, only the intercept, x1 and x2 are scaled.
# Incidentally, scaling the intercept (the 1 in the formula) effectively eliminates it.
> predict(svm1)
[,1]
[1,] 601.2630
[2,] 599.7238
[3,] 599.7287
[4,] 599.9112
[5,] 601.6950
[6,] 599.8382
[7,] 599.8623
[8,] 599.7287
[9,] 601.8496
[10,] 599.0759
[11,] 601.7348
[12,] 601.7249
As you can see, these predictions are on the raw data scale.
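A related way to reach the same goal, sketched here under assumptions, is to standardise the predictors by hand and leave the response untouched, so that nothing depends on how the `scaled` vector is matched to the variables. The data, the seed and the `new_obs` rows are hypothetical. Only base R is used, and the resulting matrices would then be handed to the learner with its own scaling switched off (for ksvm, `scaled = FALSE`).

```r
set.seed(42)                                   # arbitrary seed, illustration only
train <- data.frame(y  = runif(100, 100, 1000),
                    x1 = runif(100, 100, 500),
                    x2 = runif(100, 100, 500))
new_obs <- data.frame(x1 = c(150, 300),        # hypothetical rows to predict later
                      x2 = c(450, 200))

x_cols <- c("x1", "x2")

# Standardise the training predictors and remember the parameters used.
x_train  <- scale(as.matrix(train[, x_cols]))
x_center <- attr(x_train, "scaled:center")
x_scale  <- attr(x_train, "scaled:scale")

# New data must be transformed with the TRAINING centre and scale,
# not with statistics computed from the new rows themselves.
x_new <- scale(as.matrix(new_obs[, x_cols]), center = x_center, scale = x_scale)

# train$y is never transformed, so a model fitted on (x_train, train$y)
# predicts directly in the original units of y.
```

The point of the design is that the response never leaves its original units, so there is nothing to undo after prediction.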
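Solution 2
If you do want the y variable to be scaled in the model, you need to unscale the predictions yourself.
Before fitting the model, compute the mean and standard deviation: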
y2 <- scale(y)
y_mean <- attributes(y2)$'scaled:center' #the mean
y_std <- attributes(y2)$'scaled:scale' #the standard deviation
Then convert the predictions back to raw units:
svm1 <- ksvm( y~1+x2+x2,data=df,scaled=T,type='eps-svr',kernel='anovadot')
> predict(svm1) * y_std + y_mean
[,1]
[1,] 654.3604
[2,] 522.3578
[3,] 556.8159
[4,] 600.7259
[5,] 586.7850
[6,] 631.8674
[7,] 526.9739
[8,] 642.3948
[9,] 632.0364
[10,] 513.8646
[11,] 670.0349
[12,] 524.0922
[13,] 673.7202
You get the raw predictions!
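The same unscaling step can be wrapped in a small helper and checked end to end. The sketch below is an illustration under assumptions: `unscale_predictions()` is a hypothetical helper, not part of kernlab, the data are simulated, and `lm()` stands in for the SVM so that the result can be verified with base R alone (for ordinary least squares, unscaled predictions from a model fitted on scaled y must equal predictions from a model fitted on raw y).

```r
# Hypothetical helper: map predictions made in scaled units back to raw units.
unscale_predictions <- function(pred, center, scale) {
  pred * scale + center
}

set.seed(7)                                   # arbitrary seed, illustration only
n  <- 100
x1 <- runif(n, 100, 500)
x2 <- runif(n, 100, 500)
y  <- 2 * x1 - x2 + rnorm(n, sd = 50)         # made-up relationship for the demo
train <- data.frame(y, x1, x2)

# Store the training mean and standard deviation BEFORE fitting.
y_mean <- mean(train$y)
y_std  <- sd(train$y)
train$y_scaled <- (train$y - y_mean) / y_std

# lm() is only a stand-in for the SVM fitted on the scaled response.
fit_scaled <- lm(y_scaled ~ x1 + x2, data = train)

new_obs <- data.frame(x1 = c(150, 300), x2 = c(450, 200))   # hypothetical new rows
pred_scaled <- predict(fit_scaled, newdata = new_obs)
pred_raw    <- unscale_predictions(pred_scaled, y_mean, y_std)

# Reference: the same linear model fitted directly on the raw response.
fit_raw  <- lm(y ~ x1 + x2, data = train)
pred_ref <- predict(fit_raw, newdata = new_obs)
```

With an SVM there is no such exact reference model, but the arithmetic of the last step is identical: multiply by the training standard deviation and add the training mean.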