如何使用R包Quadprog求解SVM?
How to use R package Quadprog to solve SVM?
我想知道实施 Quadprog 来解决二次规划的正确方法是什么。
我有以下问题(从互联网上截取)并且也在查看以下内容 http://cbio.ensmp.fr/~thocking/mines-course/2011-04-01-svm/svm-qp.pdf
解决这个问题的正确方法是什么?如果我遇到上述问题,本教程是否有助于解决问题?
http://www.r-bloggers.com/solving-quadratic-progams-with-rs-quadprog-package/
这是线性 C-SVM 的一个实现,它基于原始优化问题:
min_{beta_0, beta, zeta} 1/2 w^T w + C sum_{i = 1}^N zeta_i
subject to:
y_i (w^T x_i + b) >= 1 - zeta_i, for all i = 1, 2, ..., N
zeta_i >= 0, for all i = 1, 2, ..., N
其中 N
是数据点的数量。
请注意,使用 quadprog
来解决这个问题在某种程度上更像是一种教学练习,因为 quadprog
依赖于内点算法,而在实践中将使用专门的算法,例如 Platt 的 SMO,它利用了 SVM 优化问题的特定属性。
为了使用 quadprog
,给出上面的等式,归结为设置指定优化问题的矩阵和向量。
然而,一个问题是 quadprog
要求出现在二次函数中的矩阵是正定的(参见,例如,http://www.r-bloggers.com/more-on-quadratic-progamming-in-r/),而这里使用的实现导致它是半正定的,因为截距 beta_0
和 zeta_i
没有出现在二次函数中。为了解决这个问题,我将矩阵中对应于这些值的对角线元素设置为一个非常小的值。
要设置示例代码,使用 spam
数据集,二元分类问题:
library(kernlab) # for the spam data
# Load the input data to be used
data(spam)
# Use only a subset of the data (20%)
spam <- spam[sample(nrow(spam), round(0.2 * nrow(spam)), replace = FALSE), ]
# Retrieve the features and data
X <- spam[, 1:(ncol(spam) - 1)]
Y_f <- spam[, ncol(spam)]
Y <- 2 * (as.numeric(Y_f) - 1.5) # {-1, 1}
# Sample size
N <- nrow(X)
# Number of dimensions
n_d <- ncol(X)
# Value of the regularization parameter
C <- 1
为了设置优化问题,请记住包 quadprog
:
使用的格式
#
# Formulation: min(−d^T * b + 0.5 * b^T * D * b) with the constraints A^T * b >= b_0
#
# solve.QP(Dmat, dvec, Amat, bvec, meq=0, factorized=FALSE)
#
# Arguments
# Dmat: matrix appearing in the quadratic function to be minimized.
# dvec: vector appearing in the quadratic function to be minimized.
# Amat: matrix defining the constraints under which we want to minimize the quadratic function.
# bvec: vector holding the values of b0 (defaults to zero).
# meq: the first meq constraints are treated as equality constraints, all further as inequality
# constraints (defaults to 0).
# factorized logical flag: if TRUE, then we are passing R−1 (where D = RT R) instead of the
# matrix D in the argument Dmat.
#
然后,组织参数向量为:
# b = (beta_0, beta, zeta),
# where: beta_0 in R, beta in Re^n_d, zeta in Re^N
这样:
d <- c(0, rep(0, n_d), rep(-C, N)) # -C * sum(zeta)
# Need a work-around for the matrix D, which must be positive definite (being
# positive semi-definite is not enough...)
# See http://www.r-bloggers.com/more-on-quadratic-progamming-in-r/
eps <- 1e-10 # this will ultimately be the lowest eigenvalue of matrix D (with multiplicity N + 1)
D <- diag(c(eps, rep(1, n_d), rep(eps, N))) # beta^T * beta
#
# Matrix specifying the constraints
# For zeta_i > 0:
# beta_0 | beta | zeta
# A_1 = [ 0, 0, 0, ..., 0, 1, 0, 0, ..., 0]
# [ 0, 0, 0, ..., 0, 0, 1, 0, ..., 0]
# [ 0, 0, 0, ..., 0, 0, 0, 1, ..., 0]
# ...
# [ 0, 0, 0, ..., 0, 0, 0, 0, ..., 1]
# where matrix A_1 has N rows, and N + n_d + 1 columns
#
# For beta_0 * y_i + beta^T * x_i * y_i + zeta_i >= 1:
# beta_0 | beta | zeta
# A_2 = [ y_1, y_1 * x_{1, 1}, y_1 * x_{2, 2}, ..., y_1 * x{i, n_d}, 1, 0, 0, ..., 0]
# [ y_2, y_2 * x_{2, 1}, y_2 * x_{2, 2}, ..., y_2 * x{i, n_d}, 0, 1, 0, ..., 0]
# ...
# [ y_N, y_N * x_{N, 1}, y_2 * x_{N, 2}, ..., y_N * x{N, n_d}, 0, 0, 0, ..., 1]
#
I_N <- diag(N) # N x N identity matrix
A_1 <- cbind(matrix(0, ncol = n_d + 1, nrow = N), I_N) # zeta_i > 0, for all i; N rows
A_2 <- as.matrix(cbind(as.matrix(Y), X * as.matrix(Y)[, rep(1, n_d)], I_N)) # zeta_i + beta_0 * y_i + beta^T * x_i * y_i >= 1, for all i; N rows
rownames(A_1) <- NULL; rownames(A_2) <- NULL
colnames(A_1) <- NULL; colnames(A_2) <- NULL
A <- t(rbind(A_1, A_2))
b_0 <- c(rep(0, N), rep(1, N))
最后,解决优化问题并检索参数值:
library(quadprog)
results <- solve.QP(D, d, A, b_0)
# Retrieve the results
b_optim <- results$solution
beta_0 <- b_optim[1]
beta <- b_optim[1 + (1:n_d)]
zeta <- b_optim[(n_d + 1) + (1:N)]
之后,给定一个矩阵 X_test
,该模型可用于通过以下方式进行预测:
Y_pred <- sign(apply(X_test, 1, function(x) beta_0 + sum(beta * as.vector(x))))
我想知道实施 Quadprog 来解决二次规划的正确方法是什么。
我有以下问题(从互联网上截取)并且也在查看以下内容 http://cbio.ensmp.fr/~thocking/mines-course/2011-04-01-svm/svm-qp.pdf
解决这个问题的正确方法是什么?如果我遇到上述问题,本教程是否有助于解决问题? http://www.r-bloggers.com/solving-quadratic-progams-with-rs-quadprog-package/
这是线性 C-SVM 的一个实现,它基于原始优化问题:
min_{beta_0, beta, zeta} 1/2 w^T w + C sum_{i = 1}^N zeta_i
subject to:
y_i (w^T x_i + b) >= 1 - zeta_i, for all i = 1, 2, ..., N
zeta_i >= 0, for all i = 1, 2, ..., N
其中 N
是数据点的数量。
请注意,使用 quadprog
来解决这个问题在某种程度上更像是一种教学练习,因为 quadprog
依赖于内点算法,而在实践中将使用专门的算法,例如 Platt 的 SMO,它利用了 SVM 优化问题的特定属性。
为了使用 quadprog
,给出上面的等式,归结为设置指定优化问题的矩阵和向量。
然而,一个问题是 quadprog
要求出现在二次函数中的矩阵是正定的(参见,例如,http://www.r-bloggers.com/more-on-quadratic-progamming-in-r/),而这里使用的实现导致它是半正定的,因为截距 beta_0
和 zeta_i
没有出现在二次函数中。为了解决这个问题,我将矩阵中对应于这些值的对角线元素设置为一个非常小的值。
要设置示例代码,使用 spam
数据集,二元分类问题:
library(kernlab) # for the spam data
# Load the input data to be used
data(spam)
# Use only a subset of the data (20%)
spam <- spam[sample(nrow(spam), round(0.2 * nrow(spam)), replace = FALSE), ]
# Retrieve the features and data
X <- spam[, 1:(ncol(spam) - 1)]
Y_f <- spam[, ncol(spam)]
Y <- 2 * (as.numeric(Y_f) - 1.5) # {-1, 1}
# Sample size
N <- nrow(X)
# Number of dimensions
n_d <- ncol(X)
# Value of the regularization parameter
C <- 1
为了设置优化问题,请记住包 quadprog
:
#
# Formulation: min(−d^T * b + 0.5 * b^T * D * b) with the constraints A^T * b >= b_0
#
# solve.QP(Dmat, dvec, Amat, bvec, meq=0, factorized=FALSE)
#
# Arguments
# Dmat: matrix appearing in the quadratic function to be minimized.
# dvec: vector appearing in the quadratic function to be minimized.
# Amat: matrix defining the constraints under which we want to minimize the quadratic function.
# bvec: vector holding the values of b0 (defaults to zero).
# meq: the first meq constraints are treated as equality constraints, all further as inequality
# constraints (defaults to 0).
# factorized logical flag: if TRUE, then we are passing R−1 (where D = RT R) instead of the
# matrix D in the argument Dmat.
#
然后,组织参数向量为:
# b = (beta_0, beta, zeta),
# where: beta_0 in R, beta in Re^n_d, zeta in Re^N
这样:
d <- c(0, rep(0, n_d), rep(-C, N)) # -C * sum(zeta)
# Need a work-around for the matrix D, which must be positive definite (being
# positive semi-definite is not enough...)
# See http://www.r-bloggers.com/more-on-quadratic-progamming-in-r/
eps <- 1e-10 # this will ultimately be the lowest eigenvalue of matrix D (with multiplicity N + 1)
D <- diag(c(eps, rep(1, n_d), rep(eps, N))) # beta^T * beta
#
# Matrix specifying the constraints
# For zeta_i > 0:
# beta_0 | beta | zeta
# A_1 = [ 0, 0, 0, ..., 0, 1, 0, 0, ..., 0]
# [ 0, 0, 0, ..., 0, 0, 1, 0, ..., 0]
# [ 0, 0, 0, ..., 0, 0, 0, 1, ..., 0]
# ...
# [ 0, 0, 0, ..., 0, 0, 0, 0, ..., 1]
# where matrix A_1 has N rows, and N + n_d + 1 columns
#
# For beta_0 * y_i + beta^T * x_i * y_i + zeta_i >= 1:
# beta_0 | beta | zeta
# A_2 = [ y_1, y_1 * x_{1, 1}, y_1 * x_{2, 2}, ..., y_1 * x{i, n_d}, 1, 0, 0, ..., 0]
# [ y_2, y_2 * x_{2, 1}, y_2 * x_{2, 2}, ..., y_2 * x{i, n_d}, 0, 1, 0, ..., 0]
# ...
# [ y_N, y_N * x_{N, 1}, y_2 * x_{N, 2}, ..., y_N * x{N, n_d}, 0, 0, 0, ..., 1]
#
I_N <- diag(N) # N x N identity matrix
A_1 <- cbind(matrix(0, ncol = n_d + 1, nrow = N), I_N) # zeta_i > 0, for all i; N rows
A_2 <- as.matrix(cbind(as.matrix(Y), X * as.matrix(Y)[, rep(1, n_d)], I_N)) # zeta_i + beta_0 * y_i + beta^T * x_i * y_i >= 1, for all i; N rows
rownames(A_1) <- NULL; rownames(A_2) <- NULL
colnames(A_1) <- NULL; colnames(A_2) <- NULL
A <- t(rbind(A_1, A_2))
b_0 <- c(rep(0, N), rep(1, N))
最后,解决优化问题并检索参数值:
library(quadprog)
results <- solve.QP(D, d, A, b_0)
# Retrieve the results
b_optim <- results$solution
beta_0 <- b_optim[1]
beta <- b_optim[1 + (1:n_d)]
zeta <- b_optim[(n_d + 1) + (1:N)]
之后,给定一个矩阵 X_test
,该模型可用于通过以下方式进行预测:
Y_pred <- sign(apply(X_test, 1, function(x) beta_0 + sum(beta * as.vector(x))))