How should I train an SVM using Julia?

Does anyone have experience training a support vector machine (SVM) in Julia (1.4.1)?

I tried the LIBSVM interface, but the example on the GitHub page throws an error:

# Load Fisher's classic iris data
iris = dataset("datasets", "iris")
# LIBSVM handles multi-class data automatically using a one-against-one strategy
labels = convert(Vector, iris[:Species])
# First dimension of input data is features; second is instances
instances = convert(Array, iris[:, 1:4])'
# Train SVM on half of the data using default parameters. See documentation
# of svmtrain for options
model = svmtrain(instances[:, 1:2:end], labels[1:2:end]);

ERROR: MethodError: no method matching LIBSVM.SupportVectors(::Int32, ::Array{Int32,1}, ::CategoricalArray{String,1,UInt8,String,CategoricalValue{String,UInt8},Union{}}, ::Array{Float64,2}, ::Array{Int32,1}, ::Array{LIBSVM.SVMNode,1})
Closest candidates are:
LIBSVM.SupportVectors(::Int32, ::Array{Int32,1}, ::Array{T,1}, ::AbstractArray{U,2}, ::Array{Int32,1}, ::Array{LIBSVM.SVMNode,1}) where {T, U} at /home/benny/.julia/packages/LIBSVM/5Z99T/src/LIBSVM.jl:18
LIBSVM.SupportVectors(::LIBSVM.SVMModel, ::Any, ::Any) at /home/benny/.julia/packages/LIBSVM/5Z99T/src/LIBSVM.jl:27 

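It looks like the LIBSVM.jl documentation is rather outdated and the package has not been kept up to date, so it is worth reporting this (or at least asking for the README to be updated).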

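The error you see is not caused by the package itself. In current versions of DataFrames.jl and RDatasets.jl, the labels column is no longer a Vector (as it was when LIBSVM.jl was developed) but a CategoricalArray. That is why the MethodError reports that LIBSVM.SupportVectors received a CategoricalArray where the closest candidate expects an Array{T,1}. You can avoid the problem by converting the CategoricalArray to an ordinary Vector{String}. A complete example follows: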

using RDatasets, LIBSVM
using StatsBase, Printf # `mean` and `@printf` are no longer in Base and must be loaded explicitly

# Load Fisher's classic iris data
iris = dataset("datasets", "iris")

# LIBSVM handles multi-class data automatically using a one-against-one strategy
labels = string.(convert(Vector, iris[:Species]))

# First dimension of input data is features; second is instances
instances = convert(Array, iris[:, 1:4])'

# Train SVM on half of the data using default parameters. See documentation
# of svmtrain for options
model = svmtrain(instances[:, 1:2:end], labels[1:2:end]);

# Test model on the other half of the data.
(predicted_labels, decision_values) = svmpredict(model, instances[:, 2:2:end]);

# Compute accuracy
@printf "Accuracy: %.2f%%\n" mean((predicted_labels .== labels[2:2:end]))*100
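
To look at the type mismatch in isolation, here is a minimal sketch. It assumes the CategoricalArrays.jl package is installed, and the `species` vector is made-up toy data standing in for the `Species` column, not the real dataset. It only illustrates the conversion step used above and does not call LIBSVM.

```julia
using CategoricalArrays

# Hypothetical toy labels standing in for the iris Species column
species = categorical(["setosa", "versicolor", "virginica", "setosa"])

# A CategoricalArray is an AbstractVector, but it is not a plain Vector (Array{T,1}),
# which is the kind of argument the SupportVectors method in the error expects
println(species isa AbstractVector)
println(species isa Vector)

# Broadcasting `string` over the labels builds an ordinary Vector{String}
labels = string.(species)
println(typeof(labels))
```

If the last line reports a plain `Vector{String}`, the labels are in the form the full example passes to `svmtrain`.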

Alternatively, you can use MLJ.jl or ScikitLearn.jl, which should take care of wrapping the SVM backend properly themselves.

Oskin's answer targets an older version.

With current versions, it should be changed to:

using RDatasets, LIBSVM
using StatsBase, Printf # `mean` and `@printf` are no longer in Base and must be loaded explicitly

# Load Fisher's classic iris data
iris = dataset("datasets", "iris")

# LIBSVM handles multi-class data automatically using a one-against-one strategy
labels = string.(convert(Vector, iris[:,:Species]))

# First dimension of input data is features; second is instances
instances = Matrix(iris[:, 1:4])'

# Train SVM on half of the data using default parameters. See documentation
# of svmtrain for options
model = svmtrain(instances[:, 1:2:end], labels[1:2:end]);

# Test model on the other half of the data.
(predicted_labels, decision_values) = svmpredict(model, instances[:, 2:2:end]);

# Compute accuracy
@printf "Accuracy: %.2f%%\n" mean((predicted_labels .== labels[2:2:end]))*100