Problem while reading an SVM model in PySpark

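I'm new to PySpark, and I just saved my LinearSVC model in a folder named "svm.model". Inside it I have 2 folders: data and metadata.

Now I'm trying to load the model. This is the code I use to load it: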

# Spark environment
from pyspark.sql import SparkSession
from pyspark.ml.classification import LinearSVC

spark = SparkSession.builder.getOrCreate()
# read model
lsvc = LinearSVC(maxIter=10, regParam=0.1)
samemodel = lsvc.load("svm.model/")

But I get this error while loading the model:

File "C:/Users/Ayoub/PycharmProjects/sparkdemo/validation.py", line 9, in <module>
    samemodel = lsvc.load("svm.model/")
  File "E:\spark-3.0.1-bin-hadoop2.7\python\pyspark\ml\util.py", line 330, in load
    return cls.read().load(path)
  File "E:\spark-3.0.1-bin-hadoop2.7\python\pyspark\ml\util.py", line 280, in load
    java_obj = self._jread.load(path)
  File "E:\spark-3.0.1-bin-hadoop2.7\python\lib\py4j-0.10.9-src.zip\py4j\java_gateway.py", line 1305, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "E:\spark-3.0.1-bin-hadoop2.7\python\pyspark\sql\utils.py", line 128, in deco
    return f(*a, **kw)
  File "E:\spark-3.0.1-bin-hadoop2.7\python\lib\py4j-0.10.9-src.zip\py4j\protocol.py", line 328, in get_return_value
    format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling o24.load.
: java.lang.NoSuchMethodException: org.apache.spark.ml.classification.LinearSVCModel.<init>(java.lang.String)
    at java.lang.Class.getConstructor0(Unknown Source)
    at java.lang.Class.getConstructor(Unknown Source)
    at org.apache.spark.ml.util.DefaultParamsReader.load(ReadWrite.scala:468)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Unknown Source)
20/11/19 13:22:31 WARN ProcfsMetricsGetter: Exception when trying to compute pagesize, as a result reporting of ProcessTree metrics is stopped

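I'm not sure what this means; this is my first time saving and loading a model with PySpark. Is there something wrong with my model folder "svm.model", or with the way I load it?

I was using the wrong class to load the model. The following code works: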

from pyspark.ml.classification import LinearSVCModel

# Load with the model class, using the same path the model was saved to
samemodel = LinearSVCModel.load("svm.model/")

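So to train we use LinearSVC, and to load we use LinearSVCModel. LinearSVC is the estimator that holds the training parameters; LinearSVCModel is the fitted model that fit() returns, and that is what was saved in "svm.model".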