PySpark AttributeError: type object 'ALS' has no attribute 'trainImplicit'

Question

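I am trying to use ALS to train on my dataset in order to find latent factors. My dataset has implicit ratings.

In detail, my dataset consists of three columns: user, item (repository), and rating (the number of stars, used as an implicit rating):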

from pyspark.ml.recommendation import ALS
from pyspark.sql import Row

lines = spark.read.text("Dataset.csv").rdd
parts = lines.map(lambda row: row.value.split(","))

ratingsRDD = parts.map(lambda p: Row(userId=int(p[1]),repoId=int(p[2]),repoCount=float(p[3])))
ratings = spark.createDataFrame(ratingsRDD)

model = ALS.trainImplicit(ratings, rank=5,lambda_=0.01, alpha = 1.0, iterations =5)

I get this error:

AttributeError: type object 'ALS' has no attribute 'trainImplicit'

Answer 1

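You are trying to use the syntax of the old Spark MLlib ALS (pyspark.mllib.recommendation.ALS, which works with RDDs and not with DataFrames) with the new Spark ML ALS (pyspark.ml.recommendation.ALS), which indeed doesn't have a trainImplicit attribute (docs).

You should try something like this: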

als = ALS(rank=5, maxIter=5, alpha = 1.0, implicitPrefs=True, seed=0)
model = als.fit(ratings)

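This works as written provided that your users are in a column named user, your items in a column named item, and your ratings in a column named rating, which are the defaults for userCol, itemCol and ratingCol. With the DataFrame from the question, pass userCol="userId", itemCol="repoId" and ratingCol="repoCount" instead. The lambda_ and iterations arguments of trainImplicit correspond to regParam and maxIter here. Have a look at the docs for more details, parameterization options, and examples.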
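For the DataFrame in the question, the arguments of the failing trainImplicit call could be carried over roughly as sketched below. This is only an illustration under a few assumptions: the helper names MLLIB_TO_ML, ml_als_params and fit_implicit_als are made up for this example, the column names are the ones from the question's Row(...), and the pyspark import is deferred into the function so that the mapping itself can be inspected without a Spark installation.

```python
# Sketch: translate the question's trainImplicit(...) keyword arguments into
# parameters of pyspark.ml.recommendation.ALS. All helper names below are
# hypothetical and exist only for this example.

# Old MLlib keyword -> new ML parameter name.
MLLIB_TO_ML = {
    "rank": "rank",
    "iterations": "maxIter",
    "lambda_": "regParam",
    "alpha": "alpha",
}


def ml_als_params(**mllib_kwargs):
    """Return keyword arguments suitable for pyspark.ml.recommendation.ALS."""
    params = {MLLIB_TO_ML[name]: value for name, value in mllib_kwargs.items()}
    # trainImplicit implied implicit feedback; the ML estimator needs a flag.
    params["implicitPrefs"] = True
    # Column names assumed from the question's DataFrame.
    params.update(userCol="userId", itemCol="repoId", ratingCol="repoCount")
    return params


def fit_implicit_als(ratings, **mllib_kwargs):
    """Fit an implicit-feedback ALS model on a DataFrame (requires pyspark)."""
    from pyspark.ml.recommendation import ALS  # deferred import, see lead-in

    return ALS(**ml_als_params(**mllib_kwargs)).fit(ratings)


# The same values the question passed to ALS.trainImplicit.
params = ml_als_params(rank=5, lambda_=0.01, alpha=1.0, iterations=5)
print(params)
```

With a running Spark session, fit_implicit_als(ratings, rank=5, lambda_=0.01, alpha=1.0, iterations=5) would then stand in for the failing call, and the latent factors should be available on the fitted model as the userFactors and itemFactors DataFrames.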
