Is there in PySpark a parameter equivalent to scikit-learn's sample_weight?

I am currently using the SGDClassifier provided by the scikit-learn library. When I call its fit method, I can set the sample_weight parameter:

Weights applied to individual samples. If not provided, uniform weights are assumed. These weights will be multiplied with class_weight (passed through the constructor) if class_weight is specified.
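For concreteness, a minimal sketch of that usage; the data and weights below are made up purely for illustration:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Toy data: 4 samples, 2 features (values are illustrative only)
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
y = np.array([0, 1, 1, 0])

# Give the second sample twice the influence of the others
weights = np.array([1.0, 2.0, 1.0, 1.0])

clf = SGDClassifier(random_state=0)
clf.fit(X, y, sample_weight=weights)
```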

I would like to switch to PySpark and use the LogisticRegression class. I cannot find any parameter similar to sample_weight, though. There is a weightCol parameter, but I think it does something different.

Do you have any suggestions?

There is a weightCol parameter but I think it does something different.

On the contrary, weightCol in Spark ML does exactly that; from the docs:

weightCol = Param(parent='undefined', name='weightCol', doc='weight column name. If this is not set or empty, we treat all instance weights as 1.0.')
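
To illustrate, here is a minimal sketch of the PySpark equivalent; the DataFrame, column names, and weight values are made up for illustration:

```python
from pyspark.sql import SparkSession
from pyspark.ml.linalg import Vectors
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.getOrCreate()

# Toy data: (features, label, weight) -- values are illustrative only
df = spark.createDataFrame(
    [
        (Vectors.dense([0.0, 1.0]), 0.0, 1.0),
        (Vectors.dense([1.0, 0.0]), 1.0, 2.0),  # this row counts double
        (Vectors.dense([1.0, 1.0]), 1.0, 1.0),
        (Vectors.dense([0.0, 0.0]), 0.0, 1.0),
    ],
    ["features", "label", "weight"],
)

# weightCol plays the role of scikit-learn's sample_weight
lr = LogisticRegression(featuresCol="features", labelCol="label", weightCol="weight")
model = lr.fit(df)
```

Here the row with weight 2.0 contributes to the loss as if it appeared twice, which is exactly what sample_weight does in scikit-learn.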