How to specify the minInstancesPerNode parameter for Random Forest?
I can't seem to specify minInstancesPerNode for Random Forest in pyspark. I don't see it in the Scala code, but it is mentioned in the Spark R library and in the documentation, which says:
minInstancesPerNode: For a node to be split further, each of its
children must receive at least this number of training instances. This
is commonly used with RandomForest since those are often trained
deeper than individual trees.
Is it possible to use this parameter for Random Forest in Spark, and specifically in pyspark?
According to the docs, minInstancesPerNode is an input parameter of pyspark.ml.classification.RandomForestClassifier.
You may be looking at a different Random Forest implementation, perhaps the mllib one.
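For example, here is a minimal sketch of setting the parameter on the ml (DataFrame-based) estimator; the toy DataFrame, column names, and parameter values are illustrative and not from the original question:

    from pyspark.sql import SparkSession
    from pyspark.ml.classification import RandomForestClassifier
    from pyspark.ml.linalg import Vectors

    spark = SparkSession.builder.appName("rf-min-instances-per-node").getOrCreate()

    # Toy training data using the default "label" and "features" column names.
    train = spark.createDataFrame(
        [(0.0, Vectors.dense([0.0, 1.0])),
         (0.0, Vectors.dense([0.0, 0.0])),
         (1.0, Vectors.dense([1.0, 0.0])),
         (1.0, Vectors.dense([1.0, 1.0]))],
        ["label", "features"],
    )

    # minInstancesPerNode is a constructor keyword argument of the ml estimator;
    # it can also be set afterwards with setMinInstancesPerNode().
    rf = RandomForestClassifier(numTrees=10, maxDepth=5, minInstancesPerNode=2)
    print(rf.getMinInstancesPerNode())  # prints 2

    model = rf.fit(train)
    model.transform(train).select("label", "prediction").show()

The older RDD-based pyspark.mllib.tree.RandomForest API does not expose this parameter in its Python signature, which is likely why it appears to be missing when looking at that implementation.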