Y_train values for symbolicRegressor

I split my dataset into X_train, Y_train, X_test and Y_test, and then I used symbolicRegressor...

I have already converted the string values in the DataFrame to floats. But when I apply symbolicRegressor I get this error:

ValueError: could not convert string to float: 'd'

where 'd' is a value from Y.

All the values in my Y_train and Y_test are alphabetic characters, because they are the "labels", so I don't understand why symbolicRegressor is trying to get a float.

Any ideas?

According to https://gplearn.readthedocs.io/en/stable/index.html - "Symbolic regression is a machine learning technique that aims to identify an underlying mathematical expression that best describes a relationship". Note the word "mathematical". I am not an expert on this subject, and the gplearn description does not clearly define the scope or limitations.

However, according to the source code at https://gplearn.readthedocs.io/en/stable/_modules/gplearn/genetic.html, the fit() method of the BaseSymbolic class contains the line X, y = check_X_y(X, y, y_numeric=True), where check_X_y() is sklearn.utils.validation.check_X_y(). The y_numeric parameter is documented as: "Whether to ensure that y has a numeric type. If dtype of y is object, it is converted to float64. Should only be used for regression algorithms".

Therefore the y values must be numeric.
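To illustrate why string labels trigger this exact message, here is a minimal sketch in plain Python. It is not scikit-learn's actual code: the helper coerce_targets_to_float is hypothetical and uses the built-in float() as a stand-in for the float64 conversion that y_numeric=True implies.

```python
def coerce_targets_to_float(y):
    """Hypothetical stand-in for the numeric check: convert every target to float."""
    return [float(value) for value in y]


# Targets that look like numbers convert without trouble.
numeric_targets = coerce_targets_to_float(["1.5", "2", 3])

# Alphabetic labels cannot be converted, so the first one hit raises ValueError.
try:
    coerce_targets_to_float(["d", "a", "b"])
except ValueError as exc:
    error_message = str(exc)

print(numeric_targets)
print(error_message)
```

The message captured in error_message has the same form as the one in the question, which is consistent with the regressor attempting a float conversion on the labels in Y.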

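Sorry for the late reply. gplearn supports regression (numeric y) with the SymbolicRegressor estimator, and in the newly released gplearn 0.4.0 we also support binary classification (two labels in y) with SymbolicClassifier. From the sound of things, though, you have a multi-label problem, which gplearn does not currently support. It is something we may look to support in the future.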
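One possible workaround, which is an assumption on my part and not something the answer above proposes, is to split a problem with several labels into one binary problem per label (one-vs-rest). The sketch below only builds the binary targets in plain Python; the helper one_vs_rest_targets and the sample labels are hypothetical.

```python
def one_vs_rest_targets(labels):
    """Build one binary target list per distinct label (1 = this label, 0 = any other)."""
    classes = sorted(set(labels))
    return {cls: [1 if y == cls else 0 for y in labels] for cls in classes}


# Hypothetical string labels, similar in spirit to the Y_train in the question.
y_train = ["a", "d", "b", "d"]
binary_targets = one_vs_rest_targets(y_train)

for label, target in binary_targets.items():
    print(label, target)
```

Each resulting list contains only two distinct values, which is the case the answer says SymbolicClassifier handles, so each one could in principle serve as y for a separate binary model. Fitting those models is left out here because it depends on gplearn itself.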

genetic

python-3.x

gplearn