Decision Tree - Edges/Branches so light that they are invisible
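I am using the classic Titanic dataset to build a decision tree. However, I am not sure what is wrong with the edges (branches), which are almost invisible.
Here is the code that builds the decision tree: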
# Plant a new pruned tree
ideal_dt = DecisionTreeClassifier(random_state=6, ccp_alpha=optimal_alpha)
ideal_dt = ideal_dt.fit(X_train, y_train)
# Plot the confusion matrix
plot_confusion_matrix(ideal_dt,X_test,y_test,display_labels=['Not Survived','Survived'])
plt.grid(False);
# Plot the tree
plt.figure(figsize=(200,180))
plot_tree(ideal_dt,filled=True,rounded=True, fontsize=120, class_names=labels,feature_names=data_features.columns);
print('\nIdeal Decision Tree')
# Training Score
print('Training Set Accuracy:',ideal_dt.score(X_train,y_train))
# Testing Score
print('Testing Set Accuracy:',ideal_dt.score(X_test,y_test))
The setup is as follows:
# Basic Import
import pandas as pd
import numpy as np
import seaborn as sns
import random
import matplotlib.pyplot as plt
# Hypothesis Testing
from scipy.stats import ttest_ind, ttest_rel, ttest_1samp
# Machine Learning Import
import sklearn as skl
from sklearn import datasets
# Preprocessing
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split, cross_val_score
# Linear Regression
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import Ridge
# KNN Classification
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import scale
from sklearn.metrics import confusion_matrix
from sklearn.metrics import plot_confusion_matrix
from sklearn.metrics import f1_score
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
# K-means clustering
from sklearn.cluster import KMeans
# Logistic Regression
from sklearn.linear_model import LogisticRegression
# Decision Tree
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import DecisionTreeRegressor
from sklearn.tree import plot_tree
from sklearn.model_selection import cross_val_score
# Database Import
import sqlite3
from sqlite3 import Error
# Measure Performance
from sklearn.metrics import make_scorer, accuracy_score, r2_score, mean_squared_error
import sklearn.metrics as skm
from sklearn.metrics import classification_report
from sklearn.tree import DecisionTreeClassifier
# plt.style.use('seaborn-notebook')
## inline figures
%matplotlib inline
plt.style.use('seaborn')
## just to make sure few warnings are not shown
import warnings
warnings.filterwarnings("ignore")
I tried commenting out plt.style.use('seaborn'), but that did not help. Any suggestions would be appreciated.
plot_tree() returns a list of artists (Annotations). You can access each arrow and change its properties in a loop. See https://matplotlib.org/api/_as_gen/matplotlib.patches.FancyArrowPatch.html#matplotlib.patches.FancyArrowPatch for the list of properties you can change.
I don't know why the arrows do not show up in your case, but I would start by adjusting their color and width.
from matplotlib import pyplot as plt
from sklearn.datasets import load_iris
from sklearn import tree

clf = tree.DecisionTreeClassifier(random_state=0)
iris = load_iris()
clf = clf.fit(iris.data, iris.target)

fig, ax = plt.subplots(figsize=(10, 10))
out = tree.plot_tree(clf)
for o in out:
    arrow = o.arrow_patch
    if arrow is not None:
        arrow.set_edgecolor('red')
        arrow.set_linewidth(3)
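I ran into the same problem in Jupyter Notebook. Restarting the kernel and running the code again made the arrows appear for me, so I think it is an editor issue. Alternatively, you can run the code in another environment (e.g. VS Code) and compare the result.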
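I had the same problem, and, as you mentioned yourself, I was not using sns either. But someone pointed to it as a potential cause, so I uninstalled seaborn from that virtualenv, and that solved the problem.
I don't know why that is; perhaps one of the libraries I import is somehow importing seaborn.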
For me, the cause was matplotlib's fivethirtyeight theme. I commented it out and restarted the kernel. The arrows then showed up as expected and were very clear.
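If the theme is still wanted for other plots, one option that avoids removing it is to draw only the tree under matplotlib's default style. The following is a minimal sketch using `plt.style.context`; the iris data and variable names are stand-ins for illustration, and it assumes the theme was activated with `plt.style.use`.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree

plt.style.use("fivethirtyeight")  # global theme kept for the other plots
themed_edgecolor = plt.rcParams["patch.edgecolor"]

iris = load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

# Patches pick up their default colors and widths when they are created,
# so the tree is built inside the temporary default style.
with plt.style.context("default"):
    default_edgecolor = plt.rcParams["patch.edgecolor"]
    fig, ax = plt.subplots(figsize=(10, 10))
    annotations = plot_tree(clf, filled=True, rounded=True, ax=ax)

# Outside the block the global theme is active again.
restored_edgecolor = plt.rcParams["patch.edgecolor"]
print(themed_edgecolor, default_edgecolor, restored_edgecolor)
```

Because the style settings live in the running session, a setting applied earlier persists until it is reset, which may explain why commenting out the theme only helped after a kernel restart.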
I had a similar problem when using seaborn for other plots. I called sns.reset_defaults() and the problem was solved.
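sns.reset_defaults() restores matplotlib's rcParams to their default values. To see which settings a theme actually changed before resetting them, a seaborn-free sketch can compare `plt.rcParams` with `matplotlib.rcParamsDefault`. Here `changed_rc_params` is a hypothetical helper, the fivethirtyeight style is only an example theme, and `plt.rcdefaults()` is used as matplotlib's own reset.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line in a notebook
import matplotlib.pyplot as plt


def changed_rc_params(prefixes=("patch.", "lines.")):
    """Return {name: (current, default)} for rcParams that differ from the defaults."""
    changed = {}
    for name in matplotlib.rcParamsDefault:
        if not name.startswith(prefixes):
            continue
        current = plt.rcParams[name]
        default = matplotlib.rcParamsDefault[name]
        if current != default:
            changed[name] = (current, default)
    return changed


plt.style.use("fivethirtyeight")  # example theme that alters line and patch settings
before = changed_rc_params()
for name, (current, default) in sorted(before.items()):
    print(f"{name}: {current!r} (default {default!r})")

plt.rcdefaults()  # reset the style-related rcParams to matplotlib's defaults
after = changed_rc_params()
print("still changed after reset:", after)
```

The patch.* entries are the ones most likely to matter here, since the arrows are patches whose properties were adjusted in the earlier answer.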