DICT() and MATPLOTLIB?
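I built a dictionary that maps the feature importances of a sklearn decision tree to the names of the corresponding features in my df. Here is the code: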
importances = clf.feature_importances_
feature_names = ['age','BP','chol','maxh',
'oldpeak','slope','vessels',
'sex_0.0','sex_1.0',
'pain_1.0','pain_2.0','pain_3.0','pain_4.0',
'bs_0.0','bs_1.0',
'ecg_0.0','ecg_1.0','ecg_2.0',
'ang_0.0','ang_1.0',
'thal_3.0','thal_6.0','thal_7.0']
CLF_sorted = dict(zip(feature_names, importances))
This is the output I get:
{'BP': 0.053673644739136502,
'age': 0.014904980747733202,
'ang_0.0': 0.0,
'ang_1.0': 0.0,
'bs_0.0': 0.0,
'bs_1.0': 0.0,
'chol': 0.11125922817930389, ...}
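That is what I expected. I have two questions:
How can I create a bar chart whose x-axis shows feature_names and whose y-axis shows the corresponding importances?
If possible, how can I sort the bars in descending order?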
Try this:
import pandas as pd
df = pd.DataFrame({'feature': feature_names , 'importance': importances})
df.sort_values('importance', ascending=False).set_index('feature').plot.bar(rot=0)
Demo:
d ={'BP': 0.053673644739136502,
'age': 0.014904980747733202,
'ang_0.0': 0.0,
'ang_1.0': 0.0,
'bs_0.0': 0.0,
'bs_1.0': 0.0,
'chol': 0.11125922817930389}
df = pd.DataFrame({'feature': list(d.keys()), 'importance': list(d.values())})
In [63]: import matplotlib as mpl
In [64]: mpl.style.use('ggplot')
In [65]: df.sort_values('importance', ascending=False).set_index('feature').plot.bar(rot=0)
Out[65]: <matplotlib.axes._subplots.AxesSubplot at 0x8c83748>
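If you would rather plot straight from the dictionary without going through a DataFrame, something like the following should also work. It is only a sketch: the helper names sort_importances and plot_importances are made up for illustration, and it assumes matplotlib is installed. The sorting step is plain Python; the plotting step uses matplotlib's Axes.bar.

```python
def sort_importances(importance_by_feature):
    """Return (names, values) ordered from most to least important."""
    ranked = sorted(importance_by_feature.items(),
                    key=lambda kv: kv[1], reverse=True)
    names = [name for name, _ in ranked]
    values = [value for _, value in ranked]
    return names, values


def plot_importances(importance_by_feature):
    """Draw a descending bar chart; hypothetical helper, not part of sklearn."""
    import matplotlib.pyplot as plt  # imported here so sorting works without it

    names, values = sort_importances(importance_by_feature)
    positions = range(len(names))
    fig, ax = plt.subplots()
    ax.bar(positions, values)
    ax.set_xticks(positions)
    ax.set_xticklabels(names, rotation=90)  # vertical labels to avoid overlap
    ax.set_ylabel('importance')
    fig.tight_layout()
    return ax


# Same sample values as in the demo above
d = {'BP': 0.053673644739136502,
     'age': 0.014904980747733202,
     'ang_0.0': 0.0,
     'ang_1.0': 0.0,
     'bs_0.0': 0.0,
     'bs_1.0': 0.0,
     'chol': 0.11125922817930389}

names, values = sort_importances(d)
```

Calling plot_importances(d) followed by matplotlib.pyplot.show() should display the chart. With the full dictionary from the question you would pass CLF_sorted instead of d.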