python pandas histogram to display binning ranges of qcut

I am using qcut to bin my data into ranges, and I would like to show the resulting ranges in a pandas histogram. How can I do that? P.S. The data is read from a CSV file.

I wrote the following code:

import matplotlib.pyplot as plt
import pandas as pd
from sklearn.metrics import r2_score

dataset = pd.read_csv("datasets.csv")
print(dataset)


qc = pd.qcut(dataset['Active'], q=8, precision=0)
qc_val = qc.value_counts().sort_index()
print(qc_val)

The binned range output is:

(-1.0, 63.0]          5
(63.0, 212.0]         5
(212.0, 827.0]        4
(827.0, 1465.0]       8
(1465.0, 1959.0]      2
(1959.0, 4545.0]      4
(4545.0, 8594.0]      5
(8594.0, 221447.0]    5
Name: Active, dtype: int64

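So, is there a way to draw a histogram based on the bin ranges above?

You can pass the bins parameter directly to the hist function of a Series, for example: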

import pandas as pd

url = 'https://drive.google.com/file/d/1lYZqeYH_AtUAUG5947Bd51JXJBrOP5Lp/view?usp=sharing'
path = 'https://drive.google.com/uc?export=download&id='+url.split('/')[-2]
df = pd.read_csv(path)
df['Active'].hist(bins=8)
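
Note that bins=8 produces eight equal-width bins, not the quantile ranges from qcut. One way to draw the histogram on the qcut ranges is to ask qcut for its bin edges with retbins=True and pass those edges to hist. The following is a minimal sketch under assumptions: the `active` Series is made-up stand-in data (not the CSV from the question), and the variable names are only illustrative.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for df['Active']: 40 distinct, strongly skewed values.
active = pd.Series(np.arange(1, 41) ** 3, name="Active")

# retbins=True makes qcut also return the bin edges (q + 1 values for q bins).
qc, edges = pd.qcut(active, q=8, retbins=True)

fig, ax = plt.subplots()
# Passing the edges as bins makes hist use the quantile ranges
# instead of eight equal-width bins.
counts, used_edges, patches = ax.hist(active, bins=edges, edgecolor="black")
ax.set_xlabel("Active")
ax.set_ylabel("Count")
```

Because quantile bins of skewed data differ greatly in width, the bars will be very uneven on a linear axis. Also, hist treats every bin except the last as closed on the left, while qcut intervals are closed on the right, so a value that falls exactly on an edge may be counted in a neighbouring bin.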

Alternatively, you can use the labels from qcut, like this:

levels = [f'Level_{i}' for i in range(8)]
df['Active_bins'] = pd.qcut(df['Active'], q=8, precision=0, labels=levels)
df.head()

import matplotlib.pyplot as plt

fig,ax = plt.subplots()

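# Fill patterns; matplotlib hatches are built from the characters / \ | - + x o O . *
hatches = ('\\', '//', '..', '**', '||', '--', '++', 'xx')

# Draw one histogram per qcut bin, each with its own hatch and legend entry
for (i, d), hatch in zip(df.groupby('Active_bins', observed=True), hatches):
    d['Active'].hist(alpha=0.7, ax=ax, label=i, hatch=hatch)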

ax.legend()
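
If the goal is mainly to show the qcut ranges themselves on the x-axis, a bar chart of the value counts may be simpler than a histogram. This is only a sketch under assumptions: the `active` Series is made-up stand-in data for the Active column, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for df['Active']; replace with the real column.
active = pd.Series(np.arange(1, 41) ** 3, name="Active")

# Same binning as in the question, counted per interval in bin order.
qc = pd.qcut(active, q=8, precision=0)
qc_val = qc.value_counts().sort_index()

# Use the interval text, e.g. "(a, b]", as the category label of each bar.
labels = [str(interval) for interval in qc_val.index]

fig, ax = plt.subplots()
ax.bar(labels, qc_val.to_numpy())
ax.tick_params(axis="x", rotation=45)  # long interval labels overlap otherwise
ax.set_xlabel("Active (qcut range)")
ax.set_ylabel("Count")
fig.tight_layout()
```

Each bar has the same width here, so the plot shows the count per range rather than the shape of the distribution.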