python pandas histogram to display binning ranges of qcut

Question

I am using qcut to bin my data into ranges, and I want to show those output ranges in a pandas histogram. How can I do that? PS: the data is read from a CSV file. Link: Csv file link here

I wrote the following code:

import matplotlib.pyplot as plt
import pandas as pd
from sklearn.metrics import r2_score

dataset = pd.read_csv("datasets.csv")
print(dataset)


qc = pd.qcut(dataset['Active'], q=8, precision=0)
qc_val = qc.value_counts().sort_index()
print(qc_val)

The binned range output is:

(-1.0, 63.0]          5
(63.0, 212.0]         5
(212.0, 827.0]        4
(827.0, 1465.0]       8
(1465.0, 1959.0]      2
(1959.0, 4545.0]      4
(4545.0, 8594.0]      5
(8594.0, 221447.0]    5
Name: Active, dtype: int64

So, is there any way to display a histogram based on the binned ranges above?

Answer 1

You can pass the bins parameter directly to the Series hist function, for example:

import pandas as pd

url = 'https://drive.google.com/file/d/1lYZqeYH_AtUAUG5947Bd51JXJBrOP5Lp/view?usp=sharing'
path = 'https://drive.google.com/uc?export=download&id='+url.split('/')[-2]
df = pd.read_csv(path)
df['Active'].hist(bins=8)
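
Note that bins=8 asks for eight equal-width bins, which are not the quantile ranges that qcut produced. One way to draw the histogram over the qcut ranges themselves might be to ask qcut for its edges with retbins=True and pass them as bins. The following is only a sketch under assumptions: the synthetic `Active` column (a skewed random sample) stands in for the real CSV, and the log-scaled x axis is my own choice to keep the narrow low-end bins visible.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Assumption: synthetic, skewed stand-in for the 'Active' column of the real CSV
rng = np.random.default_rng(0)
df = pd.DataFrame({"Active": np.ceil(rng.lognormal(mean=6, sigma=2, size=200))})

# retbins=True makes qcut also return the quantile edges it computed
qc, edges = pd.qcut(df["Active"], q=8, retbins=True)

fig, ax = plt.subplots()
# Passing the edges as bins makes each bar span one qcut interval
counts, used_edges, _ = ax.hist(df["Active"], bins=edges, edgecolor="black")
ax.set_xscale("log")  # the ranges cover several orders of magnitude
ax.set_xlabel("Active")
ax.set_ylabel("Count")
fig.tight_layout()
```

One caveat: matplotlib treats every bin except the last as left-closed, while qcut intervals are right-closed, so a value lying exactly on an edge can land in the neighbouring bar and the counts may differ slightly from qc.value_counts().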

Or, to use labels with qcut, you can do it like this:

levels = [f'Level_{i}' for i in range(8)]
df['Active_bins'] = pd.qcut(df['Active'], q=8, precision=0, labels=levels)
df.head()

# from 
import matplotlib.pyplot as plt

fig,ax = plt.subplots()

hatches = ('\\', '//', '..', '**', 'xx', 'oo', '++', '--')   # fill patterns (backslash escaped; valid matplotlib hatch strings only)

for (i, d),hatch in zip(df.groupby('Active_bins'), hatches):
    d['Active'].hist(alpha=0.7, ax=ax, label=i, hatch=hatch)

ax.legend()
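
If the goal is simply to show the count for each qcut range, labelled with the range itself, a bar chart of the value counts from the question may be closer to what was asked than a true histogram. This is a hedged sketch, not the method used above: the synthetic `Active` data is an assumption standing in for the CSV, and the axis labels are my own.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Assumption: synthetic, skewed stand-in for the 'Active' column of the real CSV
rng = np.random.default_rng(0)
df = pd.DataFrame({"Active": np.ceil(rng.lognormal(mean=6, sigma=2, size=200))})

# Same binning and counting as in the question
qc = pd.qcut(df["Active"], q=8, precision=0)
qc_val = qc.value_counts().sort_index()

fig, ax = plt.subplots()
# Convert the interval index to strings so each bar is labelled with its range
ax.bar(qc_val.index.astype(str), qc_val.to_numpy())
ax.set_xlabel("Active range (qcut bins)")
ax.set_ylabel("Count")
ax.tick_params(axis="x", labelrotation=45)
fig.tight_layout()
```

Because qcut cuts at quantiles, the bars come out at roughly equal height; the information is in the range labels on the x axis rather than in the bar heights.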
