Python pandas histogram to display binning ranges of qcut
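I use qcut to bin my data into ranges, but I would like to show those output ranges in a pandas histogram.
How can I do that?
PS: the data comes from a CSV file. Link: Csv file link here
I wrote the following code: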
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.metrics import r2_score
dataset = pd.read_csv("datasets.csv")
print(dataset)
qc = pd.qcut(dataset['Active'], q=8, precision=0)
qc_val = qc.value_counts().sort_index()
print(qc_val)
The binned range output is:
(-1.0, 63.0] 5
(63.0, 212.0] 5
(212.0, 827.0] 4
(827.0, 1465.0] 8
(1465.0, 1959.0] 2
(1959.0, 4545.0] 4
(4545.0, 8594.0] 5
(8594.0, 221447.0] 5
Name: Active, dtype: int64
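So, is there any way to display a histogram based on the binned ranges above?
You can pass the bins parameter directly to the Series histogram function, for example: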
import pandas as pd
url = 'https://drive.google.com/file/d/1lYZqeYH_AtUAUG5947Bd51JXJBrOP5Lp/view?usp=sharing'
path = 'https://drive.google.com/uc?export=download&id='+url.split('/')[-2]
df = pd.read_csv(path)
df['Active'].hist(bins=8)
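Note that bins=8 asks for eight equal-width bins, which are not the quantile ranges that qcut reports. One way to make the bars follow the qcut ranges is to take the edges from qcut with retbins=True and pass them as bins. This is only a minimal sketch: the synthetic `active` series and the output file name are assumptions standing in for df['Active'] and are not part of the original post.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for df['Active']: 38 skewed synthetic values.
rng = np.random.default_rng(0)
active = pd.Series(rng.lognormal(mean=6, sigma=2, size=38), name="Active")

# retbins=True additionally returns the array of quantile bin edges.
binned, edges = pd.qcut(active, q=8, retbins=True)

fig, ax = plt.subplots()
# Passing the edges as bins makes each bar span one qcut interval.
counts, used_edges, _ = ax.hist(active, bins=edges, edgecolor="black")
ax.set_xlabel("Active")
ax.set_ylabel("Count")
fig.savefig("qcut_hist.png")  # hypothetical output file name
```

One caveat: hist treats bins as closed on the left, while qcut intervals are closed on the right, so a value lying exactly on an edge can end up in the neighbouring bar. Because quantile edges are very unevenly spaced for skewed data, the bars for the low ranges will also be much narrower than the last one.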
Alternatively, use the labels from qcut, like this:
levels = [f'Level_{i}' for i in range(8)]
df['Active_bins'] = pd.qcut(df['Active'], q=8, precision=0, labels=levels)
df.head()
# from
import matplotlib.pyplot as plt
fig,ax = plt.subplots()
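hatches = ('\\', '//', '..', '**', '||', '--', '++', 'xx')  # fill patterns (valid matplotlib hatch characters)
# Draw one histogram per qcut bin on the same axes, each with its own hatch.
for (i, d), hatch in zip(df.groupby('Active_bins', observed=True), hatches):
    d['Active'].hist(alpha=0.7, ax=ax, label=i, hatch=hatch)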
ax.legend()
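If the goal is simply to show each qcut range next to its count, another option is a bar chart of the value counts with the interval strings as tick labels. The sketch below rests on the same assumption as before: the synthetic `active` series is a stand-in for the Active column from the CSV, and the output file name is made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for dataset['Active'].
rng = np.random.default_rng(1)
active = pd.Series(rng.lognormal(mean=6, sigma=2, size=38), name="Active")

# Same binning as in the question: 8 quantile bins, counted per interval.
qc = pd.qcut(active, q=8, precision=0)
qc_val = qc.value_counts().sort_index()

# Turn the Interval index into plain strings so they can serve as tick labels.
qc_val.index = qc_val.index.astype(str)

fig, ax = plt.subplots()
qc_val.plot.bar(ax=ax, rot=45)  # one bar per qcut range
ax.set_xlabel("Active (qcut range)")
ax.set_ylabel("Count")
fig.tight_layout()
fig.savefig("qcut_counts.png")  # hypothetical output file name
```

Here every bar has the same width regardless of how wide its range is, so the range labels stay readable even when the last bin is far wider than the first.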