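Cumulative Normed Histogram
I wrote some code that produces a cumulative normed histogram.
How can I fix the X axis?
The histogram has an additional feature: a threshold applied to a second dimension, so it can use the information in column "B" as well as column "A".
It also lets me adjust the number "C" by which the counts are normalized.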
import pandas as pd
import numpy as np

# Data
df_1 = pd.DataFrame({'A': [1,2,1,2,3,4,2,1,4],
                     'B': [2,1,2,1,2,3,4,2,1]})

# Cumulative Normed Histogram
bins = np.arange(0, 5, .2)
df_1['A_Bin'] = pd.cut(df_1['A'], bins=bins)

# Apply a threshold to B
df_2 = df_1[df_1['B'] > 2]

# Get the number of rows (the normalization constant)
C = len(df_1.index)

def fun(g):
    # Fraction of all rows that fall into this bin
    try:
        return float(g.shape[0]) / C
    except ZeroDivisionError:
        return np.nan

# observed=False keeps the empty bins so both curves cover every bin
hist = df_1.groupby('A_Bin', observed=False).apply(fun)
hist_2 = df_2.groupby('A_Bin', observed=False).apply(fun)
hist_cum = hist.cumsum()
hist_2_cum = hist_2.cumsum()
hist_cum.plot()
hist_2_cum.plot()
I tried this:
import matplotlib.pyplot as plt
plt.xticks((0,2,4,6),('0','2','4','6'))
But got this:
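A likely explanation for the odd ticks: the grouped result is indexed by bin intervals rather than by numbers, and pandas draws a non-numeric index at the ordinal positions 0, 1, 2, ..., so `plt.xticks((0,2,4,6), ...)` relabels the 1st, 3rd, 5th and 7th bins instead of the values 0, 2, 4 and 6. The sketch below only inspects the index to illustrate this. It uses `value_counts` instead of the `groupby` from the question, and the names `counts`, `first_label` and `right_edges` are illustrative, not part of the original code.

```python
import numpy as np
import pandas as pd

df_1 = pd.DataFrame({'A': [1, 2, 1, 2, 3, 4, 2, 1, 4],
                     'B': [2, 1, 2, 1, 2, 3, 4, 2, 1]})
bins = np.arange(0, 5, .2)

# Count rows per bin; sort=False keeps the bins in their natural order
counts = pd.cut(df_1['A'], bins=bins).value_counts(sort=False)

# The index holds Interval objects, not numbers
first_label = counts.index[0]

# Numeric right edge of every bin, usable as a real X coordinate
right_edges = np.array([interval.right for interval in counts.index])
```

Plotting against `right_edges` (or the left edges) instead of the interval labels gives an axis whose ticks correspond to actual values of "A".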
I just needed to get the ticks and put them into a DataFrame, like this:
import pandas as pd
import numpy as np

# Data
df_1 = pd.DataFrame({'A': [1,2,1,2,3,4,2,1,4],
                     'B': [2,1,2,1,2,3,4,2,1]})

# Cumulative Normed Histogram
Min = 0
Max = 6
Step = .5
bins = np.arange(Min, Max, Step)
df_1['A_Bin'] = pd.cut(df_1['A'], bins=bins)

# Apply a threshold to B
df_2 = df_1[df_1['B'] > 2]

# Get the number of rows (the normalization constant)
C = len(df_1.index)

def fun(g):
    # Fraction of all rows that fall into this bin
    try:
        return float(g.shape[0]) / C
    except ZeroDivisionError:
        return np.nan

# observed=False keeps the empty bins so both curves cover every bin
hist = df_1.groupby('A_Bin', observed=False).apply(fun)
hist_2 = df_2.groupby('A_Bin', observed=False).apply(fun)
hist_cum = hist.cumsum()
hist_2_cum = hist_2.cumsum()

# Put the Histogram in a Dataframe
df_hist_cum = hist_cum.to_frame()
df_hist_2_cum = hist_2_cum.to_frame()

# Define the Ticks: the left edge of each bin, one value per bin
ticks = np.arange(Min, (Max-Step), Step)
df_hist_cum['X'] = ticks
df_hist_2_cum['X'] = ticks

# Rename the columns: the histogram values first, then the numeric X values ('A')
df_hist_cum.columns = ['All', 'A']
df_hist_2_cum.columns = ['2', 'A']

# Plot against the numeric column so the X axis shows real values of A
ax = df_hist_cum.plot(x='A', y='All')
df_hist_2_cum.plot(x='A', y='2', ax=ax)
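An alternative sketch of the same idea, in case it is useful: compute the cumulative fractions with `np.histogram` and plot them against the numeric bin edges, so the X axis can be fixed directly with `set_xlim` and `set_xticks`. This is not the method used above. The helper `cumulative_normed_hist` and the names `edges`, `cum_all` and `cum_thr` are made up for illustration. Note also that `np.histogram` bins are closed on the left (`[a, b)`), whereas `pd.cut` defaults to bins closed on the right (`(a, b]`), so values sitting exactly on an edge land one bin later than in the code above.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt

df_1 = pd.DataFrame({'A': [1, 2, 1, 2, 3, 4, 2, 1, 4],
                     'B': [2, 1, 2, 1, 2, 3, 4, 2, 1]})

def cumulative_normed_hist(values, edges, norm):
    """Cumulative count per bin, divided by the constant `norm`."""
    counts, _ = np.histogram(values, bins=edges)
    return counts.cumsum() / norm

edges = np.linspace(0, 6, 13)   # bin edges 0, 0.5, ..., 6
C = len(df_1)                   # normalize both curves by the total row count

cum_all = cumulative_normed_hist(df_1['A'], edges, C)
cum_thr = cumulative_normed_hist(df_1.loc[df_1['B'] > 2, 'A'], edges, C)

fig, ax = plt.subplots()
# Plot each cumulative value at the right edge of its bin
ax.plot(edges[1:], cum_all, label='All')
ax.plot(edges[1:], cum_thr, label='B > 2')

# Fix the X axis in data units
ax.set_xlim(0, 6)
ax.set_xticks([0, 2, 4, 6])
ax.legend()
```

Because the X coordinates are real numbers here, the tick positions passed to `set_xticks` refer to values of "A" rather than to bin positions.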