Bokeh plot: sort bar plot by values, not by index
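I am working through the Bokeh tutorial from the official website: https://hub.mybinder.org/user/bokeh-bokeh-notebooks-tkmnntgc/notebooks/tutorial/00%20-%20Introduction%20and%20Setup.ipynb
and trying to produce a similar plot of crime rates in US cities from Wikipedia's 2009 data. However, I have run into some problems.
I first searched for related questions, but none of them quite matched my problem.
Questions:
1. How can I show the values on top of the vertical bars?
2. How can I sort the bars by their y-axis values rather than by the x-label index?
Here is the code: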
import pandas as pd
import requests
from bokeh.io import output_notebook, show
output_notebook()
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure
from bokeh.transform import factor_cmap
url = "https://en.wikipedia.org/wiki/List_of_United_States_cities_by_crime_rate"
response = requests.get(url)
df = pd.read_html(response.content)[1]
df = df.iloc[2:]
df.columns = ['State', 'City', 'Population', 'Total_violent',
              'Murder', 'Rape', 'Robbery', 'Assault',
              'Total_property', 'Burglary', 'Larceny', 'Motor_theft',
              'Arson']
df.index = df.index - 2 # Reset index numbers
df.index = df.City
# rename index
df.index.name = 'index'
# Change data type and sort
df['Murder'] = df['Murder'].apply(pd.to_numeric, errors='coerce')
df = df.sort_values(by='Murder', ascending=True)
# first and last 10
df = pd.concat([df.head(10), df.tail(10)])
df.index = range(20)
# create low_high column
df['low_high'] = ['low']*10 + ['high']*10
# create group of two x-axes
group = df.groupby(by=['low_high', 'City'])
# from group get source
source = ColumnDataSource(group)
# from group get figure
p = figure(plot_width=800, plot_height=300,
           title="Murder in US city per 100,000 people in 2009",
           x_range=group,
           toolbar_location=None,
           tools="")
# plot labels
p.xgrid.grid_line_color = None
p.xaxis.axis_label = "Cities"
p.yaxis.axis_label = "Murder"
p.xaxis.major_label_orientation = 1.2
# index_cmap will be used for fill_color
index_cmap = factor_cmap('low_high_City',
                         palette=['#2b83ba', '#abdda4', '#ffffbf', '#fdae61', '#d7191c'],
                         factors=df['low_high'].unique(),
                         end=1)
p.vbar(x='low_high_City',
       top='Murder_mean',
       width=1,
       source=source,
       line_color="white",
       fill_color=index_cmap,
       hover_line_color="darkgrey",
       hover_fill_color=index_cmap)
hover_cols = ['Murder','Rape','Robbery','Assault','Burglary','Larceny','Motor_theft','Arson']
for col in hover_cols:
    df[col] = df[col].apply(pd.to_numeric, errors='coerce')
tooltips = [(c,"@"+c+"_mean") for c in hover_cols]
tooltips = [("City","@City")] + tooltips
p.add_tools(HoverTool(tooltips=tooltips))
show(p)
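The order on the axis is determined entirely by the order of the factors in the plot's range.
If you want the factors to appear on the axis in a different order, you simply have to sort this list into whatever order you want and re-assign it to the range: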
p.x_range.factors = sorted_factors
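To make the idea concrete, here is a minimal sketch in plain Python of what such a list could look like for a two-level (nested) categorical axis, where each factor is a `(group, city)` tuple. The city names, the values, and the helper `sort_factors` are hypothetical and only serve as illustration; they are not taken from the question's data.

```python
# Illustrative values only; not taken from the Wikipedia table.
values = {
    ("low", "City A"): 1.2,
    ("low", "City B"): 0.4,
    ("high", "City C"): 35.0,
    ("high", "City D"): 28.5,
}

# Desired left-to-right order of the outer groups on the axis.
group_order = ["low", "high"]


def sort_factors(values, group_order):
    """Return the factors ordered by group first, then by ascending value."""
    rank = {name: i for i, name in enumerate(group_order)}
    return sorted(values, key=lambda factor: (rank[factor[0]], values[factor]))


sorted_factors = sort_factors(values, group_order)
print(sorted_factors)
```

The resulting list keeps all "low" cities together, followed by all "high" cities, each group sorted by its bar height.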
For your case, here is one way to do it (I am not a Pandas expert, so there may be better ways):
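desc = group.describe()
# Sort the cities within each group by their mean murder rate
low_cities = desc.Murder['mean']['low'].index
low_sorted = [('low', city) for city in sorted(low_cities, key=lambda x: desc.Murder['mean'][('low',)][x])]
high_cities = desc.Murder['mean']['high'].index
high_sorted = [('high', city) for city in sorted(high_cities, key=lambda x: desc.Murder['mean'][('high',)][x])]
# Low group first, then high group, re-assigned to the range
sorted_factors = low_sorted + high_sorted
p.x_range.factors = sorted_factors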
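A possibly simpler variant, offered only as a sketch: compute the per-group means directly and let pandas sort them, instead of going through `describe()`. The small DataFrame below is made-up illustrative data that mimics the `low_high`, `City`, and `Murder` columns from the question; the names `means` and `cities` are introduced here for the example.

```python
import pandas as pd

# Illustrative data only; not taken from the Wikipedia table.
df = pd.DataFrame({
    "low_high": ["low", "low", "high", "high"],
    "City": ["City A", "City B", "City C", "City D"],
    "Murder": [1.2, 0.4, 35.0, 28.5],
})

# Mean murder rate per (low_high, City) pair, indexed by a two-level MultiIndex.
means = df.groupby(["low_high", "City"])["Murder"].mean()

sorted_factors = []
for level in ["low", "high"]:
    # Select one outer group and sort its cities by ascending mean.
    cities = means.loc[level].sort_values().index
    sorted_factors.extend((level, city) for city in cities)

print(sorted_factors)
```

On the plot from the question, this list would then be assigned with `p.x_range.factors = sorted_factors`.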