Merging Bokeh USCounties sample data with column values from my own dataframe
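I'm using Python 3 with the latest version of Bokeh.
I have imported everything necessary, but I'm stuck on what is (I hope) a single line of code.
I'm using the US counties sample data. I want to hover over the map and have each county show its vote percentage as the cursor passes over it.
I've searched here for other Bokeh examples, and specifically for questions about the US counties data, but I only seem to find questions about the map shapes.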
from bokeh.models import LogColorMapper
from bokeh.palettes import Viridis6 as palette
from bokeh.plotting import figure, show
from bokeh.sampledata.us_counties import data as counties

palette = tuple(reversed(palette))
color_mapper = LogColorMapper(palette=palette)

# Keep only the Texas counties.
counties = {
    code: county for code, county in counties.items() if county['state'] == 'tx'
}

county_xs = [county['lons'] for county in counties.values()]
county_ys = [county['lats'] for county in counties.values()]
county_names = [county['name'] for county in counties.values()]

## Below is the variable I wish to create, and these are the columns and dataframe of importance.
#county_vote_total =
#texasJbFinal['County Vote Percentage'] - where the vote percentages are
#texasJbFinal['County'] - what my own df county column is labelled as.

data = dict(
    x=county_xs,
    y=county_ys,
    name=county_names,
    voteP=county_vote_total
)

TOOLS = "pan,wheel_zoom,reset,hover,save"

p = figure(
    title='Joe Biden Texas Vote Percentage',
    tools=TOOLS,
    x_axis_location=None, y_axis_location=None,
    tooltips=[
        ("Name", "@name"), ("Vote Percentage", "@voteP"), ("Long, lat", "($x, $y)")
    ]
)
p.grid.grid_line_color = None
p.hover.point_policy = "follow_mouse"

p.patches("x", "y", source=data, fill_color={"field": "voteP", "transform": color_mapper},
          fill_alpha=0.6, line_color="black", line_width=0.5)

show(p)
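I have tried a few things, but I can't work out how to match each county in my texasJbFinal dataframe with bokeh.sampledata.us_counties so that each county's vote percentage shows up on hover.
Here is a sample of my DF, from texasJbFinal.head(5).to_dict():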
{'State': {0: 'Texas', 1: 'Texas', 2: 'Texas', 3: 'Texas', 4: 'Texas'},
'County': {0: 'Roberts County',
1: 'Borden County',
2: 'King County',
3: 'Glasscock County',
4: 'Armstrong County'},
'Candidate': {0: 'Joe Biden',
1: 'Joe Biden',
2: 'Joe Biden',
3: 'Joe Biden',
4: 'Joe Biden'},
'Total Votes': {0: 17, 1: 16, 2: 8, 3: 39, 4: 75},
'County Vote Percentage': {0: 3.091, 1: 3.846, 2: 5.031, 3: 5.972, 4: 6.745},
'Total Population': {0: 912, 1: 697, 2: 315, 3: 2171, 4: 2122},
'White Alone': {0: 782, 1: 598, 2: 234, 3: 1003, 4: 1833},
'White Alone Percent': {0: 85.74561403508771,
1: 85.79626972740316,
2: 74.28571428571428,
3: 46.19990787655458,
4: 86.38077285579642},
'Black or African American Alone': {0: 0, 1: 0, 2: 0, 3: 0, 4: 5},
'Black or African American Alone Percent': {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.23562676720075398},
'American Indian and Alaska Native Alone': {0: 0, 1: 0, 2: 0, 3: 0, 4: 22},
'Asian Alone': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0},
'Native Hawaiian and Other Pacific Islander Alone': {0: 0,
1: 0,
2: 0,
3: 0,
4: 0},
'Some other race Alone': {0: 0, 1: 0, 2: 3, 3: 507, 4: 42},
'Two or more races': {0: 23, 1: 15, 2: 0, 3: 0, 4: 71},
'Hispanic or Latino Alone': {0: 107, 1: 84, 2: 78, 3: 661, 4: 149},
'Hispanic or Latino Alone Percent': {0: 11.732456140350877,
1: 12.051649928263988,
2: 24.76190476190476,
3: 30.446798710271764,
4: 7.021677662582469}}
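Here is how I would approach it:
- Convert the Bokeh county data into a DataFrame so it can be merged with your existing df. Something like:
    bokeh_counties = pd.DataFrame.from_records([county for key, county in counties.items()])
  ...then you will need some regex matching or other text manipulation before the merge, because your values all have "County" appended and the ones in the Bokeh dataset do not.
- Once you have a merged DataFrame with the data you need, convert it to a ColumnDataSource for the Bokeh glyphs and hover tool to use. A CDS isn't strictly required for many Bokeh tasks, but it tends to make things much easier.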
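The two steps above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the `counties` dict here is a tiny stand-in shaped like `bokeh.sampledata.us_counties.data` (the keys and coordinates are made up), `texasJbFinal` is cut down to two rows, and the regular expression assumes every name in the `County` column ends in " County".

```python
import pandas as pd

# Stand-in for the Texas-filtered Bokeh sample data.
# Keys and coordinates are invented for illustration only.
counties = {
    (48, 1): {"name": "Roberts", "state": "tx",
              "lons": [-101.0, -100.5, -100.5], "lats": [36.0, 36.0, 35.6]},
    (48, 2): {"name": "Borden", "state": "tx",
              "lons": [-101.7, -101.2, -101.2], "lats": [32.9, 32.9, 32.5]},
    (48, 3): {"name": "King", "state": "tx",
              "lons": [-100.5, -100.0, -100.0], "lats": [33.8, 33.8, 33.4]},
}
bokeh_counties = pd.DataFrame.from_records(list(counties.values()))

# Two-row stand-in for the asker's dataframe.
texasJbFinal = pd.DataFrame({
    "County": ["Roberts County", "Borden County"],
    "County Vote Percentage": [3.091, 3.846],
})

# Strip the trailing " County" so the values line up with Bokeh's "name" field.
votes = texasJbFinal.assign(
    name=texasJbFinal["County"].str.replace(r"\s+County$", "", regex=True)
)

# A left merge keeps every county shape, in the original order;
# indicator=True adds a "_merge" column that flags rows with no vote data.
merged = bokeh_counties.merge(
    votes[["name", "County Vote Percentage"]],
    on="name",
    how="left",
    indicator=True,
)

unmatched = merged.loc[merged["_merge"] == "left_only", "name"].tolist()
print(unmatched)
```

Checking `unmatched` is worthwhile because county names may be spelled differently in the two sources, and any county left unmatched gets NaN for its percentage. Filtering to a single state before merging also matters, since county names repeat across states. After dropping `_merge` (and perhaps renaming the percentage column to `voteP` so the question's tooltip still works), the frame can be passed to `ColumnDataSource`.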
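Thanks for your help. I didn't follow your route exactly, but it gave me the inspiration to solve the problem.
I turned the county dictionary into a DataFrame, did a bit of text manipulation, merged it with my original pandas DataFrame, turned it all back into a dictionary, and after that everything was very simple.
Thanks again for the great answer :)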
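The last step of that round trip, going from a merged DataFrame back to the dictionary the plotting code expects, might look something like this. It is only a sketch and not the poster's actual code: the `merged` frame below is a hypothetical stand-in with made-up coordinates, and the column names `lons`, `lats`, `name` and `voteP` are assumptions chosen to match the question.

```python
import math

import pandas as pd

# Hypothetical result of merging the county shapes with the vote data.
# Coordinates are invented; the third county has no matching vote row.
merged = pd.DataFrame({
    "name": ["Roberts", "Borden", "King"],
    "lons": [[-101.0, -100.5, -100.5], [-101.7, -101.2, -101.2], [-100.5, -100.0, -100.0]],
    "lats": [[36.0, 36.0, 35.6], [32.9, 32.9, 32.5], [33.8, 33.8, 33.4]],
    "voteP": [3.091, 3.846, float("nan")],
})

# Rename to the keys used by p.patches("x", "y", ...) and the tooltips,
# then convert each column to a plain list.
data = (
    merged[["lons", "lats", "name", "voteP"]]
    .rename(columns={"lons": "x", "lats": "y"})
    .to_dict(orient="list")
)

print(list(data))
```

Because every list comes from the same rows of one frame, the shapes, names and percentages stay aligned, which is what the hover tool relies on.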