How to update my Bokeh Legend to reflect Categorical Variable in Pandas Dataframe

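I'm trying to use Bokeh to build a dropdown menu that highlights the points in clusters I've found. I have the dropdown working, but now I also want to visualize another categorical variable by color: Noun Class, with levels Masc, Fem, and Neuter. The problem is that the legend doesn't update when I switch the cluster I'm visualizing. On top of that, if the first cluster I visualize doesn't contain all 3 noun classes, every other cluster I view is (incorrectly) treated as having the noun classes of that first cluster. For example, if Cluster 0 is the default and only has Masc points, then every other cluster I view with the dropdown is treated as having only Masc points, even if it has Fem or Neuter points in the actual DF.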

My main question is: how can I update the legend so that it only reflects the noun classes present in 'Curr'?
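To make the goal concrete, here is a small sketch of the legend entries I would expect for each cluster. The toy frame and the name `expected_legend` are made up for illustration; only the column names match the real DF.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the real DF.
df = pd.DataFrame({
    "noun class": ["Masc", "Masc", "Fem", "Neuter", "Fem"],
    "cluster labels": ["0", "0", "1", "1", "2"],
})

# For each cluster, the legend should list only the noun classes
# that actually occur among that cluster's points.
expected_legend = {
    cluster: sorted(group["noun class"].unique())
    for cluster, group in df.groupby("cluster labels")
}

print(expected_legend)
```

So a cluster containing only Masc points should get a one-entry legend, and switching to a cluster with Fem and Neuter points should swap the legend to those two entries.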

Here is some reproducible code:

import pandas as pd
from bokeh.io import output_file, show, output_notebook, save, push_notebook
from bokeh.models import ColumnDataSource, Select, DateRangeSlider, CustomJS
from bokeh.plotting import figure, Figure, show
from bokeh.models import CustomJS 
from bokeh.layouts import row,column,layout
import random
import numpy as np
from bokeh.transform import factor_cmap
from bokeh.palettes import Colorblind
import bokeh.io
from bokeh.resources import INLINE

#Generate reproducible DF
noun_class_names = ["Masc","Fem","Neuter"]

x = [random.randint(0,50) for i in range(100)]
y = [random.randint(0,50) for i in range(100)]

rand_clusters = [str(random.randint(0,10)) for i in range(100)]
noun_classes = [random.choice(noun_class_names) for i in range(100)]
df = pd.DataFrame({'x_coord':x, 'y_coord':y,'noun class':noun_classes,'cluster labels':rand_clusters})

df.loc[df['cluster labels'] == '0', 'noun class'] = 'Masc' #ensure that cluster 0 has all same noun class to illustrate error


clusters = [str(i) for i in range(len(df['cluster labels'].unique()))]

cols1 = df#[['cluster labels','x_coord', 'y_coord']]
cols2 = cols1[cols1['cluster labels'] == '0']

Overall = ColumnDataSource(data=cols1)
Curr = ColumnDataSource(data=cols2)


#plot and the menu is linked with each other by this callback function
callback = CustomJS(args=dict(source=Overall, sc=Curr), code="""
var f = cb_obj.value
sc.data['x_coord']=[]
sc.data['y_coord']=[]
for(var i = 0; i < source.get_length(); i++){
    if (source.data['cluster labels'][i] == f){
        sc.data['x_coord'].push(source.data['x_coord'][i])
        sc.data['y_coord'].push(source.data['y_coord'][i])
        sc.data['noun class'].push(source.data['noun class'][i])
        sc.data['cluster labels'].push(source.data['cluster labels'][i])
    }
}   
sc.change.emit();
""")

menu = Select(options=clusters, value='0', title = 'Cluster #')  # create drop down menu

bokeh_p=figure(x_axis_label ='X Coord', y_axis_label = 'Y Coord', y_axis_type="linear",x_axis_type="linear") #creating figure object 

mapper = factor_cmap(field_name = "noun class", palette = Colorblind[6], factors = df['noun class'].unique()) #color mapper for noun classes

bokeh_p.circle(x='x_coord', y='y_coord', color='gray', alpha = .5, source=Overall) #plot all other points in gray
bokeh_p.circle(x='x_coord', y='y_coord', color=mapper, line_width = 1, source=Curr, legend_group = 'noun class') # plotting the desired cluster using glyph circle and colormapper

bokeh_p.legend.title = "Noun Classes"


menu.js_on_change('value', callback) # calling the function on change of selection
bokeh.io.output_notebook(INLINE)
show(layout(menu,bokeh_p), notebook_handle=True)

Thanks in advance, and I hope you have a nice day :)

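Imma keep it real with y'all... the code now works the way I want, but I'm not entirely sure what I did. I think what I did was reset the noun classes in the Curr data source, and then update the legend label field after selecting a new cluster to visualize and updating the xy coordinates. If anyone could confirm or correct me for posterity's sake, I'd greatly appreciate it :)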

Best!
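For posterity, here is a plain-Python sketch of what I believe the reset changes. It only mimics the callback loop with ordinary dicts (no Bokeh involved), and the function name `filter_cluster` and the toy data are hypothetical.

```python
def filter_cluster(overall, current, wanted, reset_all):
    """Mimic the CustomJS loop: copy the rows of `overall` whose cluster
    label equals `wanted` into `current`, after resetting either only the
    coordinate columns or every column."""
    to_reset = list(current) if reset_all else ["x_coord", "y_coord"]
    for name in to_reset:
        current[name] = []
    for i, label in enumerate(overall["cluster labels"]):
        if label == wanted:
            for name in current:
                current[name].append(overall[name][i])
    return current


# Hypothetical toy data with the same column names as the real DF.
overall = {
    "x_coord": [1, 2, 3, 4],
    "y_coord": [4, 3, 2, 1],
    "noun class": ["Masc", "Masc", "Fem", "Neuter"],
    "cluster labels": ["0", "0", "1", "1"],
}


def cluster_zero():
    """Fresh copy of the rows belonging to cluster '0' (the default view)."""
    keep = [c == "0" for c in overall["cluster labels"]]
    return {name: [v for v, k in zip(vals, keep) if k]
            for name, vals in overall.items()}


stale = filter_cluster(overall, cluster_zero(), "1", reset_all=False)
fresh = filter_cluster(overall, cluster_zero(), "1", reset_all=True)

print(stale["noun class"])  # ['Masc', 'Masc', 'Fem', 'Neuter']
print(fresh["noun class"])  # ['Fem', 'Neuter']
```

Without the reset, the first two 'noun class' entries are still the old Masc values, so the two new points line up with Masc, which matches the behaviour described in the question.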

import pandas as pd
import random
import numpy as np
from bokeh.plotting import figure, Figure, show
from bokeh.io import output_notebook, push_notebook, show, output_file, save
from bokeh.transform import factor_cmap
from bokeh.palettes import Colorblind
from bokeh.layouts import layout, gridplot, column, row
from bokeh.models import ColumnDataSource, Slider, CustomJS, Select, DateRangeSlider, Legend, LegendItem
import bokeh.io
from bokeh.resources import INLINE

#Generate reproducible DF
noun_class_names = ["Masc","Fem","Neuter"]

x = [random.randint(0,50) for i in range(100)]
y = [random.randint(0,50) for i in range(100)]

rand_clusters = [str(random.randint(0,10)) for i in range(100)]
noun_classes = [random.choice(noun_class_names) for i in range(100)]
df = pd.DataFrame({'x_coord':x, 'y_coord':y,'noun class':noun_classes,'cluster labels':rand_clusters})

df.loc[df['cluster labels'] == '0', 'noun class'] = 'Masc' #ensure that cluster 0 has all same noun class to illustrate error

clusters = sorted(df['cluster labels'].unique(), key=int) #only offer cluster labels that actually occur in the DF

cols1 = df#[['cluster labels','x_coord', 'y_coord']]
cols2 = cols1[cols1['cluster labels'] == '0']

Overall = ColumnDataSource(data=cols1)
Curr = ColumnDataSource(data=cols2)



#plot and the menu is linked with each other by this callback function
callback = CustomJS(args=dict(source=Overall, sc=Curr), code="""
var f = cb_obj.value
// reset every column that is refilled below so they all keep the same length
sc.data['x_coord'] = []
sc.data['y_coord'] = []
sc.data['noun class'] = []
sc.data['cluster labels'] = []
for(var i = 0; i < source.get_length(); i++){
    if (source.data['cluster labels'][i] == f){
        sc.data['x_coord'].push(source.data['x_coord'][i])
        sc.data['y_coord'].push(source.data['y_coord'][i])
        sc.data['noun class'].push(source.data['noun class'][i])
        sc.data['cluster labels'].push(source.data['cluster labels'][i])
    }
}

sc.change.emit();
// bokeh_p is not passed in through args, so it cannot be referenced here;
// the legend regroups in the browser because the glyph below uses legend_field
""")

menu = Select(options=clusters, value='0', title = 'Cluster #')  # create drop down menu

bokeh_p=figure(x_axis_label ='X Coord', y_axis_label = 'Y Coord', y_axis_type="linear",x_axis_type="linear") #creating figure object 

mapper = factor_cmap(field_name = "noun class", palette = Colorblind[6], factors = df['noun class'].unique()) #color mapper- sorry this was a thing that carried over from og code (fixed now)

bokeh_p.circle(x='x_coord', y='y_coord', color='gray', alpha = .05, source=Overall)

bokeh_p.circle(x = 'x_coord', y = 'y_coord', fill_color = mapper, line_color = mapper, source = Curr, legend_field = 'noun class')


bokeh_p.legend.title = "Noun Classes"



menu.js_on_change('value', callback) # calling the function on change of selection
bokeh.io.output_notebook(INLINE)
show(layout(menu,bokeh_p), notebook_handle=True)
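
For anyone landing here later, below is a trimmed-down sketch of what I think are the two changes that matter. It is not a definitive implementation. It assumes a small hand-written frame instead of the random one, uses `scatter` in place of `circle`, and the names `overall`, `curr`, `cols`, `plot` and `page` are just illustrative. As far as I understand it, `legend_group` works out the legend entries once in Python when the figure is built, while `legend_field` does the grouping in the browser, so it can follow changes to the data source. The callback also builds a brand-new dict and assigns it to `sc.data` in one go, so no column can keep stale values.

```python
import pandas as pd
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, CustomJS, Select
from bokeh.palettes import Colorblind
from bokeh.plotting import figure
from bokeh.transform import factor_cmap

# Hypothetical toy data with the same column names as the real DF.
df = pd.DataFrame({
    "x_coord": [1, 2, 3, 4, 5, 6],
    "y_coord": [6, 5, 4, 3, 2, 1],
    "noun class": ["Masc", "Masc", "Fem", "Neuter", "Fem", "Masc"],
    "cluster labels": ["0", "0", "1", "1", "2", "2"],
})
cols = list(df.columns)

# Plain dicts of lists, so neither source carries an extra index column.
overall = ColumnDataSource(data=df.to_dict("list"))
curr = ColumnDataSource(data=df[df["cluster labels"] == "0"].to_dict("list"))

# Rebuild every column from scratch and assign the whole dict at once,
# which keeps all columns the same length and triggers a redraw.
callback = CustomJS(args=dict(source=overall, sc=curr, cols=cols), code="""
const wanted = cb_obj.value;
const labels = source.data['cluster labels'];
const new_data = {};
for (const name of cols) {
    new_data[name] = [];
}
for (let i = 0; i < labels.length; i++) {
    if (labels[i] == wanted) {
        for (const name of cols) {
            new_data[name].push(source.data[name][i]);
        }
    }
}
sc.data = new_data;
""")

plot = figure(x_axis_label="X Coord", y_axis_label="Y Coord")
mapper = factor_cmap("noun class", palette=Colorblind[3],
                     factors=["Masc", "Fem", "Neuter"])

# All points in gray, then the selected cluster coloured by noun class.
plot.scatter("x_coord", "y_coord", color="gray", alpha=0.3, source=overall)
plot.scatter("x_coord", "y_coord", color=mapper, size=10, source=curr,
             legend_field="noun class")  # grouped in the browser
plot.legend.title = "Noun Classes"

menu = Select(options=sorted(set(df["cluster labels"])), value="0",
              title="Cluster #")
menu.js_on_change("value", callback)

page = column(menu, plot)  # display with bokeh.io.show(page)
```

Passing the fixed list of factors to `factor_cmap` (rather than whatever happens to be in the first cluster) keeps each noun class on the same colour no matter which cluster is selected.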