Easiest way to visualize a similarity matrix
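I have a similarity matrix sim_matrix. I also have a list names that holds one label for each column/row. What is the easiest way to visualize this matrix (i.e. low values in, say, light red and high values in dark red), ideally with the columns and rows annotated with the names?
I'm currently doing this with XlsxWriter, but I'm sure it's easier with matplotlib.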
import xlsxwriter

workbook = xlsxwriter.Workbook('arrays.xlsx')
worksheet = workbook.add_worksheet()

# Build labels such as "doc 1", "doc 2", ..., "doc mean"
f = lambda x, y: [f"{x} {no+1}" for no in range(y)] + [f"{x} mean"]
names = f("doc", len(urls))[:len(urls)] + f("wiki", len(wiki)) + f("para", len(paragraphs)) + f("topic", len(topics))

# Write names into the first column and the first row
for i, name in enumerate(names):
    cell = i + 1
    worksheet.write(cell, 0, name)
    worksheet.write(0, cell, name)

# Write similarity values (lower triangle only)
for i, row in enumerate(sim_matrix):
    for j, element in enumerate(row):
        if j < i:
            # Here I would have to differentiate and color accordingly. But that's really annoying
            # Pass the properties to add_format(); set_bg_color() returns None, so chaining it would lose the format
            cell_format = workbook.add_format({'bg_color': 'blue'})
            worksheet.write(i + 1, j + 1, element, cell_format)

workbook.close()
Since you are already using XlsxWriter, you can apply a 2- or 3-color scale conditional format. The colors used in the scale can be changed:
worksheet.conditional_format('B3:K12', {'type': '2_color_scale'})
Output: [image]
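For reference, here is a minimal end-to-end sketch of the color-scale approach with custom colors. The sample names and sim_matrix values, the file name similarity.xlsx, and the two hex colors are assumptions made up for illustration; the range is given as (first_row, first_col, last_row, last_col) so that it follows the size of the matrix instead of a hard-coded A1 range.

```python
import xlsxwriter

# Hypothetical sample data standing in for the question's names and sim_matrix.
names = ["doc 1", "doc 2", "wiki 1", "para 1"]
sim_matrix = [
    [1.00, 0.82, 0.35, 0.10],
    [0.82, 1.00, 0.47, 0.22],
    [0.35, 0.47, 1.00, 0.64],
    [0.10, 0.22, 0.64, 1.00],
]

workbook = xlsxwriter.Workbook("similarity.xlsx")  # assumed file name
worksheet = workbook.add_worksheet()

# Labels go in row 0 and column 0; the values start at cell (1, 1).
for i, name in enumerate(names):
    worksheet.write(i + 1, 0, name)
    worksheet.write(0, i + 1, name)

for i, row in enumerate(sim_matrix):
    for j, value in enumerate(row):
        worksheet.write(i + 1, j + 1, value)

# One conditional format covers the whole value block, so no per-cell
# format objects are needed.
n = len(names)
worksheet.conditional_format(1, 1, n, n, {
    "type": "2_color_scale",
    "min_color": "#FFE5E5",  # assumed light red for low similarity
    "max_color": "#990000",  # assumed dark red for high similarity
})

workbook.close()
```

Because the coloring is a conditional format, Excel recomputes it if the values in the sheet are edited later.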
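The question also mentions matplotlib. A minimal sketch of a heatmap with imshow might look like the following; the sample data, the output file name similarity.png, and the assumption that similarities lie in [0, 1] are all hypothetical and not taken from the original post.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical sample data standing in for the question's names and sim_matrix.
names = ["doc 1", "doc 2", "wiki 1", "para 1"]
sim_matrix = np.array([
    [1.00, 0.82, 0.35, 0.10],
    [0.82, 1.00, 0.47, 0.22],
    [0.35, 0.47, 1.00, 0.64],
    [0.10, 0.22, 0.64, 1.00],
])

fig, ax = plt.subplots(figsize=(5, 4))

# "Reds" maps low values to light red and high values to dark red.
# vmin/vmax assume the similarities lie in [0, 1].
image = ax.imshow(sim_matrix, cmap="Reds", vmin=0.0, vmax=1.0)

# Annotate both axes with the names.
ax.set_xticks(range(len(names)))
ax.set_yticks(range(len(names)))
ax.set_xticklabels(names, rotation=45, ha="right")
ax.set_yticklabels(names)

# Print each value in its cell, switching text color for readability.
for i in range(len(names)):
    for j in range(len(names)):
        value = sim_matrix[i, j]
        ax.text(j, i, f"{value:.2f}", ha="center", va="center",
                color="white" if value > 0.5 else "black")

fig.colorbar(image, ax=ax, label="similarity")
fig.tight_layout()
fig.savefig("similarity.png", dpi=150)  # assumed output file name
```

For large matrices the per-cell text loop can be dropped, since the colorbar alone conveys the scale.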