Easiest way to visualize a similarity matrix

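I have a similarity matrix sim_matrix and a list names with one entry per row/column. What is the easiest way to visualize this matrix (e.g. low values in light red and high values in dark red), with the rows and columns annotated with the names?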

I am currently doing this with XlsxWriter, but I am sure it is easier with matplotlib.

import xlsxwriter
workbook = xlsxwriter.Workbook('arrays.xlsx')
worksheet = workbook.add_worksheet()
f = lambda x, y: [f"{x} {no+1}" for no in range(y)] + [f"{x} mean"]
names = f("doc", len(urls))[:len(urls)] + f("wiki", len(wiki)) +  f("para", len(paragraphs)) + f("topic", len(topics))
# Write name
for i, name in enumerate(names):
    cell = i + 1
    worksheet.write(cell, 0, name)
    worksheet.write(0, cell, name)
# Write similarity values
for i, row in enumerate(sim_matrix):
    for j, element in enumerate(row):
        if j < i:
            # Here I would have to differentiate and color accordingly. But that's really annoying
            # set_bg_color() returns None, so pass the property to add_format() instead
            cell_format = workbook.add_format({"bg_color": "blue"})
            worksheet.write(i + 1, j + 1, element, cell_format)
workbook.close()

Since you are already using XlsxWriter, you could apply a 2- or 3-color scale conditional format. The colors used in the scale can be changed:

worksheet.conditional_format('B3:K12', {'type': '2_color_scale'})
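To make the color change concrete, here is a minimal sketch of the whole flow with a light-red to dark-red scale. The names and sim_matrix values are made-up stand-ins (the question's real data is not shown), write_heatmap is a hypothetical helper name, and the hex colors are just one possible choice. It writes to an in-memory buffer so it runs without touching the disk.

```python
import io

import xlsxwriter

# Hypothetical stand-in data; the question's real names and sim_matrix are not shown.
names = ["doc 1", "doc 2", "wiki 1", "para 1"]
sim_matrix = [
    [1.00, 0.80, 0.30, 0.10],
    [0.80, 1.00, 0.45, 0.20],
    [0.30, 0.45, 1.00, 0.60],
    [0.10, 0.20, 0.60, 1.00],
]


def write_heatmap(target, names, sim_matrix):
    """Write a labelled matrix to `target` and shade it with a 2-color scale."""
    workbook = xlsxwriter.Workbook(target)
    worksheet = workbook.add_worksheet()

    # Names go in the first column and in the first row.
    for i, name in enumerate(names, start=1):
        worksheet.write(i, 0, name)
        worksheet.write(0, i, name)

    # One write_row() call per matrix row, starting in column B.
    for i, row in enumerate(sim_matrix, start=1):
        worksheet.write_row(i, 1, row)

    # The range is (first_row, first_col, last_row, last_col), zero-indexed.
    n = len(names)
    worksheet.conditional_format(1, 1, n, n, {
        "type": "2_color_scale",
        "min_color": "#FFE5E5",  # light red for low values (assumed choice)
        "max_color": "#990000",  # dark red for high values (assumed choice)
    })
    workbook.close()


buffer = io.BytesIO()
write_heatmap(buffer, names, sim_matrix)
```

Passing a filename such as 'arrays.xlsx' instead of buffer saves the workbook to disk. Because Excel applies the scale itself, no per-cell format is needed.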

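For the matplotlib route mentioned in the question, something like the following sketch should be close. It again uses made-up stand-in data, assumes the similarities lie in [0, 1] (hence vmin and vmax), and picks the built-in "Reds" colormap to get the light-to-dark red effect. The output filename is arbitrary.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required

import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in data; the question's real names and sim_matrix are not shown.
names = ["doc 1", "doc 2", "wiki 1", "para 1"]
sim_matrix = np.array([
    [1.00, 0.80, 0.30, 0.10],
    [0.80, 1.00, 0.45, 0.20],
    [0.30, 0.45, 1.00, 0.60],
    [0.10, 0.20, 0.60, 1.00],
])

fig, ax = plt.subplots()

# "Reds" maps low values to light red and high values to dark red.
# vmin/vmax assume similarities in [0, 1]; adjust for other ranges.
image = ax.imshow(sim_matrix, cmap="Reds", vmin=0.0, vmax=1.0)

# Annotate both axes with the names.
ax.set_xticks(range(len(names)))
ax.set_xticklabels(names, rotation=45, ha="right")
ax.set_yticks(range(len(names)))
ax.set_yticklabels(names)

fig.colorbar(image, ax=ax, label="similarity")
fig.tight_layout()
fig.savefig("sim_matrix.png")  # arbitrary filename
```

In an interactive session, dropping the matplotlib.use("Agg") line and calling plt.show() instead of savefig displays the figure directly.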