Plotly clustered heatmap (with dendrogram)/Python

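I am trying to create a clustered heatmap (with a dendrogram) using plotly in Python. The one on their website does not scale well. I have found various solutions, but most are in R or JavaScript. I want a heatmap with a dendrogram on the left side only, showing the clusters along the y-axis (from hierarchical clustering). A really good-looking example is this one: https://chart-studio.plotly.com/~jackp/6748. My aim is to create something like that, but with only the left-side dendrogram. If anyone can implement something like this in Python, I would be very grateful!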

Let the data be X = np.random.randint(0, 10, size=(120, 10)).

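The suggestion below draws on elements from Dendrograms in Python and chart-studio.plotly.com/~jackp. This particular plot uses your data X = np.random.randint(0, 10, size=(120, 10)). One thing the linked approaches have in common, in my opinion, is that the datasets and data-munging steps are a bit messy. So I decided to build the figure below on a pandas dataframe, using df = pd.DataFrame(X), which hopefully makes everything a bit clearer.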

Plot

Complete code

import plotly.graph_objects as go
import plotly.figure_factory as ff

import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist, squareform

X = np.random.randint(0, 10, size=(120, 10))
df = pd.DataFrame(X)

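# Initialize figure by creating upper dendrogram, then hide its traces:
# only the side dendrogram should be shown, but the upper one still
# provides the x-axis tick positions used for the heatmap below
fig = ff.create_dendrogram(df.values, orientation='bottom')
fig.for_each_trace(lambda trace: trace.update(visible=False))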

for i in range(len(fig['data'])):
    fig['data'][i]['yaxis'] = 'y2'

# Create Side Dendrogram
# dendro_side = ff.create_dendrogram(X, orientation='right', labels = labels)
dendro_side = ff.create_dendrogram(X, orientation='right')
for i in range(len(dendro_side['data'])):
    dendro_side['data'][i]['xaxis'] = 'x2'

# Add Side Dendrogram Data to Figure
for data in dendro_side['data']:
    fig.add_trace(data)

# Create Heatmap of pairwise row distances, reordered to follow the dendrogram leaves
dendro_leaves = dendro_side['layout']['yaxis']['ticktext']
dendro_leaves = list(map(int, dendro_leaves))
data_dist = pdist(df.values)
heat_data = squareform(data_dist)
heat_data = heat_data[dendro_leaves,:]
heat_data = heat_data[:,dendro_leaves]

heatmap = [
    go.Heatmap(
        x = dendro_leaves,
        y = dendro_leaves,
        z = heat_data,
        colorscale = 'Blues'
    )
]

heatmap[0]['x'] = fig['layout']['xaxis']['tickvals']
heatmap[0]['y'] = dendro_side['layout']['yaxis']['tickvals']

# Add Heatmap Data to Figure
for data in heatmap:
    fig.add_trace(data)

# Edit Layout
fig.update_layout({'width':800, 'height':800,
                         'showlegend':False, 'hovermode': 'closest',
                         })
# Edit xaxis
fig.update_layout(xaxis={'domain': [.15, 1],
                                  'mirror': False,
                                  'showgrid': False,
                                  'showline': False,
                                  'zeroline': False,
                                  'ticks':""})
# Edit xaxis2
fig.update_layout(xaxis2={'domain': [0, .15],
                                   'mirror': False,
                                   'showgrid': False,
                                   'showline': False,
                                   'zeroline': False,
                                   'showticklabels': False,
                                   'ticks':""})

# Edit yaxis
fig.update_layout(yaxis={'domain': [0, 1],
                                  'mirror': False,
                                  'showgrid': False,
                                  'showline': False,
                                  'zeroline': False,
                                  'showticklabels': False,
                                  'ticks': ""
                        })
# Edit yaxis2
fig.update_layout(yaxis2={'domain':[.825, .975],
                                   'mirror': False,
                                   'showgrid': False,
                                   'showline': False,
                                   'zeroline': False,
                                   'showticklabels': False,
                                   'ticks':""})

fig.update_layout(paper_bgcolor="rgba(0,0,0,0)",
                  plot_bgcolor="rgba(0,0,0,0)",
                  xaxis_tickfont = dict(color = 'rgba(0,0,0,0)'))

fig.show()
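
The "Create Heatmap" step above reorders a pairwise distance matrix by the dendrogram's leaf order. As a rough way to inspect that step on its own, here is a minimal sketch using scipy directly. It assumes complete linkage on Euclidean distances, which is what create_dendrogram uses when no distfun or linkagefun is passed; the seeded generator and the names Z, order and dist_ordered are illustrative and not part of the code above.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import pdist, squareform

# Seeded stand-in for X (the seed is an assumption, for reproducibility)
rng = np.random.default_rng(0)
X = rng.integers(0, 10, size=(120, 10))

# Hierarchical clustering of the rows (assumed: complete linkage, Euclidean)
Z = linkage(pdist(X), method="complete")

# Row indices in the order the leaves appear along the dendrogram
order = leaves_list(Z)

# Square distance matrix with rows and columns permuted by the leaf order
dist = squareform(pdist(X))
dist_ordered = dist[np.ix_(order, order)]
```

Because rows and columns are permuted together, dist_ordered stays symmetric with a zero diagonal; only the arrangement of rows changes, which is what makes cluster blocks visible in the heatmap.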

  1. The simplest way to solve this is to use the dash_bio.Clustergram function from the dash_bio package.
import numpy as np
import dash_bio as dashbio

X = np.random.randint(0, 10, size=(120, 10))

dashbio.Clustergram(
    data=X,
    # row_labels=rows,
    # column_labels=columns,
    cluster='row',
    color_threshold={
        'row': 250,
        'col': 700
    },
    height=800,
    width=700,
    color_map= [
        [0.0, '#636EFA'],
        [0.25, '#AB63FA'],
        [0.5, '#FFFFFF'],
        [0.75, '#E763FA'],
        [1.0, '#EF553B']
    ]
)

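  2. A more laborious solution is to combine the plotting function plotly.figure_factory.create_dendrogram with plotly.graph_objects.Heatmap, as in the Plotly documentation. That example is not a dendrogram heatmap of the data but a pairwise-distance heatmap; however, you can use the two functions to create a dendrogram heatmap.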

You can also use seaborn's clustermap: https://seaborn.pydata.org/generated/seaborn.clustermap.html
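
A minimal sketch of the seaborn route, assuming the same kind of data as in the question. Setting col_cluster=False leaves the columns in their original order, so only the row dendrogram is drawn on the left; the colormap, figure size and seed are illustrative choices.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Seeded stand-in for the question's X (the seed is an assumption)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 10, size=(120, 10)))

# Cluster rows only: no column dendrogram is computed or drawn
g = sns.clustermap(df, row_cluster=True, col_cluster=False,
                   cmap="Blues", figsize=(8, 10))

# Row indices in the order they appear in the clustered heatmap
row_order = g.dendrogram_row.reordered_ind
```

The result is a static matplotlib figure rather than an interactive plotly one; g.savefig("clustermap.png") writes it to disk. Note that clustermap defaults to average linkage, so pass method="complete" if the tree should be comparable to create_dendrogram's default.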