Retrieve leaf colors from scipy dendrogram

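I can't get the leaf colors from the scipy dendrogram dictionary. As stated in the documentation and in this github issue, the color_list key in the dendrogram dictionary refers to the links, not the leaves. It would be nice to have another key that refers to the leaves, since you sometimes need it to color other kinds of plots, such as the scatter plot in the example below.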

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# DATA EXAMPLE
x = np.array([[ 5, 3],
              [10,15],
              [15,12],
              [24,10],
              [30,30],
              [85,70],
              [71,80]])

# DENDROGRAM
plt.figure()
plt.subplot(121)
z = linkage(x, 'single')
d = dendrogram(z)

# COLORED PLOT
# This is what I would like to achieve. Colors are assigned manually by looking
# at the dendrogram, because I failed to get it from d['color_list'] (it refers 
# to links, not observations)
plt.subplot(122)
points = d['leaves']
colors = ['r','r','g','g','g','g','g']
for point, color in zip(points, colors):
    plt.plot(x[point, 0], x[point, 1], 'o', color=color)
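The mismatch between links and leaves can be checked directly. The sketch below reuses the seven points from the example and passes no_plot=True so that nothing is drawn. The variable names n_leaves and n_links are only illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

x = np.array([[5, 3], [10, 15], [15, 12], [24, 10],
              [30, 30], [85, 70], [71, 80]])
d = dendrogram(linkage(x, 'single'), no_plot=True)  # compute only, draw nothing

n_leaves = len(d['leaves'])      # one entry per observation
n_links = len(d['color_list'])   # one entry per merge (one U-shaped link)
print(n_leaves, n_links)         # prints: 7 6
```

With n observations there are n - 1 links, so d['color_list'] cannot simply be zipped with d['leaves'].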

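With only seven points, assigning the colors by hand is easy, but I'm dealing with huge datasets. Until this new feature (leaf colors) is available in the dictionary, I'm trying to infer it somehow from the information the dictionary currently contains, but so far I'm out of ideas. Can anyone help me?

Thanks.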

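The following approach seems to work. The dictionary returned by dendrogram contains 'color_list' with the colors of the links, and 'icoord' and 'dcoord' with the x and y coordinates, respectively, at which those links are drawn. Wherever a link starts at a leaf, its x position is 5, 15, 25, ... So testing these x positions takes us from a link back to the corresponding point, and lets us assign the color of the link to that point.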

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# DATA EXAMPLE
x = np.random.uniform(0, 10, (20, 2))

# DENDROGRAM
plt.figure()
plt.subplot(121)
z = linkage(x, 'single')
d = dendrogram(z)
plt.yticks([])

# COLORED PLOT
plt.subplot(122)
points = d['leaves']
colors = ['none'] * len(points)
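for xs, ys, c in zip(d['icoord'], d['dcoord'], d['color_list']):
    for xi, yi in zip(xs, ys):
        # A link endpoint is a leaf only if it sits at height 0 and at
        # x = 5, 15, 25, ...; an internal node can also land on such an x.
        if yi == 0 and xi % 10 == 5:
            colors[(int(xi) - 5) // 10] = c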
for point, color in zip(points, colors):
    plt.plot(x[point, 0], x[point, 1], 'o', color=color)
    plt.text(x[point, 0], x[point, 1], f' {point}')
plt.show()
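The same x-position test can be wrapped in a small function that needs no figure. This is a minimal sketch: leaf_colors_from_links is a hypothetical helper name, not a scipy function. It also checks that the endpoint sits at height 0, on the assumption that only leaf endpoints are drawn at y = 0 (true when no two observations are merged at distance zero).

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram


def leaf_colors_from_links(d):
    """Hypothetical helper: colors parallel to d['leaves'], inferred from links."""
    colors = [None] * len(d['leaves'])
    for xs, ys, c in zip(d['icoord'], d['dcoord'], d['color_list']):
        for xi, yi in zip(xs, ys):
            # Leaf endpoints sit at height 0 and at x = 5, 15, 25, ...
            if yi == 0 and xi % 10 == 5:
                colors[(int(xi) - 5) // 10] = c
    return colors


x = np.array([[5, 3], [10, 15], [15, 12], [24, 10],
              [30, 30], [85, 70], [71, 80]])
d = dendrogram(linkage(x, 'single'), no_plot=True)  # no figure needed

# leaf_colors[k] is the color of observation d['leaves'][k]
leaf_colors = leaf_colors_from_links(d)
```

On scipy versions that provide d['leaves_color_list'], the result can be compared against it as a sanity check.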

PS: This post about matching points with their clusters may also be relevant.

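With scipy 1.7.1, the new feature has been implemented: the dictionary returned by the dendrogram function also contains an entry 'leaves_color_list', which makes this task easy.

Here is the OP's code made to work (see the line under the "NEW CODE" comment):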

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# DATA EXAMPLE
x = np.array([[ 5, 3],
              [10,15],
              [15,12],
              [24,10],
              [30,30],
              [85,70],
              [71,80]])

# DENDROGRAM
plt.figure()
plt.subplot(121)
z = linkage(x, 'single')
d = dendrogram(z)

# COLORED PLOT
# d['leaves_color_list'][k] is the color of leaf d['leaves'][k], so the
# scatter plot can be colored directly from the dendrogram output.
plt.subplot(122)

# NEW CODE
plt.scatter(x[d['leaves'], 0], x[d['leaves'], 1], color=d['leaves_color_list'])
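When the colors are needed in the original observation order (for example, to color rows of x in some other plot), the leaf-ordered list can be rearranged. A minimal sketch, assuming a scipy version that provides 'leaves_color_list'; the names color_of and obs_colors are only illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

x = np.array([[5, 3], [10, 15], [15, 12], [24, 10],
              [30, 30], [85, 70], [71, 80]])
d = dendrogram(linkage(x, 'single'), no_plot=True)  # compute only, draw nothing

# d['leaves_color_list'][k] is the color of observation d['leaves'][k].
color_of = dict(zip(d['leaves'], d['leaves_color_list']))

# obs_colors[i] is the color of x[i], in the original row order.
obs_colors = [color_of[i] for i in range(len(x))]
```

With this list, plt.scatter(x[:, 0], x[:, 1], color=obs_colors) colors the points without reindexing x.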