Seaborn heatmap from group of columns?

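I need to create a heatmap with seaborn, and I can't seem to get it working or fully grasp how to do it.

Every component (row) needs to appear on the heatmap. The left side (y-axis) should show each component's EID. There are a lot of them, so labeling only 1 in every 10-20 is fine. The x-axis should show ROTATION1 ROTATION2 ROTATION3 ROTATION4 ROTATION5, representing 5 columns of the dataset. The EXTRA column here is irrelevant to the heatmap.

The values the heatmap should represent are ROT, STILL, FLIP, or any number between 160 and 180 in steps of 2 (so 160, 162, 164, etc.).

Some rows are blank in all of the columns ROTATION1 - ROTATION5, but those components should still be included in the heatmap (shown without any color).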

+--------+-------+-----------+-----------+-----------+-----------+-----------+
| EID    | EXTRA | ROTATION1 | ROTATION2 | ROTATION3 | ROTATION4 | ROTATION5 |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| AB1178 | POS   | FLIP      |           | STILL     | 172       |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| EC8361 | NEG   |           |           |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| QS7229 | POS   |           |           | 160       |           | ROT       |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| SE0447 | NEG   | ROT       | STILL     |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| YT5489 | NEG   |           |           |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| SZ2548 | NEG   | 164       |           |           | FLIP      |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| OT6892 | POS   | FLIP      |           |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| PL3811 | POS   |           |           |           | STILL     |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| WQ0893 | POS   |           |           | ROT       |           | 170       |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| TY3551 | NEG   | 160       | FLIP      |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| PC6466 | POS   |           | 180       | 176       |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| YH5912 | POS   |           |           |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| BK6245 | NEG   |           |           |           | STILL     |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| GQ2081 | POS   |           |           |           | 162       | FLIP      |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| GF8633 | NEG   |           |           |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| FJ4895 | NEG   |           | 174       |           | ROT       |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| YD2504 | POS   |           |           |           |           | 162       |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| RF3510 | POS   |           |           |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| PN6167 | NEG   |           | 168       | FLIP      |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| RB9747 | POS   | FLIP      |           | STILL     | 178       | STILL     |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| BQ0841 | NEG   |           | ROT       |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| HJ5187 | NEG   |           |           |           |           |           |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| BP2359 | POS   | 168       | STILL     |           |           | ROT       |
+--------+-------+-----------+-----------+-----------+-----------+-----------+
| FN6198 | POS   | ROT       |           |           | 172       | FLIP      |
+--------+-------+-----------+-----------+-----------+-----------+-----------+

What I tried:

df = pd.read_csv('DATA.csv')
df = pd.DataFrame(df, columns = ['EID', 'ROTATION1','ROTATION2', 'ROTATION3', 'ROTATION4', 'ROTATION5'])

num_range = list(range(160,181, 2))
direction = ['ROT', 'FLIP', 'STILL']
elements = direction + ([str(num) for num in num_range])

sensing = sns.load_dataset("df")
sensing = sensing.pivot("EID", ['EID', 'ROTATION1','ROTATION2', 'ROTATION3', 'ROTATION4', 'ROTATION5'], elements)

heatmap = sns.heatmap(sensing)

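This doesn't work, because I think the "x-axis" elements should be in the form of a single column rather than multiple rows? If anyone can show me how to get around that, it would be great!

Desired result:

A heatmap with a "color legend bar" on the right showing ROT, STILL, FLIP and the numbers between 160 and 180 in steps of 2, in that order if possible. As mentioned, the y-axis on the left should show the EIDs, but the real dataset has about 200 rows, so labeling one in every 10 or 20 rows is fine. The heatmap should have 5 columns, one for each of ROTATION1 - ROTATION5.

I'm inexperienced and just need a little help.

Using Python 2.7, pandas 0.24.2 and seaborn 0.9.1.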

First, you need to convert all the values in your data to a numeric type, for example int:

replacements = {np.nan: 157, 'FLIP': 182, 'STILL': 184, 'ROT': 187}
inv_replacements = {value: key for key, value in replacements.items()}

df = pd.read_csv(r'data/data.csv')
df = df.drop('EXTRA', axis = 1).set_index('EID')
annot = df.values

df = df.replace(replacements).astype(int)
        ROTATION1  ROTATION2  ROTATION3  ROTATION4  ROTATION5
EID                                                          
AB1178        182        157        184        172        157
EC8361        157        157        157        157        157
QS7229        157        157        160        157        187
SE0447        187        184        157        157        157
YT5489        157        157        157        157        157
SZ2548        164        157        157        182        157
OT6892        182        157        157        157        157
PL3811        157        157        157        184        157
WQ0893        157        157        187        157        170
TY3551        160        182        157        157        157
PC6466        157        180        176        157        157
YH5912        157        157        157        157        157
BK6245        157        157        157        184        157
GQ2081        157        157        157        162        182
GF8633        157        157        157        157        157
FJ4895        157        174        157        187        157
YD2504        157        157        157        157        162
RF3510        157        157        157        157        157
PN6167        157        168        182        157        157
RB9747        182        157        184        178        184
BQ0841        157        187        157        157        157
HJ5187        157        157        157        157        157
BP2359        168        184        157        157        187
FN6198        187        157        157        172        182
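
To see what this conversion does without needing the CSV file, here is a minimal sketch on three rows copied from the question's table. The `raw` frame is a hypothetical in-memory stand-in for the file, and `dtype=object` is my assumption so that blank and mixed columns behave like text columns read from a CSV.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV: three rows taken from the question's table.
raw = pd.DataFrame(
    {
        "EID": ["AB1178", "EC8361", "QS7229"],
        "EXTRA": ["POS", "NEG", "POS"],
        "ROTATION1": ["FLIP", np.nan, np.nan],
        "ROTATION2": [np.nan, np.nan, np.nan],
        "ROTATION3": ["STILL", np.nan, "160"],
        "ROTATION4": ["172", np.nan, np.nan],
        "ROTATION5": [np.nan, np.nan, "ROT"],
    },
    dtype=object,  # assumption: keep every column as generic objects
)

# Same placeholder codes as in the answer.
replacements = {np.nan: 157, "FLIP": 182, "STILL": 184, "ROT": 187}

# Drop the unused column, index by EID, then turn every cell into an int.
coded = raw.drop("EXTRA", axis=1).set_index("EID")
coded = coded.replace(replacements).astype(int)
print(coded)
```

Blank cells become 157, the three words become their codes, and numeric strings such as "172" are converted by `astype(int)`.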

Then you should map each numeric value to its label and prepare a colormap:

values = list(replacements.values())
values.extend(list(range(160, 181, 2)))
values = sorted(values)
vmap = {value: str(value) if value not in inv_replacements.keys() else inv_replacements[value] for value in values}
n = len(vmap)
cmap = sns.color_palette('tab20', n)
cmap[0] = (1, 1, 1, 1)
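
The placeholder codes 157, 182, 184 and 187 may look arbitrary, so here is a rough check of why they fit. The sketch assumes a simplified model of how a listed colormap is applied: values are scaled linearly between the smallest and largest code, and the result picks one of n equal-width color bands. `color_index` and `placeholders` are hypothetical names, and the blank-cell label is written as the string "nan" for simplicity. The labels are built as a list in sorted order, so they do not depend on dict ordering (dicts only keep insertion order from Python 3.7 on).

```python
# Numeric codes used for the non-numeric cells (same values as `replacements`).
placeholders = {157: "nan", 182: "FLIP", 184: "STILL", 187: "ROT"}
values = sorted(list(placeholders) + list(range(160, 181, 2)))
n = len(values)
vmin, vmax = values[0], values[-1]


def color_index(v):
    """Simplified model: linear normalization, then lookup in n equal bands."""
    position = (v - vmin) / float(vmax - vmin) * n
    return min(int(position), n - 1)  # the maximum falls in the last band


indices = [color_index(v) for v in values]
labels = [placeholders.get(v, str(v)) for v in values]

# One line per code: value, color band, colorbar label.
for v, i, label in zip(values, indices, labels):
    print(v, i, label)
```

Under that model each of the 15 codes lands in its own band, which is why 15 colors are enough and no two values share a color.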

I chose the 'tab20' colormap because you need 15 distinct colors, and it is one of the few colormaps that contains enough colors.
Then you can plot the heatmap:

ax = sns.heatmap(df, cmap = cmap, annot = annot, fmt = '')
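
The question also asks for only one EID label every 10 or 20 rows on a dataset of about 200 rows. A possible sketch, assuming a synthetic frame with made-up EIDs and random codes: passing an integer as `yticklabels` to `sns.heatmap` labels every n-th row. Fixing `vmin` and `vmax` is my addition, so that the color range stays the same even if some code is missing from the data.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the ~200-row dataset: hypothetical EIDs, random codes.
rng = np.random.RandomState(0)
codes = [157] + list(range(160, 181, 2)) + [182, 184, 187]
big = pd.DataFrame(
    rng.choice(codes, size=(200, 5)),
    index=["ID%04d" % i for i in range(200)],
    columns=["ROTATION%d" % i for i in range(1, 6)],
)
big.index.name = "EID"

palette = sns.color_palette("tab20", len(codes))
palette[0] = (1, 1, 1)  # white for blank cells (code 157)

# yticklabels=10 shows every 10th EID; vmin/vmax pin the color range.
ax = sns.heatmap(big, cmap=palette, vmin=157, vmax=187, yticklabels=10)
```

Call `plt.show()` from `matplotlib.pyplot` afterwards to display the figure. Use `yticklabels=20` for sparser labels.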

Finally, you need to adjust the colorbar:

colorbar = ax.collections[0].colorbar
r = colorbar.vmax - colorbar.vmin
colorbar.set_ticks([colorbar.vmin + 0.5*r/(n) + r*i/(n) for i in range(n)])
colorbar.set_ticklabels(list(vmap.values()))
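
The tick formula is easier to follow with concrete numbers. A small sketch, assuming the colorbar runs from 157 to 187 (the smallest and largest code in the sample data) and has 15 colors: it splits the range into 15 equal bands and puts one tick in the middle of each band.

```python
# Assumed colorbar limits and number of colors, taken from the sample data.
vmin, vmax, n = 157, 187, 15
r = float(vmax - vmin)  # total range covered by the colorbar

# Half a band width from the bottom, then one full band width per step.
ticks = [vmin + 0.5 * r / n + r * i / n for i in range(n)]
print(ticks)
```

Each band is 2 units wide here, so the ticks fall in the middle of each color rather than on its edge, which is where the labels should sit.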

Complete code

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np


replacements = {np.nan: 157, 'FLIP': 182, 'STILL': 184, 'ROT': 187}
inv_replacements = {value: key for key, value in replacements.items()}

df = pd.read_csv(r'data/data.csv')
df = df.drop('EXTRA', axis = 1).set_index('EID')
annot = df.values

df = df.replace(replacements).astype(int)


values = list(replacements.values())
values.extend(list(range(160, 181, 2)))
values = sorted(values)
vmap = {value: str(value) if value not in inv_replacements.keys() else inv_replacements[value] for value in values}
n = len(vmap)
cmap = sns.color_palette('tab20', n)
cmap[0] = (1, 1, 1, 1)


ax = sns.heatmap(df, cmap = cmap, annot = annot, fmt = '')

colorbar = ax.collections[0].colorbar
r = colorbar.vmax - colorbar.vmin
colorbar.set_ticks([colorbar.vmin + 0.5*r/(n) + r*i/(n) for i in range(n)])
colorbar.set_ticklabels(list(vmap.values()))
 
plt.show()
  • Heatmap with annotations, for checking:

  • Heatmap without annotations:


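I don't recommend using a continuous colormap: it can be hard to tell one value from the next.
However, if you need to, you can use a continuous colormap for all values or only for the numeric values.
(Of course, you can keep or remove the annotations.)

  • colormap 'plasma' only for the numeric values, white for NaNs, RGB for the categories: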

    cmap = sns.color_palette('plasma', n - 4)
    cmap.insert(0, (1, 1, 1, 1))
    cmap.append((1, 0, 0, 1))
    cmap.append((0, 1, 0, 1))
    cmap.append((0, 0, 1, 1))
    

  • colormap 'plasma' for all values:

    cmap = sns.color_palette('plasma', n - 1)
    cmap.insert(0, (1, 1, 1, 1))