Create heatmap in Python matplotlib with x and y labels from dict with {tuple:float} format
I have a dictionary with movie-title pairs as keys and similarity scores as values:
{('Source Code ', 'Hobo with a Shotgun '): 1.0, ('Adjustment Bureau, The ', 'Just Go with It '): 1.0, ('Limitless ', 'Arthur '): 1.0, ('Adjustment Bureau, The ', 'Kung Fu Panda 2 '): 1.0, ('Rise of the Planet of the Apes ', 'Scream 4 '): 1.0, ('Source Code ', 'Take Me Home Tonight '): 1.0, ('Midnight in Paris ', 'Take Me Home Tonight '): 1.0, ('Harry Potter and the Deathly Hallows: Part 2 ', 'Pina '): 1.0, ('Avengers, The ', 'Drive Angry '): 1.0, ('Limitless ', 'Super 8 '): 1.0, ('Harry Potter and the Deathly Hallows: Part 2 ', 'Arthur '): 1.0, ('Source Code ', 'Melancholia '): 0.6666666666666666, ('Harry Potter and the Deathly Hallows: Part 2 ', 'Jane Eyre '): 1.0, ('Avengers, The ', 'Arthur '): 0.6666666666666666, ('The Artist ', 'Attack the Block '): 1.0, ('Midnight in Paris ', 'Priest '): 1.0, ('Adjustment Bureau, The ', 'Hanna '): 1.0, ('The Artist ', 'Thor '): 1.0, ('The Artist ', 'Zeitgeist: Moving Forward '): 1.0, ('The Artist ', 'Green Hornet, The '): 1.0, ('X-Men: First Class ', 'Sanctum '): 1.0, ('Source Code ', 'Green Hornet, The '): 1.0, ('Harry Potter and the Deathly Hallows: Part 2 ', 'Something Borrowed '): 1.0, ('Adjustment Bureau, The ', 'Rio '): 1.0, ('Avengers, The ', 'Mechanic, The '): 1.0, ('Rise of the Planet of the Apes ', 'Something Borrowed '): 0.6666666666666666, ('Captain America: The First Avenger ', 'Attack the Block '): 0.6666666666666666, ('Avengers, The ', 'Zeitgeist: Moving Forward '): 1.0, ('Midnight in Paris ', 'Arthur '): 1.0, ('Source Code ', 'Arthur '): 1.0, ('Limitless ', 'Take Me Home Tonight '): 1.0, ('Midnight in Paris ', 'Win Win '): 1.0, ('X-Men: First Class ', 'Something Borrowed '): 1.0, ('Avengers, The ', 'Dilemma, The '): 1.0, ('X-Men: First Class ', 'Green Hornet, The '): 1.0, ('The Artist ', 'Just Go with It '): 1.0, ('Rise of the Planet of the Apes ', 'Arthur '): 1.0, ('Captain America: The First Avenger ', 'Lincoln Lawyer, The '): 1.0, ('X-Men: First Class ', 'Hobo with a Shotgun '): 1.0, ('Limitless ', 
'Mechanic, The '): 0.6666666666666666, ('Captain America: The First Avenger ', 'Green Hornet, The '): 1.0, ('Captain America: The First Avenger ', 'Hangover Part II, The '): 1.0, ('X-Men: First Class ', 'Hanna '): 1.0, ('Rise of the Planet of the Apes ', 'Priest '): 1.0, ('Midnight in Paris ', 'I Am Number Four '): 1.0, ('Rise of the Planet of the Apes ', 'Tree of Life, The '): 1.0, ('Captain America: The First Avenger ', 'Hanna '): 1.0, ('Harry Potter and the Deathly Hallows: Part 2 ', 'Win Win '): 1.0, ('Limitless ', 'Drive Angry '): 0.6666666666666666, ('Adjustment Bureau, The ', 'Hangover Part II, The '): 1.0}
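I want to create a heatmap with matplotlib, using the first movie in each key as the y-labels, the second movie in each key as the x-labels, and the similarity score as the z-axis.
So far I have used the following as a guide (Converting a dictionary of tuples into a numpy matrix), but it does not seem to plot the right distribution (see figure).
My current code for creating the numpy.ndarray for the z-axis looks like this: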
import numpy as np
import matplotlib.pyplot as plt
# create a heatmap
# list() is needed on Python 3, where .values()/.keys() return views
sim_scores = np.array(list(dict_sim_scores.values()))
movie_titles = np.array(list(dict_sim_scores.keys()))
## return unique movie titles and indices of the input array
unq_titles, title_idx = np.unique(movie_titles, return_inverse=True)
title_idx = title_idx.reshape(-1,2)
n = len(unq_titles)
sim_matrix = np.zeros((n, n) ,dtype=sim_scores.dtype)
sim_matrix[title_idx[:,0], title_idx[: ,1]] = sim_scores
list_item =[]
list_other_item=[]
for i,key in enumerate(dict_sim_scores):
    list_item.append(str(key[0]))
    list_other_item.append(str(key[1]))
list_item = np.unique(list_item)
list_other_item = np.unique(list_other_item)
fig = plt.figure('Similarity Scores')
ax = fig.add_subplot(111)
cax = ax.matshow(sim_matrix, interpolation='nearest')
fig.colorbar(cax)
ax.set_xticks(np.arange(len(list_item)))
ax.set_yticks(np.arange(len(list_other_item)))
#ax.set_xticklabels(list_item,rotation=40,fontsize='x-small',ha='right')
ax.xaxis.tick_bottom()
ax.set_xticklabels(list_item,rotation=40,fontsize='x-small')
ax.set_yticklabels(list_other_item,fontsize='x-small')
plt.show()
Any ideas on how to create this kind of plot?
I would use pandas to unstack the data, followed by seaborn to plot the result. Here is an example using the dictionary you provided:
import pandas as pd
ser = pd.Series(list(dict_sim_scores.values()),
                index=pd.MultiIndex.from_tuples(dict_sim_scores.keys()))
df = ser.unstack().fillna(0)
df.shape
# (10, 27)
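To make the unstack step easier to follow, here is a minimal sketch on a four-entry subset of the dictionary from the question. The names `sample` and `df_small` are only illustrative and are not part of the original answer. The first title of each key becomes a row label, the second becomes a column label, and pairs that never occur are filled with 0.

```python
import pandas as pd

# Four entries copied from the question's dictionary (illustrative subset)
sample = {
    ('Source Code ', 'Arthur '): 1.0,
    ('Source Code ', 'Melancholia '): 0.6666666666666666,
    ('Limitless ', 'Arthur '): 1.0,
    ('Avengers, The ', 'Arthur '): 0.6666666666666666,
}

# Tuple keys become a two-level index: level 0 = first title, level 1 = second title
ser = pd.Series(list(sample.values()),
                index=pd.MultiIndex.from_tuples(list(sample.keys())))

# unstack() pivots level 1 into columns; missing pairs become NaN, then 0
df_small = ser.unstack().fillna(0)
print(df_small)
```

Three distinct first titles and two distinct second titles give a 3 x 2 frame, in the same way the full dictionary gives the (10, 27) frame above.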
Now use seaborn's heatmap function to plot the result:
import seaborn as sns
sns.heatmap(df);
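Since the question asked for matplotlib specifically, the same unstacked frame can probably be drawn without seaborn. The sketch below is one possible way, not the original answer's method; `plot_similarity` and `sample` are hypothetical names introduced here. It takes both the matrix and the tick labels from the same DataFrame, so rows and columns stay aligned with their titles. The code in the question, by contrast, builds an n x n matrix over all unique titles but labels the axes from two separate, shorter lists, which is a likely reason its plot looked wrong.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_similarity(scores):
    """Draw a heatmap from a {(row_title, col_title): score} dict (hypothetical helper)."""
    ser = pd.Series(list(scores.values()),
                    index=pd.MultiIndex.from_tuples(list(scores.keys())))
    df = ser.unstack().fillna(0)  # rows: first title, columns: second title

    fig, ax = plt.subplots()
    im = ax.imshow(df.to_numpy(), interpolation='nearest', aspect='auto')
    fig.colorbar(im, ax=ax)

    # One tick per row/column, labelled from the same frame that holds the data
    ax.set_xticks(np.arange(df.shape[1]))
    ax.set_yticks(np.arange(df.shape[0]))
    ax.set_xticklabels(list(df.columns), rotation=40, ha='right', fontsize='x-small')
    ax.set_yticklabels(list(df.index), fontsize='x-small')
    return fig, ax, df


# Illustrative subset of the question's dictionary
sample = {
    ('Source Code ', 'Arthur '): 1.0,
    ('Source Code ', 'Melancholia '): 0.6666666666666666,
    ('Limitless ', 'Arthur '): 1.0,
}
fig, ax, df = plot_similarity(sample)
```

Calling `plt.show()` afterwards displays the figure. Passing the full `dict_sim_scores` instead of `sample` should give the 10 x 27 heatmap.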