Create histogram from dict of lists

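I am trying to plot how often the numbers 1, 2 and 3 occur for each key in a dictionary (the keys are named 'hat1' to 'hat10'), but I am having trouble converting my data (shown below) into a format I can plot.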

data = {'hat9': [[1, 2, 3, 1, 2]], 'hat8': [[1, 2, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]], 'hat1': [[1, 2, 3]], 'hat3': [[1, 2, 3, 1, 2, 2, 2, 1, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1]], 'hat2': [[1, 2, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]], 'hat5': [[1, 2, 3, 2, 3, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 3, 3, 3, 3, 3, 3, 1, 3, 2, 3, 2, 3, 2, 3, 3, 3, 3, 2, 3, 1, 3, 3, 3, 3]], 'hat4': [[1, 2, 3, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 3, 1, 1, 1, 2, 1, 1, 2, 1, 1, 2, 3, 1, 2, 1, 3, 2, 1, 3, 1, 1, 1, 1, 1, 1, 3, 1]], 'hat7': [[1, 2, 3, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2]], 'hat6': [[1, 2, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 3, 1, 1, 3]], 'hat10': [[1, 2, 3, 3, 3, 3, 3, 3, 1, 2, 2, 1, 2, 3, 3, 2, 3, 3, 3, 3, 3, 2, 1, 1, 3, 3, 1, 2, 2, 3, 3, 1, 3, 3, 3, 3, 3, 2, 3, 1, 3, 1, 3, 1, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 3, 3, 3, 3, 2, 1, 3, 2, 1, 3, 2, 3, 3, 1, 2, 1, 2, 3, 3, 1, 3, 2, 2, 1, 2, 3, 3, 1, 2, 3, 2, 3, 3, 1, 3, 3, 3, 3]]}

When I run DataFrame.from_dict(data) I get the following output:

In [100]: DataFrame.from_dict(data)
Out[100]: 
        hat1                                              hat10  \
0  [1, 2, 3]  [1, 2, 3, 3, 3, 3, 3, 3, 1, 2, 2, 1, 2, 3, 3, ...   

                                                hat2  \
0  [1, 2, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...   

                                                hat3  \
0  [1, 2, 3, 1, 2, 2, 2, 1, 2, 2, 2, 2, 1, 1, 1, ...   

                                                hat4  \
0  [1, 2, 3, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 3, 1, ...   

                                                hat5  \
0  [1, 2, 3, 2, 3, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, ...   

                                                hat6  \
0  [1, 2, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...   

                                            hat7  \
0  [1, 2, 3, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2]   

                                                hat8             hat9  
0  [1, 2, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...  [1, 2, 3, 1, 2]  

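I am hoping someone can help me convert the data into a more workable format that can be turned into a chart relatively easily. Thanks for your help.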

Try this:

data = {'hat9': [[1, 2, 3, 1, 2]], 'hat8': [[1, 2, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]], 'hat1': [[1, 2, 3]], 'hat3': [[1, 2, 3, 1, 2, 2, 2, 1, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1]], 'hat2': [[1, 2, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]], 'hat5': [[1, 2, 3, 2, 3, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 3, 3, 3, 3, 3, 3, 1, 3, 2, 3, 2, 3, 2, 3, 3, 3, 3, 2, 3, 1, 3, 3, 3, 3]], 'hat4': [[1, 2, 3, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 3, 1, 1, 1, 2, 1, 1, 2, 1, 1, 2, 3, 1, 2, 1, 3, 2, 1, 3, 1, 1, 1, 1, 1, 1, 3, 1]], 'hat7': [[1, 2, 3, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2]], 'hat6': [[1, 2, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 3, 1, 1, 3]], 'hat10': [[1, 2, 3, 3, 3, 3, 3, 3, 1, 2, 2, 1, 2, 3, 3, 2, 3, 3, 3, 3, 3, 2, 1, 1, 3, 3, 1, 2, 2, 3, 3, 1, 3, 3, 3, 3, 3, 2, 3, 1, 3, 1, 3, 1, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 3, 3, 3, 3, 2, 1, 3, 2, 1, 3, 2, 3, 3, 1, 2, 1, 2, 3, 3, 1, 3, 2, 2, 1, 2, 3, 3, 1, 2, 3, 2, 3, 3, 1, 3, 3, 3, 3]]}


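keys = []
values = []
for key, value in data.items():  # Python 3: items() replaces iteritems()
    keys.append(key)
    a = b = c = 0  # running counts of 1, 2 and 3
    for x in value[0]:  # value[0] unwraps the single inner list
        if x == 1:
            a += 1
        elif x == 2:
            b += 1
        elif x == 3:
            c += 1
    values.append([a, b, c])

print(keys)
print(values)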

Hope this helps. keys is ['hat9', 'hat8', etc.] and values = [[freq of 1 in 'hat9', freq of 2 in 'hat9', freq of 3 in 'hat9'], [freq of 1 in 'hat8', freq of 2 in 'hat8', freq of 3 in 'hat8'], ..] (a list of 3-item lists).
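The same counting can probably be written more compactly with collections.Counter from the standard library. The sketch below is one possible variant, not the answer's original code: the helper name count_values and its categories parameter are made up for illustration, and the data is a three-key subset of the question's dictionary to keep the example short.

```python
from collections import Counter

# A small subset of the question's data, shortened for illustration.
data = {
    'hat1': [[1, 2, 3]],
    'hat9': [[1, 2, 3, 1, 2]],
    'hat7': [[1, 2, 3, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2]],
}


def count_values(data, categories=(1, 2, 3)):
    """Hypothetical helper: map each key to [count of 1, count of 2, count of 3]."""
    result = {}
    for key, value in data.items():
        tally = Counter(value[0])  # value[0] unwraps the extra list level
        # Counter returns 0 for a category that never occurs.
        result[key] = [tally[c] for c in categories]
    return result


counts = count_values(data)
print(counts)
```

Returning a dict keeps each hat name attached to its counts, which avoids having to keep two parallel lists in step.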

If you want to create a histogram with Matplotlib, you really don't need to do much more than call its hist method for each hat. For example,

import pylab
pylab.hist(data['hat4'][0], bins=(1,2,3,4), align='left')

(You need to index at [0] because, for some reason, each of your dictionary values is a list of length 1 whose single item is itself the list of data values.)
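If the extra nesting serves no purpose, one option is to strip it once up front so that later code can skip the [0]. This is only a sketch under that assumption; the name flat is invented here and the data is a two-key subset of the question's dictionary.

```python
# Two keys from the question's data, shortened for illustration.
data = {'hat1': [[1, 2, 3]], 'hat9': [[1, 2, 3, 1, 2]]}

# Replace each one-item outer list with the inner list of values.
flat = {key: value[0] for key, value in data.items()}

print(flat['hat9'])
```

After this step, a call such as pylab.hist(flat['hat9'], bins=(1,2,3,4), align='left') would take the values directly.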

If you need to aggregate the hats in some way, you'll need to say how.

If you prefer, you can do the same with a pandas DataFrame:

import pandas as pd
df = pd.DataFrame(data)
# Each cell of df holds a whole list; .iloc[0] takes it from the single row.
pylab.hist(df['hat4'].iloc[0], bins=(1,2,3,4), align='left')
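For the "more workable format" the question asks about, one possibility is a table with one row per hat and one column per value. The sketch below assumes that layout is what is wanted, which the question does not confirm. The names flat and counts are invented for illustration, and the data is a three-key subset of the original dictionary.

```python
import pandas as pd

# A small subset of the question's data, shortened for illustration.
data = {
    'hat1': [[1, 2, 3]],
    'hat9': [[1, 2, 3, 1, 2]],
    'hat7': [[1, 2, 3, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2]],
}

# Drop the extra nesting so each key maps straight to its values.
flat = {key: value[0] for key, value in data.items()}

# Count occurrences per hat, then transpose so hats become rows.
counts = pd.DataFrame(
    {key: pd.Series(values).value_counts() for key, values in flat.items()}
).T

# Fix the column order to 1, 2, 3 and turn any missing value into a zero count.
counts = counts.reindex(columns=[1, 2, 3]).fillna(0).astype(int)

print(counts)
```

With Matplotlib installed, counts.plot(kind='bar') should then draw one group of three bars per hat.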