Cannot Generate Word Cloud - Python
I'm trying to create a word cloud from the word frequencies in a pandas column.
I have a data frame like this:
PageNumber Top_words_only
1 people trees like instagram ...
2 people yellow like flickrioapp people level water...
...
78 teatree instagram water leith circuits...
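I have counted the frequency of each word in the Top_words_only column and put the counts into a tuple so that wordcloud can turn the data into a visualization, like this: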
tuples = tuple([tuple(x) for x in df.Top_words_only.str.split(expand=True).stack().value_counts().reset_index().values])
print(tuples)
Output:
(('instagram', 3), ('plant', 3), ('shadow', 3), ('rise', 3), .... ('hibs', 1), ('bud', 1), ('insect', 1),
('warriston', 1), ('garage', 1))
wordcloud = WordCloud()
wordcloud.generate_from_frequencies(tuples)
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()
However, it raises an attribute error:
AttributeError: 'tuple' object has no attribute 'items'
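Does anyone know what's wrong with my code?
Use a dictionary. As the error message shows, generate_from_frequencies calls .items() on its argument, so it needs a mapping from word to frequency rather than a tuple of pairs: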
import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Build a {word: count} dict from the column
d = dict([tuple(x) for x in df.Top_words_only.str.split(expand=True).stack().value_counts().reset_index().values])

wordcloud = WordCloud()
wordcloud.generate_from_frequencies(d)
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()
Output: [word cloud image]
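If the frequencies already exist as a tuple of pairs, as in the question, they can probably be converted directly instead of being recomputed. A minimal sketch, assuming a small hypothetical sample shaped like the printed tuples:

```python
# Hypothetical sample with the same shape as the tuples printed in the question.
tuples = (("instagram", 3), ("plant", 3), ("hibs", 1))

# dict() accepts any iterable of (key, value) pairs.
frequencies = dict(tuples)

# A tuple has no .items() method, which is what caused the AttributeError;
# the dict built from it does.
print(hasattr(tuples, "items"))
print(hasattr(frequencies, "items"))
```

The resulting frequencies dict can then be passed to generate_from_frequencies in place of the tuple.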
An alternative way to build the dictionary:
from collections import Counter
d = Counter(w for x in df['Top_words_only'] for w in x.split())
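To see what the Counter approach produces without needing the full data frame, here is a small sketch. The list rows is a hypothetical stand-in for df['Top_words_only'], with made-up contents:

```python
from collections import Counter

# Hypothetical stand-in for the df['Top_words_only'] column.
rows = [
    "people trees like instagram",
    "people yellow like people water",
]

# Split every row on whitespace and count each word across all rows.
d = Counter(w for x in rows for w in x.split())

# Counter is a dict subclass, so it provides .items() as a plain dict does.
print(d.most_common(2))
```

If the real column contains missing values, x.split() would fail on them, so dropping them first (for example with df['Top_words_only'].dropna()) may be necessary.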