Wordcloud and list of lists
I am trying to create a word cloud from the following data:
my_list = [[],
[],
[],
[],
[],
[],
[],
[],
[],
[],
[],
[],
[],
[],
[],
[],
[],
[],
[],
['EMF'],
['body'],
[],
[],
[],
[],
['water', 'juice'],
['What', 'are', 'u', 'doing'],
[],
[],
[],
[],
[],
[],
[],
['EVENT'],
['christmas'],
[],
['shalala'],
['happy'],
[]]
Normally I would do:
import numpy as np
import pandas as pd
from PIL import Image
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator
import matplotlib.pyplot as plt
corpus = " ".join(x for x in my_list)
df_wordcloud = WordCloud(background_color='white',max_font_size = 50).generate(corpus)
plt.imshow(df_wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
But this time I get the error:
TypeError: sequence item 0: expected str instance, list found
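Do you know how to create a word cloud from a list of lists?
The expression
" ".join(x for x in my_list)
is effectively equivalent to
" ".join(my_list)
so str.join receives the sublists themselves as its items instead of strings, which raises the error you see.
Instead, you want to "flatten" the inner lists by first turning each one into a string: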
corpus = " ".join(" ".join(x) for x in my_list)
Or even:
corpus = " ".join(map(" ".join, my_list))
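As a quick check of the flattening idea, here is a minimal sketch on a shortened stand-in for the question's data (the names sample, nested_join and flat_words are made up for illustration). It also shows one side effect of the nested join: each empty sublist becomes an empty string, so the result contains extra spaces. That should be harmless for WordCloud, which picks out words with a regular expression, but flattening to single words first avoids it.

```python
# Shortened stand-in for my_list from the question (hypothetical sample data)
sample = [[], ['EMF'], ['water', 'juice'], [], ['What', 'are', 'u', 'doing']]

# Nested join: each inner list becomes a string, then the strings are joined.
# Empty inner lists turn into empty strings, so extra spaces appear.
nested_join = " ".join(" ".join(words) for words in sample)

# Alternative: flatten to individual words first, then join once.
flat_words = [word for words in sample for word in words]
corpus = " ".join(flat_words)

print(repr(nested_join))
print(repr(corpus))
```

Both strings contain the same words in the same order, so either one can be passed to WordCloud(...).generate(...).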
Let's try explode:
s = pd.Series(my_list).explode().dropna()
Out[195]:
19 EMF
20 body
25 water
25 juice
26 What
26 are
26 u
26 doing
34 EVENT
35 christmas
37 shalala
38 happy
dtype: object
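The exploded Series still has to be turned into something WordCloud accepts. A minimal sketch, assuming a small made-up list (sample) in place of the question's data: join the values into one string for generate, or count them into a dict, which I believe can be passed to WordCloud's generate_from_frequencies.

```python
import pandas as pd

# Hypothetical stand-in for my_list, with a repeated word to show counting
sample = [[], ['water', 'juice'], ['water'], []]

# explode() gives one row per word; empty lists become NaN, which dropna() removes
s = pd.Series(sample).explode().dropna()

# Option 1: a single space-separated string, suitable for WordCloud.generate
corpus = " ".join(s)

# Option 2: word -> count mapping, e.g. for WordCloud.generate_from_frequencies
freqs = s.value_counts().to_dict()

print(corpus)
print(freqs)
```

The frequency route skips WordCloud's own tokenizing, so the words are kept as given rather than being filtered or split again.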