Python word count (defaultdict): column name not showing
import pandas as pd
from collections import defaultdict

word_name = []
y = 0
text_list = ['france', 'spain', 'spain beaches', 'france beaches', 'spain best beaches']
word_freq = defaultdict(int)
for text in text_list:
    for word in text.split():
        word_freq[word] += 1
        word_name.append(word)

df = pd.DataFrame.from_dict(word_freq, orient='index') \
    .sort_values(0, ascending=False) \
    .rename(columns={0: 'Word_freq'}) \
    .rename(columns={0: 'Word'})
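I have tried several ways of converting this to a DataFrame, but it does not show a column name for the words. How can I make it show one?
I'm not quite sure what you mean by "it does not show the column name for the words," but assuming you want the column and index names set properly, you can do this: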
>>> df = pd.DataFrame.from_dict(word_freq, orient='index')
>>> df = df.rename(columns={0: 'WordFreq'})
>>> df.index.name = 'Word'
>>> df
         WordFreq
Word
france          2
spain           3
beaches         3
best            1
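If what you actually want is for the words to sit in an ordinary column rather than in the index, one option is to call reset_index afterwards. This is only a minimal sketch, assuming the same text_list as in the question; the name df_flat is just for illustration.

```python
import pandas as pd
from collections import defaultdict

text_list = ['france', 'spain', 'spain beaches', 'france beaches', 'spain best beaches']

# Count every whitespace-separated word across all strings.
word_freq = defaultdict(int)
for text in text_list:
    for word in text.split():
        word_freq[word] += 1

# Words become the index; the single unnamed column 0 holds the counts.
df = pd.DataFrame.from_dict(word_freq, orient='index')
df = df.rename(columns={0: 'WordFreq'})
df.index.name = 'Word'

# Move the named index into a regular 'Word' column, then sort by count.
# kind='stable' keeps tied words in their original order.
df_flat = (
    df.reset_index()
      .sort_values('WordFreq', ascending=False, kind='stable', ignore_index=True)
)
print(df_flat)
```

Because the index was named 'Word' first, reset_index uses that name for the new column, so both columns end up with a header.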
Are you aware of the Counter class in the collections library? You can simplify your code by using it in place of the defaultdict.
from collections import Counter

text_list = ['france', 'spain', 'spain beaches', 'france beaches', 'spain best beaches']
counter_dict = Counter([split_word for word in text_list for split_word in word.split()])
# Counter({'spain': 3, 'beaches': 3, 'france': 2, 'best': 1})
Then build your DataFrame with from_dict:
df = pd.DataFrame.from_dict(
    counter_dict,
    orient="index",
    columns=["WordFreq"],
).rename_axis('Word')

         WordFreq
Word
france          2
spain           3
beaches         3
best            1
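Since the question also sorts by frequency, here is a possible variant that leans on Counter.most_common, which returns (word, count) pairs from most to least frequent. Treat it as a sketch rather than the one right way; the names df_sorted and the column labels are my own choices.

```python
import pandas as pd
from collections import Counter

text_list = ['france', 'spain', 'spain beaches', 'france beaches', 'spain best beaches']

# Count words with a generator expression instead of a nested loop.
counter_dict = Counter(word for text in text_list for word in text.split())

# most_common() yields (word, count) tuples sorted by count, highest first;
# passing them straight to DataFrame gives two ordinary, named columns.
df_sorted = pd.DataFrame(counter_dict.most_common(), columns=['Word', 'WordFreq'])
print(df_sorted)
```

Here the words live in a regular column from the start, so there is no index name to set.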