Python: How to calculate the percentage of each text value found across all rows
I have a CSV with one column and 4000 rows. I want to write a script that prints each unique word and its percentage within that CSV.
Example:
Trojan
Trojan
redirects
Exploits
Trojan
Trojan: 60%
redirects: 20%
Exploits: 20%
What is an easy/simple way to do this?
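    import csv

    myDict = {}
    total = 0
    with open('export.csv', newline='') as csvfile:
        for row in csv.reader(csvfile):
            if not row:
                continue  # skip blank lines
            word = row[0]
            total += 1
            if word in myDict:
                myDict[word] += 1
            else:
                myDict[word] = 1

    # A file object has no len(), so divide by the row count tallied above.
    for word in myDict:
        print(word, myDict[word] / total)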
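You can use set to get all unique values and count to get the number of occurrences of each. Divide that count by the length of the list to get the percentage:

    text = ['a', 'a', 'b', 'c']
    [(i, text.count(i) * 100. / len(text)) for i in set(text)]

This results in (the order of the tuples may vary, since sets are unordered):

    [('a', 50.0), ('b', 25.0), ('c', 25.0)]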
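One thing to keep in mind: text.count(i) rescans the whole list once per unique value. For 4000 rows that hardly matters, but as an alternative that is not part of the answer above, collections.Counter from the standard library tallies everything in a single pass. A minimal sketch in Python 3, where the helper name percentages is made up for illustration:

```python
from collections import Counter


def percentages(values):
    """Return {value: percentage of all items}. Hypothetical helper name."""
    counts = Counter(values)        # one pass over the data
    total = sum(counts.values())    # equals len(values) for a list
    return {value: count * 100.0 / total for value, count in counts.items()}


text = ['a', 'a', 'b', 'c']
print(percentages(text))  # {'a': 50.0, 'b': 25.0, 'c': 25.0}
```

An empty input simply returns an empty dict, because the division only runs once per counted value.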
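You can use a dictionary as follows:

    import csv

    myDict = {}
    row_number = 0
    # newline='' is what the csv module expects for files opened in text mode
    with open('some.csv', newline='') as f:
        reader = csv.reader(f, delimiter=' ')
        for row in reader:
            if not row:
                continue  # blank lines yield an empty list
            row_number += 1
            # count occurrences of the value in the first column
            if row[0] in myDict:
                myDict[row[0]] += 1
            else:
                myDict[row[0]] = 1

    # fraction of rows holding each word, e.g. 0.6 for 60%
    for word in myDict:
        print(word, myDict[word] / row_number)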
Output for the sample data (line order may differ):

    Trojan 0.6
    Exploits 0.2
    redirects 0.2
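To get output in the "Trojan: 60%" form the question asks for, the same counting idea can be combined with csv.reader and collections.Counter. This is only a sketch under a few assumptions: Python 3, the word sits in the first column, and blank rows should be ignored. The function name word_percentages is hypothetical, and an in-memory io.StringIO stands in for the real file so the example runs on its own.

```python
import csv
import io
from collections import Counter


def word_percentages(lines):
    """Tally the first column of each non-empty CSV row.

    `lines` is any iterable of text lines, e.g. an open file.
    The function name is hypothetical, chosen for this sketch.
    """
    counts = Counter(
        row[0].strip() for row in csv.reader(lines) if row and row[0].strip()
    )
    total = sum(counts.values())
    # most_common() orders by count, highest first
    return [(word, 100.0 * n / total) for word, n in counts.most_common()]


# Stand-in for the real file. With a file on disk, one would instead pass
# the object returned by open('export.csv', newline='').
sample = io.StringIO("Trojan\nTrojan\nredirects\nExploits\nTrojan\n")

for word, pct in word_percentages(sample):
    print(f"{word}: {pct:.0f}%")
```

For the five sample rows this prints Trojan: 60%, redirects: 20% and Exploits: 20%, one per line.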