How to return the sum of the values for a given .csv file?
I have a .csv file like this, containing words and values:
string1, 102, 90, 23
string2, 89, 45, 21
...
hi, 1, 3, 5
example, 2, 0, 2
someone, 1, 1, 1
hope, 0, 0, 0
stringN, 923, 23892, 9292
stringnN-1, 2903, 49058, 4859
I also have a large list of sentences like this:
lis__ = [[Hi this is an example, this site is nice!.],...,[I hope someone can help]]
How can I return the sum of the values for each word that appears in lis__? For the example above, the output would look like this:
For the first sublist:
[Hi this is an example, this site is nice!.]
In:
hi, 1, 3, 5
example, 2, 0, 2
someone, 1, 1, 1
hope, 0, 0, 0
Then add value one with value one, two with two and three with three:
Out:
[(3,3,7)]
Then, for the second sublist, add value one with value one, two with two and three with three:
In:
[I hope someone can help]
hi, 1, 3, 5
example, 2, 0, 2
someone, 1, 1, 1
hope, 0, 0, 0
Out:
[(1,1,1)]
Finally:
[(3,3,7),...,(1,1,1)]
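where ... stands for any number of further strings or tuples. Maybe this task can be done with the csv module. Any idea how to approach this? Thanks in advance, everyone!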
How about:
import csv
import re

class Score(object):
    """Holds one row of values and supports element-wise += with another Score."""
    def __init__(self, *args):
        self.lst = args

    def __repr__(self):
        return str(tuple(self.lst))

    def __iadd__(self, other):
        # Add value one with value one, two with two, three with three
        new = [a + b for a, b in zip(self.lst, other.lst)]
        return Score(*new)

lis__ = [
    'Hi this is an example, this site is nice!.',
    'I hope someone can help',
]

# Build word_scores dictionary, keyed by word
word_scores = {}
with open('yourcsv.csv') as f:
    reader = csv.reader(f)
    for line in reader:
        # int() tolerates the space that follows each comma in the file
        word_scores[line[0].lower()] = Score(*map(int, line[1:]))

# Loop over lis__, computing the total score for each element (elem_score),
# append it to line_scores
line_scores = []
for elem in lis__:
    elem_score = Score(0, 0, 0)
    for word in re.split(r'[^\w]+', elem):
        try:
            score = word_scores[word.lower()]
            print(" Found: %s %s" % (word.lower(), score))
            elem_score += score
        except KeyError:
            # Word is not in the .csv file, so it contributes nothing
            pass
    print("%s : %s" % (elem_score, elem))
    line_scores.append(elem_score)

print()
print("Line Scores:")
print(line_scores)
Output:
Found: hi (1, 3, 5)
Found: example (2, 0, 2)
(3, 3, 7) : Hi this is an example, this site is nice!.
Found: hope (0, 0, 0)
Found: someone (1, 1, 1)
(1, 1, 1) : I hope someone can help
Line Scores:
[(3, 3, 7), (1, 1, 1)]
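If the helper class feels like more than you need, the same column-wise addition can probably be done with plain tuples and zip. This is only a minimal sketch: load_scores and score_sentence are names made up for illustration, io.StringIO stands in for the real file so the snippet runs on its own, and skipinitialspace=True is an assumption based on the sample rows having a space after each comma.

```python
import csv
import io
import re

# Stand-in for the real .csv file (assumption: three integer columns per word).
SAMPLE_CSV = """hi, 1, 3, 5
example, 2, 0, 2
someone, 1, 1, 1
hope, 0, 0, 0
"""

def load_scores(f):
    """Map each lower-cased word to a tuple of its integer values."""
    reader = csv.reader(f, skipinitialspace=True)
    return {row[0].lower(): tuple(int(v) for v in row[1:])
            for row in reader if row}  # 'if row' skips blank lines

def score_sentence(sentence, scores, width=3):
    """Sum the values column by column for every known word in the sentence."""
    total = (0,) * width
    for word in re.findall(r"\w+", sentence.lower()):
        if word in scores:
            # zip pairs value one with value one, two with two, and so on
            total = tuple(a + b for a, b in zip(total, scores[word]))
    return total

lis__ = [
    'Hi this is an example, this site is nice!.',
    'I hope someone can help',
]

scores = load_scores(io.StringIO(SAMPLE_CSV))
line_scores = [score_sentence(s, scores) for s in lis__]
print(line_scores)
```

With a real file you would pass the handle from open('yourcsv.csv', newline='') to load_scores instead of the StringIO object. A word that appears twice in a sentence is counted twice here, same as in the class-based version.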