Ignoring all non-float values when converting strings to floats to sum them

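I am trying to convert all the string values to floats so that I can add them up, but my CSV also contains non-numeric strings, and the conversion fails with this error: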

    L.append([i, j, sum(i for i in map(float, filter(None, k)) if i in {0.5, 1, 2})])
    ValueError: could not convert string to float: 'l0g0dim'

How can I ignore strings that contain letters, commas, and other non-convertible characters, and sum the remaining values without errors?

Sample file3.csv

1754|2014-06-13 07:00:00|0|0.5|0
1754|2014-06-13 08:00:00|0|2|0.5
1754|2014-06-13 09:00:00|0|a0|b0|2
1278|2014-01-26 18:00:00|light subcoatal draft ...|0|0|2|0.5
1754|2014-06-04 19:00:00|0|leg dim|0|0
1754|2014-06-13 10:00:00|0|(C) fins|0|0

Code

    import csv
    import re
    import time
    from io import StringIO

    replacements = (
        ("(B)", "0"), ("(D)", "1"), ("Entrée air absente", "2"),
        ("+", "0.5"), ("++", "1"), ("+++", "2"),
        ("(S) +", "0.5"), ("(S) expi. ++", "1"), ("(S) +++", "2"),
        ("100", "0"), ("99", "0"), ("98", "0"), ("97", "0"), ("96", "0"),
        ("95", "0"), ("94", "1"), ("93", "1"),
        ("92", "1"), ("91", "1"), ("90", "1"), ("89", "1"),
        ("Bruits de transmission", "0"), ("Fatigué/épuisé", "0"), (" *", "0"),
        ("tirage sous costal", "0"), ("léger BAN", "0"),
    )

    replvalues = dict(replacements)
    regex = "|".join(map(re.escape, replvalues.keys()))
    repl = lambda x: replvalues.get(x.string[x.start():x.end()])

    with open("file3.csv", "r", encoding='utf-8') as f_in, \
         open("file4.csv", "w", encoding='utf-8') as f_out:
        for i in f_in:
            line = re.sub(regex, repl, i)
            f_out.write(line)

    with open("file4.csv", "r", encoding='utf-8') as f_in, \
         open("file5.csv", "w", encoding='utf-8') as f_out:
        L = []
        with f_in as fin:
            reader = csv.reader(fin, delimiter='|')
            for i, j, *k in reader:
                L.append([i, j, sum(i for i in map(float, filter(None, k)) if i in {0.5, 1, 2})])
            f_out.write(L)
    print(L)

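Put this part

    for i, j, *k in reader:
        L.append([i, j, sum(i for i in map(float, filter(None, k)) if i in {0.5, 1, 2})])
    f_out.write(L)

inside a try...except block that checks whether each value can be converted to a float. If it can, add it; if not, skip it and continue with the next value.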
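A minimal sketch of that idea, assuming the try...except is applied per field rather than around the whole loop (wrapping the whole `sum` would discard the entire row on the first bad field). The helper names `to_float` and `row_score` are hypothetical and do not appear in the question's code.

```python
def to_float(text):
    """Return float(text), or None when text is not a valid number.

    Hypothetical helper, not part of the question's code.
    """
    try:
        return float(text)
    except ValueError:
        return None


def row_score(fields, allowed=frozenset({0.5, 1.0, 2.0})):
    """Sum the fields that parse as floats and whose value is in `allowed`."""
    values = (to_float(field) for field in fields)
    # None (unparseable or empty field) is never in `allowed`, so it is skipped.
    return sum(v for v in values if v in allowed)
```

Inside the loop from the question this would be used as `L.append([i, j, row_score(k)])`. Because `float('')` also raises `ValueError`, the separate `filter(None, k)` step is no longer needed.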

Why filter twice?

    filter(None, k)           # filter
    (... if i in {0.5, 1, 2}) # another filter

You can filter once and simply drop the non-float values by checking each field against a set of strings holding the values you expect in the CSV:

    L.append([i, j, sum(float(x) for x in k if x in {'0.5', '1', '2'})])
                                            #|<-      filter      ->|

Incompatible string values, such as 'leg dim', never pass the filter and therefore never reach the final float call.
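As a rough end-to-end check, the single-filter version can be run against a few rows copied from the sample file. This is only a sketch: the names `SAMPLE`, `WANTED`, and `summarize` are made up for illustration, and the data is read from an in-memory `StringIO` instead of file4.csv.

```python
import csv
from io import StringIO

# Three rows copied from the sample file3.csv in the question.
SAMPLE = """\
1754|2014-06-13 08:00:00|0|2|0.5
1754|2014-06-13 09:00:00|0|a0|b0|2
1754|2014-06-04 19:00:00|0|leg dim|0|0
"""

# String forms of the scores to keep (hypothetical name).
WANTED = {'0.5', '1', '2'}


def summarize(lines):
    """Return [id, timestamp, total] for each pipe-delimited row."""
    rows = []
    for i, j, *k in csv.reader(lines, delimiter='|'):
        # Only fields that exactly match a wanted string reach float().
        total = sum(float(x) for x in k if x in WANTED)
        rows.append([i, j, total])
    return rows


result = summarize(StringIO(SAMPLE))
```

One caveat of matching on strings: the comparison is exact, so a field written as `'1.0'` or `' 2'` would be skipped even though `float` could parse it. If such spellings can occur in the data, the try...except approach is the more forgiving choice.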