How to find and extract values from a txt file?

Write a program that prompts for a file name, then opens that file and reads through it, looking for lines of the form:

X-DSPAM-Confidence: 0.8475

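Count these lines, extract the floating-point value from each line, compute the average of those values, and produce output as shown below. Do not use the sum() function or a variable named sum in your solution.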

Here is my code:

fname = input("Enter a file name:",)
fh = open(fname)
count = 0
# this variable is to add together all the 0.8745's in every line
num = 0
for ln in fh:
    ln = ln.rstrip()
    count += 1
    if not ln.startswith("X-DSPAM-Confidence:    ") : continue
    for num in fh:
        if ln.find(float(0.8475)) == -1:
            num += float(0.8475)
        if not ln.find(float(0.8475)) : break
    # problem: values aren't adding together and gq variable ends up being zero
gq = int(num)
jp = int(count)
avr = (gq)/(jp)
print ("Average spam confidence:",float(avr))

The problem is that when I run the code, it reports an error because the value of num is zero. So I get this:

ZeroDivisionError: division by zero

When I changed the initial value of num to None, I ran into a similar problem:

int() argument must be a string or a number, not 'NoneType'

The Python Coursera autograder also rejects it when I put this at the top of the code:

from __future__ import division

The sample data file they gave us is named "mbox-short.txt". Here is a link: http://www.py4e.com/code3/mbox-short.txt

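I edited your code as follows. I take it your task is to find the number next to X-DSPAM-Confidence:. I kept your check for identifying the X-DSPAM-Confidence: lines. Then I split each matching line on ':', took the element at index 1, and converted it to a float.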

fname = input("Enter a file name:")
fh = open(fname)
count = 0
# running total of the confidence values found on matching lines
num = 0
for ln in fh:
    ln = ln.rstrip()
    if not ln.startswith("X-DSPAM-Confidence:"):
        continue
    count += 1  # count only the matching lines
    num += float(ln.split(":")[1])  # text after the colon, e.g. " 0.8475"
avr = num / count
print("Average spam confidence:", avr)
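
To see why index 1 is the right element, here is a small sketch of what split(":") and float() do with a single sample line. The variable names line, parts and value are only for illustration.

```python
line = "X-DSPAM-Confidence: 0.8475"

parts = line.split(":")   # two pieces: the header name and " 0.8475"
value = float(parts[1])   # float() tolerates the leading space

print(parts)   # ['X-DSPAM-Confidence', ' 0.8475']
print(value)   # 0.8475
```

Index 0 holds the header name, so float(parts[0]) would raise a ValueError.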
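
  • Use with to open the file, so the file is closed automatically.
  • See the inline comments.
  • The required lines have the form X-DSPAM-Confidence: 0.6961, so split them on the space.
    • 'X-DSPAM-Confidence: 0.6961'.split(' ') creates a list in which the number is at index 1.
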
fname = input("Enter a file name:",)
with open(fname) as fh:
    count = 0
    num = 0  # collect and add each found value
    for ln in fh:
        ln = ln.rstrip()
        if not ln.startswith("X-DSPAM-Confidence:"):  # find this string or continue to next ln
            continue
        num += float(ln.split(' ')[1])  # split on the space and add the float
        count += 1  # increment count for each matching line
    avr = num / count  # compute average
    print(f"Average spam confidence: {avr}")  # print value
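
One caveat: this code divides by count, so a file with no X-DSPAM-Confidence: lines would still raise ZeroDivisionError. Below is a minimal sketch of one way to guard against that, assuming you are free to wrap the logic in a function. The name average_confidence and the choice to return None when nothing matches are my own assumptions, not part of the assignment, and the sample values are made up.

```python
import io


def average_confidence(lines):
    """Return the mean X-DSPAM-Confidence value, or None if no line matches."""
    total = 0.0  # running total (deliberately not named sum, per the assignment)
    count = 0
    for ln in lines:
        ln = ln.rstrip()
        if not ln.startswith("X-DSPAM-Confidence:"):
            continue
        total += float(ln.split(":")[1])
        count += 1
    if count == 0:
        return None  # nothing matched, so avoid dividing by zero
    return total / count


# In-memory text standing in for a real file; the values are made up.
sample = io.StringIO(
    "From: someone@example.com\n"
    "X-DSPAM-Confidence: 0.5\n"
    "X-DSPAM-Confidence: 1.0\n"
)
print("Average spam confidence:", average_confidence(sample))
```

With a real file you would pass the open file handle instead, for example inside with open(fname) as fh: and then average_confidence(fh).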