Python file reading problem, possible infile loop?

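The problem is as follows: "Write a Python program to read a file containing lake and fish data and report the lake identification number, the lake name, and the fish weight in a tabular format (use string zones with formatting). The program should calculate the average fish weight reported."

Lake identification: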

1000 Chemo
1100 Greene
1200 Toddy

The file I have to read, "FishWeights.txt", contains the following data:

1000 4.0
1100 2.0
1200 1.5
1000 2.0
1000 2.2
1100 1.9
1200 2.8

My code:

f = open("fishweights.txt")
print(f.read(4), "Chemo", f.readline(4))
print(f.read(5), "Greene", f.read(5))
print(f.read(4), "Toddy", f.read(5))
print(f.read(5), "Chemo", f.read(4))
print(f.read(5), "Chemo", f.read(4))
print(f.read(5), "Greene", f.read(4))
print(f.read(5), "Toddy", f.read(4))

The output I receive is:

1000 Chemo  4.0

1100 Greene  2.0

1200 Toddy  1.5

1000  Chemo 2.0

1000  Chemo 2.2

1100  Greene 1.9

1200  Toddy 2.8 

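That is correct, in that I have to display the lake ID number, the name, and the fish weight for each lake. But I need to be able to do a calculation that averages the weight of all the fish at the end. The output should be neatly formatted, like this: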

1000     Chemo      4.0
1100     Greene     2.0
1200     Toddy      1.5
1000     Chemo      2.0
1000     Chemo      2.2
1100     Greene     1.9
1200     Toddy      2.8
The average fish weight is: 2.34

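Any help is appreciated. I'm just a beginner looking for help to fully understand the subject. Thanks!

Yes, you need to iterate over the lines. This is the structure you are looking for: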

with open("fishweights.txt") as fo:
    for line in fo:
        pass

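Now, to retrieve each part of a line you can use line.split(). Reading a fixed number of bytes (as you did) is fine as long as the IDs have a fixed length. Are you sure every ID always has exactly 4 digits? Something like this is probably better: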

raw_data = []
with open("fishweights.txt") as fo:
    for line in fo:
        row = line.strip().split()
        if not row:
            continue  # ignore empty lines
        id = int(row[0])
        no = float(row[1])
        raw_data.append((id, no))
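
To see why splitting on whitespace is more forgiving than reading a fixed number of characters, here is a small sketch. The sample lines are made up for illustration: the second has a five-digit ID and the third has extra spaces, neither of which appears in the real file.

```python
# Hypothetical sample lines, not taken from FishWeights.txt.
samples = ["1000 4.0\n", "10000 12.5\n", "  1200   2.8  \n"]

parsed = []
for line in samples:
    # split() with no argument ignores runs of whitespace and the trailing newline
    lake_id, weight = line.split()
    parsed.append((int(lake_id), float(weight)))

print(parsed)
```

A fixed f.read(4) would cut the five-digit ID in half, while split() handles all three lines the same way.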

Now that you have the raw data, you need to aggregate it:

total = 0  # not named "sum", so the built-in sum() stays usable
count = 0
for id, no in raw_data:
    total += no
    count += 1
avg = total / count

Or as a one-liner:

avg = sum(no for id, no in raw_data) / len(raw_data)
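
As a quick check of the arithmetic, the same average can be computed with the standard library's statistics.fmean (available from Python 3.8). This is only a sketch: the weights are typed in by hand from the question rather than read from the file.

```python
from statistics import fmean

# The seven weights listed in the question's FishWeights.txt.
weights = [4.0, 2.0, 1.5, 2.0, 2.2, 1.9, 2.8]

avg = fmean(weights)  # floating-point mean of the list
print("The average fish weight is: {:.2f}".format(avg))
```

The weights add up to 16.4, and 16.4 / 7 rounds to the 2.34 shown in the expected output.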

Finally, you need to map the IDs to names for the final printout:

id_to_name = {
    1000: 'Chemo',
    1100: 'Greene',
    1200: 'Toddy',
}
for id, no in raw_data:
    print(id, id_to_name[id], no)
print('Average: ', avg)

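Of course, the three loops can be merged into one. I split them up so you can see each stage of the code clearly. The final result (after some optimization) might look like this: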

id_to_name = {
    1000: 'Chemo',
    1100: 'Greene',
    1200: 'Toddy',
}
total = 0  # avoid shadowing the built-in sum()
count = 0
with open("fishweights.txt") as fo:
    for line in fo:
        row = line.strip().split()
        if not row:
            continue  # ignore empty lines
        id = int(row[0])
        no = float(row[1])
        total += no
        count += 1
        print(id, id_to_name[id], no)
print('Average:', total / count)
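
Two edge cases are not covered by the loop above: a lake ID that is missing from the dictionary raises KeyError, and a file with no data rows makes the final division fail. A minimal sketch of one way to guard against both follows. The function name report, the "Unknown" placeholder, and lake 1300 are assumptions made for illustration, and io.StringIO stands in for the real file.

```python
import io

def report(lines, id_to_name):
    """Print one row per data line and return the average weight (None if no rows)."""
    total = 0.0
    count = 0
    for line in lines:
        row = line.split()
        if not row:
            continue  # ignore empty lines
        lake_id, weight = int(row[0]), float(row[1])
        # .get() falls back to a placeholder instead of raising KeyError
        print(lake_id, id_to_name.get(lake_id, "Unknown"), weight)
        total += weight
        count += 1
    return total / count if count else None

# Hypothetical input: a blank line and an ID (1300) that has no name.
sample = io.StringIO("1000 4.0\n\n1300 2.0\n")
avg = report(sample, {1000: "Chemo"})
print("Average:", avg)
```

Because report accepts any iterable of lines, an open file object can be passed in exactly the same way as the StringIO stand-in.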

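You don't need to use offsets to read the lines. Also, you can use with to make sure the file is closed when you are done. For the average, you can put all the numbers in a list and compute the average at the end. Use a dictionary to map lake IDs to names: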

lakes = {
    1000: "Chemo",
    1100: "Greene",
    1200: "Toddy"
}
allWeights = []

with open("test.txt", "r") as f:
    for line in f:
        line = line.strip()  # get rid of any whitespace at the end of the line
        line = line.split()

        lake, weight = line
        lake = int(lake)
        weight = float(weight)
        print(lake, lakes[lake], weight, sep="\t")
        allWeights.append(weight)

avg = sum(allWeights) / len(allWeights)
print("The average fish weight is: {0:.2f}".format(avg)) # format to 2 decimal places

Output:

1000    Chemo   4.0
1100    Greene  2.0
1200    Toddy   1.5
1000    Chemo   2.0
1000    Chemo   2.2
1100    Greene  1.9
1200    Toddy   2.8
The average fish weight is: 2.34

There are more efficient ways of doing this, but this is probably the simplest way to help you understand what is going on.
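
Tab separators line up here only because the lake names are all short. Since the assignment asks for formatted string fields, here is a sketch using fixed-width format specs instead. The widths 9 and 11 are assumptions, picked to reproduce the spacing of the expected output in the question, and format_row is a hypothetical helper name.

```python
lakes = {1000: "Chemo", 1100: "Greene", 1200: "Toddy"}
rows = [(1000, 4.0), (1100, 2.0), (1200, 1.5)]  # the first rows of the question's data

def format_row(lake_id, weight):
    # "<9" and "<11" left-align the ID and the name in 9- and 11-character columns
    return "{:<9}{:<11}{}".format(lake_id, lakes[lake_id], weight)

for lake_id, weight in rows:
    print(format_row(lake_id, weight))
```

Widening a column is then a matter of changing one number in the format string.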

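You can store the lake names in a dictionary and the data in a list. In this example you then just loop over your list fish and look up the lake name that matches each ID. Finally, print your average by summing the weights in the list and dividing by the length of fish.

with open('LakeID.txt','r') as l: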
    lake = l.readlines()
    lake = dict([i.rstrip('\n').split() for i in lake])

with open('FishWeights.txt','r') as f:
    fish = f.readlines()
    fish = [i.rstrip('\n').split() for i in fish]

for i in fish:
    print(i[0],lake[i[0]],i[1])    

print('The total average is {}'.format(sum(float(i[1]) for i in fish)/len(fish))) 

You are also encouraged to use the with open(..) context manager to make sure the file is closed on exit.
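
One detail of building the dictionary from split text is that its keys are strings, so lookups must use "1000" rather than 1000. The sketch below illustrates this with an in-memory list standing in for LakeID.txt; the name lake_by_int is made up for the example.

```python
# Stand-in for the lines of LakeID.txt.
lake_lines = ["1000 Chemo\n", "1100 Greene\n", "1200 Toddy\n"]

# Each split line is a [id, name] pair, which dict() turns into a mapping.
lake = dict(line.split() for line in lake_lines)

print(lake["1000"])   # string key works
print(1000 in lake)   # the integer 1000 is not a key

# Convert the keys if integer IDs are used elsewhere in the program.
lake_by_int = {int(key): name for key, name in lake.items()}
print(lake_by_int[1000])
```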

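So here you could store the fish weights and the lake data in two arrays. See the following, which reads every line and then splits it into a list of fish weights and a list of lake data.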

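with open("FishWeights.txt") as f:
    text = f.readlines()
fishWeights = []
lakeData = []
for item in text:
    parts = item.split()
    if not parts:
        continue  # skip blank lines
    lakeData.append(parts[0])            # lake ID is the first field
    fishWeights.append(float(parts[1]))  # weight is the second field, stored as a number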

From here you can output the information with:

for i in range(len(fishWeights)):
    print(lakeData[i], "Your Text", fishWeights[i])

You can work out your average with:

total = 0
for weight in fishWeights:
    total += float(weight)  # float() also copes with weights stored as strings
average = total / len(fishWeights)

This is easy to do with a dataframe. Please find sample code below.

import pandas as pd

# load lake data into a dataframe
lakeDF = pd.read_csv('Lake.txt', sep=" ", header=None)
lakeDF.columns = ["Lake ID", "Lake Name"]
#load fish data into a dataframe
fishWeightDF = pd.read_csv('FishWeights.txt', sep=" ", header=None)
fishWeightDF.columns = ["Lake ID", "Fish Weight"]
#sort fishweight with 'Lake ID' (common field in both lake and fish)
fishWeightDF = fishWeightDF.sort_values(by= ['Lake ID'],ascending=True)
# join fish with lake on the exact Lake ID (merge_asof would match the nearest ID instead)
mergedFrame = pd.merge(
    fishWeightDF, lakeDF,
    on='Lake ID', how='left'
    )
#print the result
print(mergedFrame)
#find the average
average = mergedFrame['Fish Weight'].mean()
print(average)