Python file reading problem, possible infile loop?
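The problem is as follows: "Write a Python program to read a file containing lake and fish data and produce a report of the lake identification number, the lake name, and the fish weight in a tabular format (using formatted string fields). The program should calculate the average fish weight reported."
Lake identification: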
1000 Chemo
1100 Greene
1200 Toddy
The file I have to read, "FishWeights.txt", contains the following data:
1000 4.0
1100 2.0
1200 1.5
1000 2.0
1000 2.2
1100 1.9
1200 2.8
My code:
f = open("fishweights.txt")
print(f.read(4), "Chemo", f.readline(4))
print(f.read(5), "Greene", f.read(5))
print(f.read(4), "Toddy", f.read(5))
print(f.read(5), "Chemo", f.read(4))
print(f.read(5), "Chemo", f.read(4))
print(f.read(5), "Greene", f.read(4))
print(f.read(5), "Toddy", f.read(4))
The output I get is:
1000 Chemo 4.0
1100 Greene 2.0
1200 Toddy 1.5
1000 Chemo 2.0
1000 Chemo 2.2
1100 Greene 1.9
1200 Toddy 2.8
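This is correct in that I have to display the lake ID number, the lake name, and the fish weight for each record. But I also need to be able to do a calculation that, at the end, averages the weight of all the fish.
The output should be neatly formatted and look like this: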
1000 Chemo 4.0
1100 Greene 2.0
1200 Toddy 1.5
1000 Chemo 2.0
1000 Chemo 2.2
1100 Greene 1.9
1200 Toddy 2.8
The average fish weight is: 2.34
Any help is appreciated. I'm just a beginner looking for help to fully understand the topic. Thanks!
Yes, you need to loop over the lines. This is the structure you are looking for:
with open("fishweights.txt") as fo:
    for line in fo:
        pass
To retrieve each part of a line, you can use line.split(). Reading a fixed number of bytes, as you did, is fine only if the IDs have a fixed length. Are you sure every ID always has exactly 4 digits? Something like this may work better:
raw_data = []
with open("fishweights.txt") as fo:
    for line in fo:
        row = line.strip().split()
        if not row:
            continue  # ignore empty lines
        lake_id = int(row[0])
        weight = float(row[1])
        raw_data.append((lake_id, weight))
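To illustrate why splitting on whitespace is more robust than reading a fixed number of characters, here is a minimal sketch. The helper name parse_line and the sample lines (including a 5-digit ID) are assumptions made up for illustration; they are not part of the original data.

```python
def parse_line(line):
    """Split one 'ID weight' line on whitespace; return None for blank lines."""
    fields = line.split()
    if not fields:
        return None
    return int(fields[0]), float(fields[1])

# Hypothetical sample lines: a longer ID and extra spaces still parse correctly.
samples = ["1000 4.0\n", "10500   2.25\n", "\n"]
parsed = [parse_line(s) for s in samples]
print(parsed)
```

Because split() with no arguments discards any run of whitespace, including the trailing newline, the width of each field no longer matters.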
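Now that you have the raw data, you need to aggregate it. Note that the accumulator is called total rather than sum, so the built-in sum() stays usable:
total = 0
count = 0
for lake_id, weight in raw_data:
    total += weight
    count += 1
avg = total / count
Or as a one-liner:
avg = sum(weight for lake_id, weight in raw_data) / len(raw_data)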
Finally, you need to map the IDs to names for the final printout:
id_to_name = {
    1000: 'Chemo',
    1100: 'Greene',
    1200: 'Toddy',
}
for lake_id, weight in raw_data:
    print(lake_id, id_to_name[lake_id], weight)
print('Average:', avg)
Of course the three loops can be merged into one. I split them so that you can clearly see each stage of the code. The final result (after some optimization) might look like this:
id_to_name = {
    1000: 'Chemo',
    1100: 'Greene',
    1200: 'Toddy',
}
total = 0
count = 0
with open("fishweights.txt") as fo:
    for line in fo:
        row = line.strip().split()
        if not row:
            continue  # ignore empty lines
        lake_id = int(row[0])
        weight = float(row[1])
        total += weight
        count += 1
        print(lake_id, id_to_name[lake_id], weight)
print('Average:', total / count)
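The single-pass version can also be wrapped in a function, which makes it easier to try out without a file on disk. The sketch below is one possible arrangement, not the answer's own code: the function name report, the "Unknown" fallback, and the in-memory sample built with io.StringIO are all assumptions for illustration.

```python
import io

ID_TO_NAME = {1000: "Chemo", 1100: "Greene", 1200: "Toddy"}

def report(lines, names=ID_TO_NAME):
    """Print one row per fish and return the average weight (None if no data)."""
    total = 0.0
    count = 0
    for line in lines:
        row = line.split()
        if not row:
            continue  # ignore empty lines
        lake_id, weight = int(row[0]), float(row[1])
        # .get() avoids a KeyError for IDs missing from the lookup table
        print(lake_id, names.get(lake_id, "Unknown"), weight)
        total += weight
        count += 1
    return total / count if count else None

# Hypothetical in-memory stand-in for the data file, with one blank line.
sample = io.StringIO("1000 4.0\n1100 2.0\n\n1200 1.5\n")
average = report(sample)
print("Average:", average)
```

Since the function only iterates over lines, it accepts an open file object just as well as the StringIO used here.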
You don't need to use offsets to read the lines. Also, you can use with to make sure the file is closed when you are done. For the average, you can put all the numbers in a list and compute the average at the end. Use a dictionary to map lake IDs to names:
lakes = {
    1000: "Chemo",
    1100: "Greene",
    1200: "Toddy"
}
allWeights = []
with open("test.txt", "r") as f:
    for line in f:
        line = line.strip()  # get rid of any whitespace at the end of the line
        line = line.split()
        lake, weight = line
        lake = int(lake)
        weight = float(weight)
        print(lake, lakes[lake], weight, sep="\t")
        allWeights.append(weight)
avg = sum(allWeights) / len(allWeights)
print("The average fish weight is: {0:.2f}".format(avg))  # format to 2 decimal places
Output:
1000 Chemo 4.0
1100 Greene 2.0
1200 Toddy 1.5
1000 Chemo 2.0
1000 Chemo 2.2
1100 Greene 1.9
1200 Toddy 2.8
The average fish weight is: 2.34
There are more efficient ways to do this, but this is probably the simplest way to help you understand what is going on.
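Since the assignment asks for a tabular layout, column widths in a format string are one way to line the fields up instead of relying on tabs. This is only a sketch: the helper name format_row and the widths 6, 8, and 5 are arbitrary choices for illustration, and the rows are the question's data typed in directly rather than read from the file.

```python
lakes = {1000: "Chemo", 1100: "Greene", 1200: "Toddy"}
rows = [(1000, 4.0), (1100, 2.0), (1200, 1.5), (1000, 2.0),
        (1000, 2.2), (1100, 1.9), (1200, 2.8)]

def format_row(lake_id, weight):
    # ID left-aligned in 6 columns, name left-aligned in 8,
    # weight right-aligned in 5 with one decimal place
    return f"{lake_id:<6}{lakes[lake_id]:<8}{weight:>5.1f}"

for lake_id, weight in rows:
    print(format_row(lake_id, weight))

avg = sum(w for _, w in rows) / len(rows)
summary = f"The average fish weight is: {avg:.2f}"
print(summary)
```

Fixed widths keep the columns aligned even when lake names differ in length, which a single space or tab separator does not guarantee.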
You can store the lake names in a dictionary and the data in a list. In this example you then just loop over your list fish and look up the lake name corresponding to each id. Finally, print the average by summing the weight values in the list and dividing by the length of fish.
with open('LakeID.txt', 'r') as l:
    lake = l.readlines()
    lake = dict([i.rstrip('\n').split() for i in lake])
with open('FishWeights.txt', 'r') as f:
    fish = f.readlines()
    fish = [i.rstrip('\n').split() for i in fish]
for i in fish:
    print(i[0], lake[i[0]], i[1])
print('The total average is {}'.format(sum(float(i[1]) for i in fish) / len(fish)))
You are also encouraged to use the with open(..) context manager to ensure the file is closed on exit.
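A quick way to convince yourself that the context manager really closes the file is to check the closed attribute afterwards. The sketch below writes a small throwaway file first so that it is self-contained; the temporary file and its two lines are assumptions for illustration only.

```python
import os
import tempfile

# Write a small throwaway file so the sketch does not depend on FishWeights.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1000 4.0\n1100 2.0\n")
    path = tmp.name

with open(path) as f:
    lines = f.readlines()

was_closed = f.closed  # True: the with block closed the file on exit
print(was_closed, len(lines))

os.remove(path)  # clean up the temporary file
```

With a bare f = open(...) as in the question, the file stays open until f.close() is called or the object is garbage-collected.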
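So here you can store the fish weights and the lake data in two arrays. See the following, which reads each line and splits it into a list of fish weights and a list of lake data.
with open("FishWeights.txt") as f:
    text = f.readlines()
fishWeights = []
lakeData = []
for item in text:
    fields = item.split()
    if not fields:
        continue  # skip blank lines
    lakeData.append(fields[0])            # lake ID (first column)
    fishWeights.append(float(fields[1]))  # weight as a number (second column)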
From here you can output the information with:
for i in range(len(fishWeights)):
    print(lakeData[i], "Your Text", fishWeights[i])
You can work out your average with:
total = 0
for weight in fishWeights:
    total += float(weight)  # float() also covers weights still stored as strings
average = total / len(fishWeights)
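As an alternative to the manual loop, the standard library's statistics module can compute the mean directly. This sketch assumes Python 3.8 or later (where statistics.fmean is available) and uses the question's weights typed in as strings, the way they come out of split().

```python
from statistics import fmean

# Weights as they come out of split(): strings, not numbers.
fish_weights = ["4.0", "2.0", "1.5", "2.0", "2.2", "1.9", "2.8"]

# Convert each string to float on the fly; fmean returns a float average.
average = fmean(float(w) for w in fish_weights)
print(f"The average fish weight is: {average:.2f}")
```

Either approach gives the same result; the explicit loop simply makes the conversion and accumulation steps visible.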
This is easy to do with a DataFrame. Sample code is below.
import pandas as pd

# Load lake data into a DataFrame
lakeDF = pd.read_csv('Lake.txt', sep=" ", header=None)
lakeDF.columns = ["Lake ID", "Lake Name"]

# Load fish data into a DataFrame
fishWeightDF = pd.read_csv('FishWeights.txt', sep=" ", header=None)
fishWeightDF.columns = ["Lake ID", "Fish Weight"]

# Sort fish weights by 'Lake ID' (the field common to both frames);
# merge_asof requires both inputs to be sorted on the key
fishWeightDF = fishWeightDF.sort_values(by=['Lake ID'], ascending=True)

# Join fish with lake; by default merge_asof matches each fish row to the
# nearest lake ID at or below its own, so exact IDs match their own lake
mergedFrame = pd.merge_asof(
    fishWeightDF, lakeDF,
    on='Lake ID'
)

# Print the result
print(mergedFrame)

# Find the average
average = mergedFrame['Fish Weight'].mean()
print(average)