seek() not working properly, although file opened in "r" mode

Question

I have a CSV file containing dates and a float value (day, month, year, float). Here is a sample:

1,1,2000,4076.79
2,1,2000,1216.82
3,1,2000,1299.68
4,1,2000,637.36
5,1,2000,3877.91
6,1,2000,3308.99
7,1,2000,2925.93
8,1,2000,1559.09
9,1,2000,3190.81
10,1,2000,3008.66
11,1,2000,2026.35
12,1,2000,3279.61
13,1,2000,3601.6
14,1,2000,2021.1
15,1,2000,2103.62
16,1,2000,609.64
17,1,2000,633.16
18,1,2000,1195.34

I want to read the first line and then the last line:

handle = open(getInputFileName(), "r")

getInputFileName() is, obviously, a function that returns the file name. Then,

print "numberlines", numberLines        #DEBUG# 
>>> 3660

numberLines is the number of lines in the file. Then,

handle.seek(0)
lineData = handle.readline().split(",")
print lineData      #DEBUG#
>>> ['1','1','2000','4076.79\n']

Up to here everything works. But,

handle.seek(numberLines-1)
lineData = handle.readline().split(",")
print lineData      #DEBUG#
>>>['7', '7', '2000', '2347.51\n']

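But the last line of the file is actually 31,12,2009,3823.02. Why doesn't seek go all the way down? I tried deleting the line it got stuck on, but then the program crashed with ValueError: could not convert string to float: (later on I use lineData as floats):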

newestDate.insert(1,float(lineData[1]))

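I checked the file in case something was wrong with that line, but the format never changes. Why does my code work for the first line but not for the last?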

Answer 1

file.seek(offset[, whence]) operates on byte positions within the file, not on line numbers. If you want to work with lines, use readline() or iterate over the file:

with open("file.txt", "r") as f:
    first = next(f)  # first line; see Jean-François Fabre's comment
    last = first     # fallback in case the file has only one line
    for last in f:   # see tdelaney's comment
        pass  # ignore all other lines; last ends up holding the final one

Now first and last hold the first and the last line, respectively.
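To see why seek(numberLines-1) landed somewhere in the middle of the file, here is a minimal sketch of the byte-offset behaviour. It is written for Python 3 and uses an in-memory file holding the first three sample rows from the question; the variable names are only illustrative.

```python
import io

# First three sample rows from the question, held in an in-memory binary file.
data = b"1,1,2000,4076.79\n2,1,2000,1216.82\n3,1,2000,1299.68\n"
handle = io.BytesIO(data)

handle.seek(0)                     # byte offset 0: start of the first row
first_line = handle.readline()

handle.seek(2)                     # byte offset 2, NOT line number 2
partial_line = handle.readline()   # remainder of the first row from byte 2 on

print(first_line)
print(partial_line)
```

The second readline() returns only the tail of the first row, because the position moved two bytes forward rather than two lines. In the same way, seeking to a byte offset a little below 3660 in a file that has 3660 lines ends up a few hundred lines in, not at the last line.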

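The advantage here is that at most one line of text is held in memory at a time and the rest are discarded. As far as I know, there is no simple way to get the first and last lines of a file without stepping through it.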
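For very large files, one possible alternative is to use seek() the way it is meant to be used: open the file in binary mode, jump to the end, and read backwards in blocks until a full final line is in the buffer. This is only a sketch for Python 3, not something from the original answer; read_last_line and chunk_size are hypothetical names, and it assumes lines end in "\n" or "\r\n".

```python
import os

def read_last_line(path, chunk_size=1024):
    """Return the last non-empty line of a file as bytes, without its newline."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)      # jump to the end of the file
        pos = f.tell()              # file size in bytes
        buf = b""
        while pos > 0:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)             # absolute byte offset from the start
            buf = f.read(step) + buf
            # Stop once a newline precedes the final line in the buffer.
            if b"\n" in buf.rstrip(b"\r\n"):
                break
        lines = buf.rstrip(b"\r\n").split(b"\n")
        return lines[-1].rstrip(b"\r")
```

Calling read_last_line("file.txt") returns the raw bytes of the last line, so they still need to be decoded and split (or handed to a CSV parser) before use. The first line is available from a plain readline() right after opening the file.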

If you want to parse the data, follow the advice of using the csv module and a reader; it's safer. If you feel adventurous, go for pandas: it has plenty of built-in CSV capability and can read big CSV files chunk-wise to be more memory friendly (see, e.g., How to read a 6 GB csv file with pandas).

Answer 2

Do not read CSV files by hand (if any line contains a quoted item with a comma in it, such as ...,"1,2000",..., your code will fail). There is a CSV reader:

import csv
with open("foo.csv") as infile:
    reader = csv.reader(infile)
    data = list(reader)

data[0] # First
# ['1', '1', '2000', '4076.79']
data[-1] # Last
#['18', '1', '2000', '1195.34']

If memory is a concern, read the first line, skip the rest of the file, and keep only the last line, as described in the other answer.
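The two suggestions can be combined: let csv.reader do the parsing, but keep only the first and last rows instead of building the whole list. The following is a minimal sketch for Python 3; first_and_last_rows is a hypothetical helper name, and the in-memory sample stands in for the real file.

```python
import csv
import io
from collections import deque

def first_and_last_rows(infile):
    """Return (first_row, last_row) from an open CSV file object."""
    reader = csv.reader(infile)
    first = next(reader, None)        # None if the file is empty
    tail = deque(reader, maxlen=1)    # consumes the rest, keeps only the final row
    last = tail[0] if tail else first
    return first, last

# Sample rows from the question, held in an in-memory text file.
sample = io.StringIO("1,1,2000,4076.79\n2,1,2000,1216.82\n18,1,2000,1195.34\n")
first, last = first_and_last_rows(sample)
print(first)
print(last)
```

With a real file, pass the object returned by open("foo.csv", newline="") instead of the StringIO. The rows come back as lists of strings, so a value such as last[3] still has to be converted with float() before it is used as a number.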

python

csv

file-io

seek

python-2.7