Function to divide a text file into two files
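I wrote a function that takes a text file and a ratio (for example 80%) and writes the first 80% of the file to one file and the remaining 20% to another. The first part comes out correct, but the second part is empty. Could someone take a look and tell me what I did wrong?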
def splitFile(inputFilePatheName, outputFilePathNameFirst, outputFilePathNameRest, splitRatio):
    lines = 0
    buffer = bytearray(2048)
    with open(inputFilePatheName) as f:
        while f.readinto(buffer) > 0:
            lines += buffer.count('\n')
    print lines
    line80 = int(splitRatio * lines)
    print line80
    with open(inputFilePatheName) as originalFile:
        firstNlines = originalFile.readlines()[0:line80]
        restOfTheLines=originalFile.readlines()[(line80+1):lines]
    print len(firstNlines)
    print len(restOfTheLines)
    with open(outputFilePathNameFirst, 'w') as outputFileNLines:
        for item in firstNlines:
            outputFileNLines.write("{}".format(item))
    with open(outputFilePathNameRest,'w') as outputFileRest:
        for word in restOfTheLines:
            outputFileRest.write("{}".format(word))
I believe this is your problem:
firstNlines = originalFile.readlines()[0:line80]
restOfTheLines=originalFile.readlines()[(line80+1):lines]
The second call to readlines() returns nothing, because the first call has already read every line in the file. Try:
allLines = originalFile.readlines()
# Start the second slice at line80 (not line80+1) so the line at index line80 is not dropped
firstNlines, restOfTheLines = allLines[:line80], allLines[line80:]
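The exhaustion behaviour described above can be reproduced in isolation. This is a minimal sketch, assuming Python 3; the sample file contents and the variable names (`sample_path`, `first_call`, `second_call`, `after_seek`) are made up for illustration and are not part of the original question.

```python
import os
import tempfile

# Write a small three-line sample file (hypothetical data, for illustration only).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a\nb\nc\n")
    sample_path = tmp.name

with open(sample_path) as f:
    first_call = f.readlines()   # consumes the whole file
    second_call = f.readlines()  # position is already at end of file, so nothing is left
    f.seek(0)                    # rewind to the start of the file
    after_seek = f.readlines()   # the lines are available again

os.remove(sample_path)

print(len(first_call), len(second_call), len(after_seek))  # prints: 3 0 3
```

Calling seek(0) between the two readlines() calls would also have worked in the original function, but reading once and slicing the list avoids reading the file a second time.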
Of course, for very large files, reading the whole file into memory can be a problem.
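If memory is a concern, one possible approach is to make two passes over the file and never hold more than one line at a time. The following is only a sketch, assuming Python 3; the function name `split_file_streaming` and its parameter names are hypothetical and do not come from the original post.

```python
def split_file_streaming(input_path, first_path, rest_path, split_ratio):
    """Split input_path by line count without loading it all into memory.

    Returns a tuple (lines written to first_path, lines written to rest_path).
    """
    # Pass 1: count lines by iterating over the file object,
    # which yields one line at a time.
    with open(input_path) as src:
        total_lines = sum(1 for _ in src)

    # Index of the first line that belongs to the second output file.
    cutoff = int(split_ratio * total_lines)

    # Pass 2: route each line to one of the two output files.
    with open(input_path) as src, \
         open(first_path, "w") as first_out, \
         open(rest_path, "w") as rest_out:
        for index, line in enumerate(src):
            target = first_out if index < cutoff else rest_out
            target.write(line)

    return cutoff, total_lines - cutoff
```

For example, `split_file_streaming("data.txt", "first.txt", "rest.txt", 0.8)` would write the first 80% of the lines of a hypothetical `data.txt` to `first.txt` and the remainder to `rest.txt`. The cost of this design is that the input is read twice, in exchange for constant memory use.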