What determines how "at most size bytes are read and returned" with Python read()?

The Python input/output documentation says the following under reading and writing files:

https://docs.python.org/3.5/tutorial/inputoutput.html#methods-of-file-objects

"When size is omitted or negative, the entire contents of the file will be read and returned; it’s your problem if the file is twice as large as your machine’s memory. Otherwise, at most size bytes are read and returned."

Consider the following code:

size = 1000
with open('file.txt', 'r') as f:
    while True:
        read_data = f.read(size)
        if not read_data:
            break 
        print(read_data)   # outputs data in sizes equal to at most 1000 bytes

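Here, size is at most 1000 bytes. What determines the "at most"?

Suppose we are parsing lines of structured data, and each line is 750 bytes. Will the read "cut off" the next line, or stop at the \n?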

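read is not readline or readlines. It just reads bytes regardless of the file contents (apart from end-of-line translation, since your file is opened as text).

  • If there are 1000 bytes to read in the buffer, it returns 1000 bytes (or fewer if the file uses \r\n (Windows CR+LF) line endings and is read as text, because the \r characters are stripped)
  • If there are 700 bytes left, it returns 700 bytes (give or take the \r issue)
  • If there is nothing left to read, it returns an empty buffer (len(read_data) == 0).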
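
The behaviour described above can be checked with a small experiment. This is only a sketch under stated assumptions: the throwaway file, the 750-byte record layout, and the helper names read_chunks and read_lines are made up for illustration, and the file is opened in binary mode to keep newline translation out of the picture.

```python
import os
import tempfile

RECORD_LEN = 750  # hypothetical record size, including the trailing b"\n"
SIZE = 1000

def read_chunks(path, size):
    """Collect every chunk returned by f.read(size) until end of file."""
    chunks = []
    with open(path, "rb") as f:  # binary mode: no newline translation
        while True:
            chunk = f.read(size)
            if not chunk:  # b"" means there is nothing left to read
                break
            chunks.append(chunk)
    return chunks

def read_lines(path):
    """Collect whole lines by iterating over the file object instead."""
    with open(path, "rb") as f:
        return [line for line in f]

# Build a throwaway file holding three 750-byte records.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    for _ in range(3):
        tmp.write(b"x" * (RECORD_LEN - 1) + b"\n")
    path = tmp.name

try:
    chunks = read_chunks(path, SIZE)
    lines = read_lines(path)
finally:
    os.remove(path)

chunk_lengths = [len(c) for c in chunks]
line_lengths = [len(l) for l in lines]
print(chunk_lengths)           # [1000, 1000, 250]
print(chunks[0].index(b"\n"))  # 749: the first newline falls inside the first chunk
print(line_lengths)            # [750, 750, 750]
```

So read(1000) does cut into the next record: the first chunk holds one full 750-byte line plus the first 250 bytes of the second. If the goal is to parse one record per line, iterating over the file object (or calling readline) is probably the better fit, since that is what stops at \n.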