What determines how "at most size bytes are read and returned" with Python read()?

Question

The Python input/output documentation, in the section on reading and writing files, states:

https://docs.python.org/3.5/tutorial/inputoutput.html#methods-of-file-objects

"When size is omitted or negative, the entire contents of the file will be read and returned; it’s your problem if the file is twice as large as your machine’s memory. Otherwise, at most size bytes are read and returned."

Consider the following code:

size = 1000
with open('file.txt', 'r') as f:
    while True:
        read_data = f.read(size)
        if not read_data:
            break 
        print(read_data)   # outputs data in sizes equal to at most 1000 bytes

Here, each read returns at most 1000 bytes. What determines the "at most"?

Suppose we are parsing lines of structured data, and each line is 750 bytes long. Will the read "cut off" the next line, or will it stop at the \n?

Answer 1

read is not readline or readlines. It just reads bytes, regardless of the file's content (apart from end-of-line translation, since your file is opened as text).
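To make the contrast concrete, here is a minimal sketch comparing read(size) with readline() on the same data. It uses an in-memory io.StringIO instead of a real file, and the two 750-character records are made up for illustration.

```python
import io

LINE_LEN = 750  # each record is 750 characters, including the trailing "\n"

# Two hypothetical records: 749 payload characters plus a newline each.
data = "a" * (LINE_LEN - 1) + "\n" + "b" * (LINE_LEN - 1) + "\n"

# read(size) ignores line boundaries: it returns up to `size` characters.
chunk = io.StringIO(data).read(1000)

# readline() stops right after the first "\n".
line = io.StringIO(data).readline()

print(len(chunk))         # 1000
print(chunk.index("\n"))  # 749: the newline sits in the middle of the chunk
print(chunk[750:755])     # bbbbb: the start of the second record
print(len(line))          # 750
```

So with 750-byte lines and size = 1000, read does cut into the next line. If whole lines are needed, iterating over the file object or calling readline() is the usual approach.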

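If there are 1000 bytes available to read, it returns 1000 bytes (or fewer if the file uses \r\n line endings (Windows CR+LF) and is read as text, because the \r characters are removed).
If there are only 700 bytes left, it returns 700 bytes (give or take the \r issue).
If there is nothing left to read, it returns an empty buffer (len(read_data)==0).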
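The three cases above can be checked with a short sketch. The helper chunk_lengths and the temporary test file are hypothetical, introduced only for this illustration. The file is opened in binary mode so that lengths are counted in bytes and no newline translation is involved; in text mode, Python 3 counts size in characters rather than bytes.

```python
import os
import tempfile

def chunk_lengths(path, size):
    """Return the length of every chunk that f.read(size) yields for the file."""
    lengths = []
    with open(path, "rb") as f:  # binary mode, so lengths are counted in bytes
        while True:
            chunk = f.read(size)
            if not chunk:        # b"" signals end of file
                break
            lengths.append(len(chunk))
    return lengths

# Hypothetical test file: three 750-byte lines, 2250 bytes in total.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write((b"x" * 749 + b"\n") * 3)
    path = tmp.name

try:
    lengths = chunk_lengths(path, 1000)
finally:
    os.remove(path)  # clean up the temporary file

print(lengths)  # [1000, 1000, 250]
```

The full-size chunks appear while enough data remains, the last chunk holds only what is left, and the empty result that ends the loop is the end-of-file signal.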
