Is it bad to call write() on a file object frequently with small additions?

I'm refactoring a horrible Python script that generates Lua bindings as part of the Polycode project.

I'm considering writing out the generated Lua lines in chunks.

But my question is: what are the detriments/caveats of writing to a file this frequently?

For example:

persistent_file = open('/tmp/demo.txt', 'w')

for i in range(1000000):
    persistent_file.write(str(i)*80 + '\n')


for i in range(2000):
    persistent_file.write(str(i)*20 + '\n')


for i in range(1000000):
    persistent_file.write(str(i)*100 + '\n')


persistent_file.close()

This is just a straightforward way of writing a lot of data to a file as fast as possible. I don't really expect any practical problems from doing this, but I do want to know: is it beneficial to buffer everything up for one big write?

From the documentation for the open function:

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None) -> file object

...

buffering is an optional integer used to set the buffering policy. Pass 0 to switch buffering off (only allowed in binary mode), 1 to select line buffering (only usable in text mode), and an integer > 1 to indicate the size of a fixed-size chunk buffer. When no buffering argument is given, the default buffering policy works as follows:

  • Binary files are buffered in fixed-size chunks; the size of the buffer is chosen using a heuristic trying to determine the underlying device's "block size" and falling back on io.DEFAULT_BUFFER_SIZE. On many systems, the buffer will typically be 4096 or 8192 bytes long.

  • "Interactive" text files (files for which isatty() returns True) use line buffering. Other text files use the policy described above for binary files.

In other words, in most cases the only overhead of calling write() frequently is the overhead of the function call itself.
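As a rough illustration of this, here is a small micro-benchmark sketch (file paths, line counts, and repetition counts are arbitrary choices) that times many small write() calls against a single write() of the pre-joined string. Since the default file object is block-buffered, the difference you observe is mostly Python-level function-call overhead, not extra disk I/O:

```python
import io
import os
import tempfile
import timeit

# Arbitrary test payload: 20,000 lines of ~80 characters each.
lines = [str(i) * 80 + '\n' for i in range(20000)]

def many_small_writes(path):
    # One write() call per line; the buffered writer coalesces them
    # into block-sized writes to the OS.
    with open(path, 'w') as f:
        for line in lines:
            f.write(line)

def one_big_write(path):
    # Join everything in memory first, then issue a single write() call.
    with open(path, 'w') as f:
        f.write(''.join(lines))

tmp = tempfile.mkdtemp()
small_path = os.path.join(tmp, 'small.txt')
big_path = os.path.join(tmp, 'big.txt')

t_small = timeit.timeit(lambda: many_small_writes(small_path), number=3)
t_big = timeit.timeit(lambda: one_big_write(big_path), number=3)

print(f'default buffer size: {io.DEFAULT_BUFFER_SIZE} bytes')
print(f'many small writes: {t_small:.3f}s, one big write: {t_big:.3f}s')
```

Both approaches produce an identical file; the buffered case simply pays the per-call cost of crossing into write() once per line instead of once per file.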