Why are read operations much faster after writing files using OS and disk buffers?

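I'm writing about 100 files of 50 MB each, one after another, to a directory on disk using CreateFile()/WriteFile(). In a second step, I read the contents of these files back using CreateFile()/ReadFile().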

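I noticed something odd:
If I pass FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH when writing the files, reading them takes a long time (typically hundreds of milliseconds). However, when I don't pass these flags (and call FlushFileBuffers() instead), the writes seem to happen at roughly the same speed, but reading the files afterwards is extremely fast (less than 20 ms per file!).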

How is this possible? How can the flags passed while writing 5000 MB of data affect later reads? Is the disk holding the entire 5 GB in its cache?

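When you pass FILE_FLAG_NO_BUFFERING, you are telling the system not to put the data into its disk cache (the operating system's file cache in RAM, not a cache on the drive itself). So when you later read the data, the system has to fetch it from the disk.

When you omit FILE_FLAG_NO_BUFFERING, the system is allowed to keep the data in its disk cache. Subsequent reads can then be served straight from memory, which is much faster than going to the disk.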
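To see the effect yourself, here is a minimal sketch (not the asker's actual code) that writes one file with the two flags and one without, then times a plain buffered read of each. The helper names write_flags, write_file and read_file_ms, the file names, and the 1 MiB chunk size are all made up for illustration. The non-Windows fallback definitions are there only so the flag helper compiles elsewhere. Their values are meant to mirror the Windows SDK headers.

```c
#include <stdint.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Fallback so the flag-selection helper compiles off Windows;
   these values are assumed to match the Windows SDK definitions. */
typedef uint32_t DWORD;
#define FILE_ATTRIBUTE_NORMAL   0x00000080u
#define FILE_FLAG_NO_BUFFERING  0x20000000u
#define FILE_FLAG_WRITE_THROUGH 0x80000000u
#endif

/* Hypothetical helper: CreateFile flags for the two cases in the question. */
static DWORD write_flags(int bypass_cache)
{
    DWORD flags = FILE_ATTRIBUTE_NORMAL;
    if (bypass_cache)
        flags |= FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH;
    return flags;
}

#ifdef _WIN32
enum { CHUNK = 1 << 20 };  /* 1 MiB: a multiple of common sector sizes */

/* Write `mib` MiB of zeros to `path`; returns 0 on success, -1 on error. */
static int write_file(const wchar_t *path, int mib, int bypass_cache)
{
    /* VirtualAlloc returns page-aligned memory, which satisfies the
       buffer alignment that FILE_FLAG_NO_BUFFERING requires. */
    void *buf = VirtualAlloc(NULL, CHUNK, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (!buf)
        return -1;

    HANDLE h = CreateFileW(path, GENERIC_WRITE, 0, NULL, CREATE_ALWAYS,
                           write_flags(bypass_cache), NULL);
    int rc = (h == INVALID_HANDLE_VALUE) ? -1 : 0;

    for (int i = 0; rc == 0 && i < mib; i++) {
        DWORD written = 0;
        if (!WriteFile(h, buf, CHUNK, &written, NULL) || written != (DWORD)CHUNK)
            rc = -1;
    }
    /* Buffered case: flush cached data to disk, as the question does. */
    if (rc == 0 && !bypass_cache && !FlushFileBuffers(h))
        rc = -1;

    if (h != INVALID_HANDLE_VALUE)
        CloseHandle(h);
    VirtualFree(buf, 0, MEM_RELEASE);
    return rc;
}

/* Read the whole file with default (buffered) flags.
   Returns elapsed milliseconds, or -1.0 on error. */
static double read_file_ms(const wchar_t *path)
{
    static char buf[CHUNK];
    LARGE_INTEGER freq, t0, t1;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&t0);

    HANDLE h = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return -1.0;

    DWORD got = 0;
    BOOL ok;
    /* ReadFile reports end of file as success with zero bytes read. */
    while ((ok = ReadFile(h, buf, CHUNK, &got, NULL)) && got > 0) {
    }
    CloseHandle(h);

    QueryPerformanceCounter(&t1);
    return ok ? (double)(t1.QuadPart - t0.QuadPart) * 1000.0 / (double)freq.QuadPart
              : -1.0;
}

int main(void)
{
    /* Hypothetical file names; 50 MiB each, as in the question. */
    if (write_file(L"buffered.bin", 50, 0) || write_file(L"unbuffered.bin", 50, 1))
        return 1;
    printf("buffered write, then read:   %.1f ms\n", read_file_ms(L"buffered.bin"));
    printf("unbuffered write, then read: %.1f ms\n", read_file_ms(L"unbuffered.bin"));
    return 0;
}
#endif
```

Run it in an empty scratch directory. The absolute numbers will depend on your disk and on how much free RAM the system can spend on caching, so treat them as a rough comparison only.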

From https://support.microsoft.com/en-us/kb/99794:

The FILE_FLAG_WRITE_THROUGH flag for CreateFile() causes any writes made to that handle to be written directly to the file without being buffered. The data is cached (stored in the disk cache); however, it is still written directly to the file. This method allows a read operation on that data to satisfy the read request from cached data (if it's still there), rather than having to do a file read to get the data. The write call doesn't return until the data is written to the file. This applies to remote writes as well--the network redirector passes the FILE_FLAG_WRITE_THROUGH flag to the server so that the server knows not to satisfy the write request until the data is written to the file.

The FILE_FLAG_NO_BUFFERING takes this concept one step further and eliminates all read-ahead file buffering and disk caching as well, so that all reads are guaranteed to come from the file and not from any system buffer or disk cache.

You may be interested in this article by Raymond Chen: We’re currently using FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH, but we would like our WriteFile to go even faster. An excerpt:

A customer said that their program’s I/O pattern is to open a file and then every so often write about 100KB of data into the file. They are currently using the FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH flags to open a file, and they wanted to know what else they could do to make their writes go even faster.

Um, for one thing, you stop passing those two flags!

Those two flags in combination basically mean “Give me the slowest possible I/O performance!” because they force all I/O to go through to the physical media right away.