Zero-length read from file
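I have two processes: one appends to a file while the other reads from it. Both processes run concurrently but do not communicate. Another reader process may also start before the writer process has finished.
This approach works, but read() often returns zero bytes with no error. The ratio of zero-length reads to non-zero-length reads is high, which is inefficient.
Is there any way around this? This is on a POSIX filesystem.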
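Without a communication channel, there is no guaranteed way to prevent zero-byte reads, or even long stalls with no data read at all, when reading a file that is still being written. The Linux implementation of tail uses inotify to effectively create a communication channel and obtain information about write activity on the file.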
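As a rough illustration of the inotify idea, here is a minimal Linux-only sketch in Python. It is not how tail itself is written. The helper names (`watch_for_writes`, `wait_for_write`, `follow_inotify`) are made up for this example, and it assumes the C library's `inotify_init` and `inotify_add_watch` can be reached through `ctypes`. Instead of spinning on zero-byte reads, the reader blocks on the inotify descriptor until the kernel reports a modification.

```python
import ctypes
import os
import struct

IN_MODIFY = 0x00000002                 # value from <sys/inotify.h>
EVENT_HEADER = struct.Struct("iIII")   # wd, mask, cookie, length of the name field

# Linux assumption: CDLL(None) exposes the symbols of the already-loaded libc.
_libc = ctypes.CDLL(None, use_errno=True)


def watch_for_writes(path):
    """Return an inotify descriptor that becomes readable when `path` is modified."""
    ifd = _libc.inotify_init()
    if ifd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init failed")
    if _libc.inotify_add_watch(ifd, os.fsencode(path), IN_MODIFY) < 0:
        err = ctypes.get_errno()
        os.close(ifd)
        raise OSError(err, "inotify_add_watch failed")
    return ifd


def wait_for_write(ifd):
    """Block until at least one event is queued; return the event masks."""
    buf = os.read(ifd, 4096)
    masks, offset = [], 0
    while offset < len(buf):
        _wd, mask, _cookie, name_len = EVENT_HEADER.unpack_from(buf, offset)
        masks.append(mask)
        offset += EVENT_HEADER.size + name_len
    return masks


def follow_inotify(path, chunk_size=65536):
    """Yield data appended to `path`, sleeping in the kernel between writes."""
    ifd = watch_for_writes(path)       # set up the watch before the first read
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            data = os.read(fd, chunk_size)
            if data:
                yield data
            else:
                wait_for_write(ifd)    # zero-byte read: wait for the next write
    finally:
        os.close(fd)
        os.close(ifd)
```

This generator never ends on its own. inotify reports that the file changed, not that the writer is finished, so the caller still needs some other rule for deciding when to stop.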
It's an interesting enough problem that IBM even published a Redbook describing a "read-behind-write" implementation that was able to do this at about 15 GB/sec:
Read-behind-write is a technique used by some high-end customers to
lower latency and improve performance. The read-behind-write technique
means that once the writer starts to write, the reader will
immediately trail behind to read; the idea is to overlap the write
time with read time. This concept is beneficial on machines with slow
I/O performance. For a high I/O throughput machine such as pSeries
690, it may be worth considering first writing the entire file out in
parallel and then reading the data back in parallel.
There are many ways that read-behind-write can be implemented. In the
scheme implemented by Xdd, after the writer writes one record, it will
wait for the reader to read that record before the writer can proceed.
Although this scheme keeps the writer and reader in sync just one
record apart, it takes system time to do the locking and
synchronization between writer and reader.
If one does not care about how many records that a reader lags behind
the writer, then one can implement a scheme for the writer to stream
down the writes as fast as possible. The writer can update a global
variable after a certain number of records are written. The reader can
then pull the global variable to find out how many records it has to
read.
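The last scheme in the quote, where the writer publishes a record count and the reader checks it, could look roughly like the sketch below. It is only an illustration under several assumptions: writer and reader are threads in one process so they can share a variable, records have a fixed size, and each `os.write` / `os.pread` on a regular file transfers a whole record. The names `Progress`, `writer`, `reader`, and `RECORD_SIZE` are hypothetical and are not taken from Xdd or the Redbook.

```python
import os
import threading

RECORD_SIZE = 4096  # hypothetical fixed record size


class Progress:
    """Shared count of fully written records (the "global variable")."""

    def __init__(self):
        self._cond = threading.Condition()
        self._written = 0
        self._done = False

    def publish(self, written, done=False):
        with self._cond:
            self._written = written
            self._done = done
            self._cond.notify_all()

    def wait_beyond(self, consumed):
        """Block until more than `consumed` records exist or the writer is done."""
        with self._cond:
            self._cond.wait_for(lambda: self._written > consumed or self._done)
            return self._written, self._done


def writer(path, records, progress, batch=4):
    """Append records as fast as possible, publishing the count every `batch` records."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    count = 0
    try:
        for count, record in enumerate(records, start=1):
            os.write(fd, record)
            if count % batch == 0:
                progress.publish(count)
    finally:
        progress.publish(count, done=True)  # always release a waiting reader
        os.close(fd)


def reader(path, progress):
    """Yield records, never reading past what the writer has published."""
    fd = os.open(path, os.O_RDONLY)
    try:
        consumed = 0
        while True:
            available, done = progress.wait_beyond(consumed)
            while consumed < available:
                yield os.pread(fd, RECORD_SIZE, consumed * RECORD_SIZE)
                consumed += 1
            if done:
                return
    finally:
        os.close(fd)
```

Because the reader only reads records the writer has already announced, it never issues a read that returns zero bytes. For two separate processes, the counter would have to live in shared memory or something similar, which is itself a communication channel.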
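Without a communication channel, you are pretty much limited to retrying, perhaps calling sleep() or something similar after several consecutive zero-byte read() results.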
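A minimal sketch of that retry-and-sleep approach, in Python for brevity. The function name `follow` and all of its parameters and defaults are assumptions chosen for illustration. Each zero-byte read doubles the sleep up to a cap, and any successful read resets it, so an idle file costs few system calls while an active one is read promptly.

```python
import os
import time


def follow(path, chunk_size=65536, min_delay=0.01, max_delay=1.0, idle_timeout=5.0):
    """Yield chunks appended to `path`, backing off after zero-byte reads.

    Gives up after roughly `idle_timeout` seconds of accumulated sleep with no
    new data, because without a communication channel the reader cannot tell
    "writer finished" apart from "writer is slow".
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        delay = min_delay
        idle = 0.0
        while idle < idle_timeout:
            data = os.read(fd, chunk_size)
            if data:
                delay = min_delay      # data arrived: poll eagerly again
                idle = 0.0
                yield data
            else:
                time.sleep(delay)      # zero-byte read: back off
                idle += delay
                delay = min(delay * 2, max_delay)
    finally:
        os.close(fd)
```

The idle timeout is a guess at when the writer has finished. A writer that pauses for longer than that will make the reader stop early.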