Linux synchronization without polling

Question

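In principle, what I want is very simple.

Two executables, ./read and ./write, respectively read from and write to a resource (let's say a file). Using flock(2), it is easy to prevent race conditions between arbitrary invocations of ./read and ./write at arbitrary times.

One requirement is that each invocation of ./read carries a snapshot of the resource from a previous invocation. If the current resource matches the snapshot, ./read should wait (sleep) until an invocation of ./write changes the resource.

From what I gather, the program flow of each program should be something like this: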

//read.c
obtain mutex0
  read resource
  is resource same as our snapshot?
    release mutex0 [1]
    sleep until ./write says to wake up [2]
    obtain mutex0
    read resource
  do something with resource
release mutex0

//write.c
obtain mutex0
  change resource in some way
  tell any sleeping ./read's to wake up
release mutex0

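The main problem with this approach is that there is a noticeable delay between the lines marked [1] and [2]. This means ./read can release mutex0 at [1], an entire invocation of ./write can run to completion, and then [2] executes but stalls indefinitely, because ./write has already tried to wake up all sleeping ./read's.

Is there no simple way to do what I want, short of using an entire separate, full-fledged server process? Also, for those who are curious, I want to do this for an application in CGI.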

Answer 1

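No, the reader's program flow is incorrect. You need some kind of locking mechanism to prevent writes while one or more reads are in progress, and some kind of wake-up mechanism to notify the readers whenever a write has completed.

The writer's program flow is fine: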

    # Initial read of file contents
    Obtain lock
        Read file
    Release lock

    # Whenever wishes to modify file:
    Obtain lock
        Modify file
        Signal readers
    Release lock

The program flow for the reader(s) should be:

    # Initial read of file contents
    Obtain lock
        Read file
    Release lock

    # Wait and respond to changes in file
    On signal:
        Obtain lock
            Read file
        Release lock    
        Do something with modified file contents

If there is only one reader, a mutex (pthread_mutex_t) in shared memory (accessible to all writers and the reader) suffices; otherwise, I recommend using an rwlock (pthread_rwlock_t) instead. For waking up any waiting readers, broadcast on a condition variable (pthread_cond_t). The difficulty, of course, lies in setting up that shared memory.
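A minimal sketch of the shared-memory approach, under several assumptions that are not part of the answer: the region is a file-backed MAP_SHARED mapping at a caller-chosen path, it is created once by a setup step before any reader or writer starts, and the names `shared_state`, `generation`, `shared_create`, `shared_attach`, `shared_publish`, and `shared_wait` are hypothetical. Because pthread_cond_wait releases the mutex and goes to sleep atomically, and the reader re-checks a generation counter under the lock, a write that completes between "release" and "sleep" cannot be missed.

```c
#include <fcntl.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical layout of the shared region (an assumption of this sketch). */
struct shared_state {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    unsigned long   generation;   /* incremented by every writer */
};

/* Map the region behind fd and close fd; returns NULL on failure. */
static struct shared_state *map_fd(int fd)
{
    void *p = mmap(NULL, sizeof(struct shared_state),
                   PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    return p == MAP_FAILED ? NULL : p;
}

/* One-time setup: create the backing file and initialise the
   process-shared mutex and condition variable. Fails if path exists. */
struct shared_state *shared_create(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd == -1)
        return NULL;
    if (ftruncate(fd, sizeof(struct shared_state)) == -1) {
        close(fd);
        return NULL;
    }
    struct shared_state *s = map_fd(fd);
    if (s == NULL)
        return NULL;

    pthread_mutexattr_t ma;
    pthread_mutexattr_init(&ma);
    pthread_mutexattr_setpshared(&ma, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&s->lock, &ma);
    pthread_mutexattr_destroy(&ma);

    pthread_condattr_t ca;
    pthread_condattr_init(&ca);
    pthread_condattr_setpshared(&ca, PTHREAD_PROCESS_SHARED);
    pthread_cond_init(&s->changed, &ca);
    pthread_condattr_destroy(&ca);

    s->generation = 0;
    return s;
}

/* Readers and writers attach to an already initialised region. */
struct shared_state *shared_attach(const char *path)
{
    int fd = open(path, O_RDWR);
    return fd == -1 ? NULL : map_fd(fd);
}

/* Writer: call after modifying the resource. */
void shared_publish(struct shared_state *s)
{
    pthread_mutex_lock(&s->lock);
    s->generation++;
    pthread_cond_broadcast(&s->changed);
    pthread_mutex_unlock(&s->lock);
}

/* Reader: sleep until the generation differs from `seen`,
   then return the current generation. */
unsigned long shared_wait(struct shared_state *s, unsigned long seen)
{
    pthread_mutex_lock(&s->lock);
    while (s->generation == seen)
        pthread_cond_wait(&s->changed, &s->lock);
    unsigned long now = s->generation;
    pthread_mutex_unlock(&s->lock);
    return now;
}
```

In a CGI setting, each ./read invocation would carry the last generation it saw as its snapshot and pass it to shared_wait. The sketch does not handle a process dying while it holds the mutex.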

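Advisory file locking and the fanotify interface would also suffice. The readers install a fanotify FAN_MODIFY mark and simply wait for the corresponding event. The writers do not need to cooperate, except for using an advisory lock (which exists only to hold off the readers while the file is being modified).

Unfortunately, the interface currently requires the CAP_SYS_ADMIN capability, which you definitely do not want random CGI programs to have.

Advisory file locking and the inotify interface would suffice, and I believe they are best suited for this when both the readers and the writers open and close the file for each set of operations. The program flow for the reader(s) in this case is: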

Initialize inotify interface
Add inotify watch for IN_CREATE and IN_CLOSE_WRITE for "file"

Open "file" read-only
    Obtain shared/read-lock
        Read contents
    Release lock
Close "file"

Loop:
    Read events from inotify descriptor.
    If IN_CREATE or IN_CLOSE_WRITE for "file":
        Open "file" read-only
            Obtain shared/read-lock
                Read contents
            Release lock
        Close "file"
        Do something with file contents

The writer is still just:

    # Initial read of file contents
    Open "file" for read-only
        Obtain shared/read-lock on "file"
            Read contents
        Release lock
    Close "file"

    # Whenever wishes to modify file:
    Open "file" for read-write
        Obtain exclusive/write-lock
            Modify file
        Release lock
    Close "file"
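A rough C rendering of the open/lock/access/unlock/close pattern, using flock(2) as mentioned in the question. The function names `locked_read` and `locked_write` are hypothetical, and the writer here replaces the whole contents, which is an assumption of this sketch. The file is deliberately opened without O_TRUNC so that truncation only happens once the exclusive lock is held.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to `size` bytes of `path` under a shared (read) lock.
   Returns the number of bytes read, or -1 on error. */
ssize_t locked_read(const char *path, void *buf, size_t size)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    if (flock(fd, LOCK_SH) == -1) {
        close(fd);
        return -1;
    }
    ssize_t n = read(fd, buf, size);
    flock(fd, LOCK_UN);
    close(fd);
    return n;
}

/* Replace the contents of `path` under an exclusive (write) lock.
   Returns 0 on success, -1 on error. */
int locked_write(const char *path, const void *data, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd == -1)
        return -1;
    /* Truncate only after the lock is held, so readers never see
       a half-emptied file while they hold the shared lock. */
    if (flock(fd, LOCK_EX) == -1 || ftruncate(fd, 0) == -1) {
        close(fd);
        return -1;
    }
    const char *p = data;
    int rc = 0;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0) {
            rc = -1;
            break;
        }
        p += n;
        len -= (size_t)n;
    }
    flock(fd, LOCK_UN);
    if (close(fd) == -1)   /* the close is what raises IN_CLOSE_WRITE */
        rc = -1;
    return rc;
}
```

The writer needs no explicit "signal readers" step in this variant: closing a descriptor that was open for writing is what produces the IN_CLOSE_WRITE event the readers wait for.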

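Even if the writers do not obtain the lock, the readers will be notified when a writer closes the file; the only risk is that another set of changes is written (by another modifier that ignores the locks) while the readers are reading the file.

Even if a modifier replaces the file with a new one, the readers will be correctly notified when the new file is ready (either when it is renamed or linked over the old one, or when the creator of the new file closes it). It is important to note that if the readers keep the file open, their file descriptors will not magically jump to the new file; they will only see the old (possibly deleted) contents.

If for some important reason the readers and writers do not close the file, the readers can still use inotify, but with the IN_MODIFY mark, to be notified whenever the file is truncated or written to. In this case, it is important to remember that if the file is subsequently replaced (renamed over, or deleted and recreated), the readers and writers will not see the new file. They will keep operating on the old file, whose contents are no longer visible in the filesystem.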

Program flow for the reader:

Initialize inotify interface
Add inotify watch for IN_MODIFY for "file"

Open "file" read-only
    Obtain shared/read-lock
        Read contents
    Release lock

    Loop:
        Read events from inotify descriptor.
        If IN_MODIFY for "file":
            Obtain shared/read-lock on "file"
                Read contents
            Release lock
            Do something with file contents
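For the variant where the reader keeps the file open, one step of the loop might look like the sketch below. It assumes the caller has already created the inotify descriptor `ifd` with an IN_MODIFY watch on the file itself and holds a read-only descriptor `fd`; the function name `reread_on_modify` is hypothetical. pread is used so that each re-read starts at offset 0 regardless of where the previous read stopped.

```c
#include <errno.h>
#include <sys/file.h>
#include <sys/inotify.h>
#include <sys/types.h>
#include <unistd.h>

/* Sleep until an IN_MODIFY event arrives on `ifd`, then re-read the
   file from its beginning through the already-open descriptor `fd`,
   holding a shared lock while reading.
   Returns the number of bytes read, or -1 on error. */
ssize_t reread_on_modify(int ifd, int fd, void *buf, size_t size)
{
    char evbuf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));

    for (;;) {
        ssize_t len = read(ifd, evbuf, sizeof evbuf);  /* blocks */
        if (len == -1 && errno == EINTR)
            continue;
        if (len <= 0)
            return -1;

        int modified = 0;
        for (char *p = evbuf; p < evbuf + len; ) {
            const struct inotify_event *ev = (const struct inotify_event *)p;
            if (ev->mask & IN_MODIFY)
                modified = 1;
            p += sizeof(struct inotify_event) + ev->len;
        }
        if (modified)
            break;
    }

    if (flock(fd, LOCK_SH) == -1)
        return -1;
    ssize_t n = pread(fd, buf, size, 0);   /* always from offset 0 */
    flock(fd, LOCK_UN);
    return n;
}
```

A writer that modifies the file in several write calls produces several IN_MODIFY events, so the shared lock is what keeps the reader from seeing a partially written state.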

The writer's program flow is still much the same:

    # Initial read of file contents
    Open "file" for read-only
        Obtain shared/read-lock on "file"
            Read contents
        Release lock
    Close "file"

    Open "file" for read-write

    # Whenever writer wishes to modify the file:
    Obtain exclusive/write-lock
        Modify file
    Release lock

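It may be important to note that inotify events occur after the fact. There is usually some small latency, which may depend on the load on the machine. So, if a prompt response to file changes is important for the system to work properly, you may have to use the mutex or rwlock and condition variable in shared memory approach instead.

In my experience, these latencies tend to be shorter than the typical human reaction interval. Therefore I consider the inotify interface fast and reliable enough at human timescales, and I suggest you do too, but not at millisecond and sub-millisecond machine timescales.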
