How to use an eventfd with level triggered behaviour on epoll?
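Registering a level-triggered eventfd with epoll_ctl fires only once if the eventfd counter is not decremented. To summarize the problem: I have observed that the epoll flags (EPOLLET, EPOLLONESHOT, or none for level-triggered behaviour) all behave the same. In other words, they have no effect.
Could you confirm this bug?
I have a multithreaded application. Each thread waits for new events with epoll_wait on the same epollfd. To terminate the application gracefully, all threads have to be woken up. My idea was to use an eventfd counter (EFD_SEMAPHORE|EFD_NONBLOCK) with level-triggered epoll behaviour to wake them all up together. (Ignoring the thundering herd problem, since there are only a few file descriptors.)
For example, for 4 threads you write 4 to the eventfd. I expected epoll_wait to return again and again until the counter had been decremented (read) 4 times. Instead, epoll_wait returns only once per write.
Yes, I have read all the related manuals carefully ;)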
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>
#include <pthread.h>

static int event_fd = -1;
static int epoll_fd = -1;

void *thread(void *arg)
{
    (void) arg;
    for (;;) {
        struct epoll_event event;
        /* retry on an interrupted or failed wait so event is never read uninitialized */
        if (epoll_wait(epoll_fd, &event, 1, -1) < 1)
            continue;
        /* handle events */
        if (event.data.fd == event_fd && (event.events & EPOLLIN)) {
            uint64_t val = 0;
            eventfd_read(event_fd, &val);
            break;
        }
    }
    return NULL;
}

int main(void)
{
    epoll_fd = epoll_create1(0);
    event_fd = eventfd(0, EFD_SEMAPHORE | EFD_NONBLOCK);

    struct epoll_event event;
    event.events = EPOLLIN;
    event.data.fd = event_fd;
    epoll_ctl(epoll_fd, EPOLL_CTL_ADD, event_fd, &event);

    enum { THREADS = 4 };
    pthread_t thrd[THREADS];
    for (int i = 0; i < THREADS; i++)
        pthread_create(&thrd[i], NULL, &thread, NULL);

    /* let threads park internally (kernel does readiness check before sleeping) */
    usleep(100000);
    eventfd_write(event_fd, THREADS);

    for (int i = 0; i < THREADS; i++)
        pthread_join(thrd[i], NULL);
}
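For a single waiter, level-triggered readiness does persist while the counter is non-zero. The following is a minimal single-threaded sketch of that expectation, not part of the original program. drain_level_triggered is a hypothetical helper name, and the sketch assumes efd was created with EFD_SEMAPHORE | EFD_NONBLOCK and registered on epfd with EPOLLIN.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>

/* Hypothetical helper (not from the original post): polls without blocking
 * and consumes one semaphore token per reported event. Returns how many
 * times epoll reported the eventfd as readable. */
static int drain_level_triggered(int epfd, int efd)
{
    assert(epfd >= 0 && efd >= 0);

    int reported = 0;
    for (;;) {
        struct epoll_event ev;
        int n = epoll_wait(epfd, &ev, 1, 0);   /* timeout 0: never block */
        if (n <= 0)
            break;                             /* nothing ready, or error */

        eventfd_t val = 0;
        if (eventfd_read(efd, &val) != 0)      /* semaphore mode: takes 1 */
            break;
        reported++;
    }
    return reported;
}
```

Called after eventfd_write(efd, 4), the helper should see the descriptor reported once per remaining token. The multithreaded program differs in that several threads are already blocked in epoll_wait when the write happens.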
When you write to an eventfd, the function eventfd_signal is called. It contains the following line, which performs the wake-up:
wake_up_locked_poll(&ctx->wqh, EPOLLIN);
wake_up_locked_poll is a macro:
#define wake_up_locked_poll(x, m) \
    __wake_up_locked_key((x), TASK_NORMAL, poll_to_key(m))
__wake_up_locked_key is defined as:
void __wake_up_locked_key(struct wait_queue_head *wq_head, unsigned int mode, void *key)
{
    __wake_up_common(wq_head, mode, 1, 0, key, NULL);
}
Finally, __wake_up_common is declared as:
/*
 * The core wakeup function. Non-exclusive wakeups (nr_exclusive == 0) just
 * wake everything up. If it's an exclusive wakeup (nr_exclusive == small +ve
 * number) then we wake all the non-exclusive tasks and one exclusive task.
 *
 * There are circumstances in which we can try to wake a task which has already
 * started to run but is not in state TASK_RUNNING. try_to_wake_up() returns
 * zero in this (rare) case, and we handle it by continuing to scan the queue.
 */
static int __wake_up_common(struct wait_queue_head *wq_head, unsigned int mode,
                            int nr_exclusive, int wake_flags, void *key,
                            wait_queue_entry_t *bookmark)
Note the nr_exclusive argument, which is passed as 1 here: writing to an eventfd wakes only one exclusive waiter.
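To make the role of nr_exclusive concrete, here is a simplified userspace model of the scan described in the kernel comment. It is only an illustration: struct waiter and model_wake_up are hypothetical names, every wake-up is assumed to succeed, and the wake-up callbacks, flags and bookmark handling of the real kernel function are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a wait queue entry; not a kernel type. */
struct waiter {
    int exclusive;   /* 1 if the entry was queued as an exclusive waiter */
    int woken;       /* set to 1 once the model has "woken" this entry */
};

/* Simplified model of the wake-up scan: walk the queue in order, wake each
 * entry, and stop once nr_exclusive exclusive entries have been woken.
 * nr_exclusive == 0 means "wake everything". Returns the number woken. */
static int model_wake_up(struct waiter *q, size_t n, int nr_exclusive)
{
    assert(q != NULL || n == 0);

    int woken = 0;
    for (size_t i = 0; i < n; i++) {
        q[i].woken = 1;
        woken++;
        if (q[i].exclusive && nr_exclusive > 0 && --nr_exclusive == 0)
            break;
    }
    return woken;
}
```

In this model, a queue of four exclusive waiters scanned with nr_exclusive set to 1 stops after the first entry, which matches the behaviour observed in the question.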
What does exclusive mean? The epoll_ctl man page gives us some insight:
EPOLLEXCLUSIVE (since Linux 4.5):
Sets an exclusive wakeup mode for the epoll file descriptor that is being attached to the target file descriptor, fd. When a wakeup event occurs and multiple epoll file descriptors are attached to the same target file using EPOLLEXCLUSIVE, one or more of the epoll file descriptors will receive an event with epoll_wait(2).
You do not use EPOLLEXCLUSIVE when adding your event, but to wait with epoll_wait every thread has to put itself on a wait queue. The function do_epoll_wait performs the wait by calling ep_poll. By following the code you can see that it adds the current thread to a wait queue at line #1903:
__add_wait_queue_exclusive(&ep->wq, &wait);
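This explains what is going on: epoll waiters are exclusive, so only one thread is woken up. This behaviour was introduced in v2.6.22-rc1 and the relevant change has been discussed here.
To me this looks like a bug in the eventfd_signal function: in semaphore mode it should perform the wake-up with nr_exclusive equal to the value written.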
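So your options are:
- Create a separate epoll descriptor for each thread (might not work with your design because of scaling problems)
- Put a mutex around it (scaling problems)
- Use poll, probably on both the eventfd and the epoll descriptor
- Wake each thread separately by writing 1 with eventfd_write 4 times (probably the best you can do).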
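The last option can be sketched as follows. This is a minimal sketch under assumptions, not a tested shutdown routine: wake_each_thread and take_token are hypothetical helper names, and efd is assumed to be an eventfd created with EFD_SEMAPHORE | EFD_NONBLOCK.

```c
#include <assert.h>
#include <errno.h>
#include <sys/eventfd.h>

/* Hypothetical helper: posts one token per waiting thread, one write each,
 * so that every write triggers its own wake-up.
 * Returns 0 on success, -1 if a write fails. */
static int wake_each_thread(int efd, int nthreads)
{
    assert(efd >= 0 && nthreads >= 0);

    for (int i = 0; i < nthreads; i++) {
        if (eventfd_write(efd, 1) != 0)
            return -1;
    }
    return 0;
}

/* Hypothetical helper for the worker side: consumes one token.
 * Returns 1 if a token was taken, 0 if none was available, -1 on error. */
static int take_token(int efd)
{
    eventfd_t val = 0;

    if (eventfd_read(efd, &val) == 0)
        return 1;
    return errno == EAGAIN ? 0 : -1;
}
```

In the program from the question, main would call wake_each_thread(event_fd, THREADS) instead of a single eventfd_write(event_fd, THREADS), and each worker would call take_token(event_fd) before leaving its loop.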