Helgrind reports data race on single thread

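When I create a single thread that only reads and prints its argument and then returns, Helgrind reports many possible data races, even though the main thread calls pthread_join immediately after creating the new thread.

Here is the thread setup (a reduced version that still reproduces the problem):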

void liveness(cfg_t* cfg)
{
    vertex_t*               u;
    size_t                  i;
    size_t*                 arg;
    pthread_t               thread;
    pthread_mutex_t*        lock;

    lock = (pthread_mutex_t*) malloc(sizeof(pthread_mutex_t));
    if (lock == NULL) {
        printf("Error when allocating memory for locks");
    }
    if (pthread_mutex_init(lock, NULL) != 0) {
        printf("Error when creating lock\n");
    }

    arg = malloc(sizeof(size_t));
    (*arg) = 0;
    if (pthread_create(&thread, NULL, thread_start, arg)) {
        perror("Error when creating thread\n");
        exit(1);
    }
    if (pthread_join(thread, NULL)) {               
        perror("Error when joining thread\n");
        exit(1);
    }
    free(lock);
    free(arg); // line 244 (paralleldataflow.c:244 in the Helgrind report)
}

This is thread_start:

void* thread_start(void* arguments)
{
    size_t          index;
    index = *(size_t*) arguments; // line 155 (paralleldataflow.c:155 in the Helgrind report)
    printf("Thread started! Index %zu\n", index);
    fflush(stdout);
    return NULL;
}

The output is correct (Thread started! Index 0), but Helgrind produces the following output:

==3489== Possible data race during write of size 8 at 0x4003330 by thread #1
==3489== Locks held: none
==3489==    at 0x42970F: _int_free (in /h/d9/b/dat11ote/courses/edan25/lab4home/live)
==3489==    by 0x402D5C: liveness (paralleldataflow.c:244)
==3489==    by 0x401E4F: main (main.c:134)
==3489==
==3489== This conflicts with a previous read of size 8 by thread #2
==3489== Locks held: none
==3489==    at 0x402C4C: thread_start (paralleldataflow.c:155)
==3489==    by 0x4040B1: start_thread (pthread_create.c:312)
==3489==    by 0x4500E8: clone (in /h/d9/b/dat11ote/courses/edan25/lab4home/live)

along with 30 errors from 25 contexts in total. If I move the return statement so that it comes before the thread argument is read, as in

void* thread_start(void* arguments)
{
    size_t          index;
    return NULL;
}

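then everything is fine. I pass the -pthreads and -static flags to gcc. If I remove the printf and fflush, the error shown above remains, but all the other errors disappear. Those other errors look like this: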

Possible data race during write of size 8 at 0x6D7878 by thread #1
Locks held: none
at 0x40F449: vfprintf (in /h/../live)
by 0x419075: printf (in /h/../live)
by 0x401E76: main (main.c:137)
This conflicts with a previous write of size 8 by thread #2
Locks held: none
at 0x40F449: vfprintf (in /h/../live)
by 0x419075: printf (in /h/../live)
by 0x402C68: thread_start (in /h/../live)
by 0x404061: start_thread (pthread_create.c:312)
by 0x44B2A8: clone (in /h/../live)

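If you link with -static, valgrind/helgrind cannot replace or wrap the set of functions that must be replaced or wrapped for Helgrind to work properly.

Typically, for Helgrind to work properly, malloc/free/... must be replaced, and functions such as pthread_create/pthread_join/... must be wrapped by Helgrind.

Linking statically means these functions are neither replaced nor wrapped, so Helgrind never sees the synchronization that pthread_join provides, which results in a large number of false positives.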
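To check this on your side, a minimal sketch along the following lines may help. It keeps the same create/join/free pattern as the question, but the helper name run_single_thread and the file name repro.c are hypothetical and not part of the original program. The build commands in the comment assume a typical Linux gcc setup; the relevant point is only that -static is dropped so the binary links dynamically.

```c
/*
 * Build dynamically (no -static), then run under Helgrind:
 *   gcc -g -pthread repro.c -o repro
 *   valgrind --tool=helgrind ./repro
 * Call run_single_thread(0) from your own main().
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Same shape as the thread_start in the question: read the argument, print it. */
static void *thread_start(void *arguments)
{
    size_t index = *(size_t *)arguments;
    printf("Thread started! Index %zu\n", index);
    fflush(stdout);
    return NULL;
}

/*
 * Hypothetical helper (not in the original code): creates one thread,
 * joins it, and only then frees the argument. Returns 0 on success.
 */
int run_single_thread(size_t index)
{
    pthread_t thread;
    size_t *arg = malloc(sizeof *arg);

    if (arg == NULL) {
        return -1;
    }
    *arg = index;

    if (pthread_create(&thread, NULL, thread_start, arg) != 0) {
        free(arg);          /* the thread never started, so nobody else uses arg */
        return -1;
    }
    if (pthread_join(thread, NULL) != 0) {
        return -1;          /* arg is deliberately leaked: the thread may still read it */
    }

    free(arg);              /* ordered after the thread's read by pthread_join */
    return 0;
}
```

With dynamic linking, Helgrind can intercept pthread_create, pthread_join, malloc and free, so the free after the join should no longer be flagged as conflicting with the read in thread_start.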