Using pthread_join() on multiple threads giving unexpected behavior

Question

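I'm learning how to use threads in C and have run into a problem when creating them. I'm writing a program that takes 2 or more file names as command-line arguments, counts the number of bytes in each file in its own thread, and then prints the name of the largest file. When I call pthread_join() directly after creating each thread, the program runs as intended. However, I know this isn't how threads are supposed to be used, because it defeats the purpose. When I call pthread_join() in a for loop after creating all the threads, the program doesn't work correctly. Can anyone tell me what I'm doing wrong? All help is appreciated. Here is my main function.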

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; //mutex for changing max_bytes and max_name
int max_bytes = 0;
char max_name[100];

struct arg_struct{ //struct to hold args to pass the threads
        int fd;
        char name[100];
};

int main(int argc, char* argv[])
{
        if(argc < 3){ //checks for correct number of arguments passed
                perror("Wrong number of arguments");
                return EXIT_FAILURE;
        }

        int arg_num = argc - 1; //holds number of arguments passed

        pthread_t threadid[arg_num]; //array of thread IDs
        struct arg_struct args;
        for(int i = 0; i < arg_num; i++){
                args.fd = open(argv[i+1], O_RDONLY);
                memcpy(args.name, argv[i+1], sizeof(args.name)); //copies file name into arg_struct
                int thread_err = pthread_create(&threadid[i], NULL, count_bytes, (void*)&args); //create thread by calling count_bytes and passing it a struct of args
                //pthread_join(threadid[i], NULL);
                if(thread_err != 0){
                        perror("pthread_create failed");
                        return EXIT_FAILURE;
                }
        }

        for(int i = 0; i < arg_num; i++){
                pthread_join(threadid[i], NULL);
        }

        printf("%s is the largest of the submitted files\n", max_name);

        return 0;
}

This is the function that the threads run.

void *count_bytes(void* arguments)
{
        struct arg_struct *args = (struct arg_struct*)arguments; //casting arguments back to struct from void*
        int fd = args -> fd;
        char name[100];
        memcpy(name, args -> name, sizeof(name)); //copies file name into name from args.name
        int bytes = 0;

        int size = 10;
        char*  buffer = (char*) malloc(size);
        if(buffer == NULL){
                perror("malloc failed");
                exit(EXIT_FAILURE);
        }
        int buffer_count = 0;
        for(int i = 0; i < size; i++){
                buffer[i] = '\0'; //sets all elements to '\0' to determine end of file later
        }
        int read_return = read(fd, &buffer[buffer_count], 1);
        if(read_return == -1){
                perror("reading failed");
                exit(EXIT_FAILURE);
        }

        while(buffer[buffer_count] != '\0'){
                bytes++;
                buffer_count++;
                buffer[buffer_count] = '\0'; //sets previous element to '\0' to determine end of file later
                if(buffer_count >= size){
                        buffer_count = 0; //buffer will hold up to 10 elements and then go back to the beginning
                }
                read_return = read(fd, &buffer[buffer_count], 1);
                if(read_return == -1){
                        perror("reading failed");
                        exit(EXIT_FAILURE);
                }
        }

        printf("%s has %d bytes\n", name, bytes);

        pthread_mutex_lock(&mutex);
        if(bytes > max_bytes){
                max_bytes = bytes;
                memcpy(max_name, name, sizeof(max_name));
        }
        //the mutex is locked above to avoid a race condition
        //max_bytes is set to bytes if bytes is greater than max_bytes
        //the mutex is then unlocked to allow another thread to have access
        pthread_mutex_unlock(&mutex);

        return NULL;
}

In case it's helpful, these are the two outputs produced when it runs.

Correct

./a.out another buffered_readword.c
another has 8 bytes
buffered_readword.c has 3747 bytes
buffered_readword.c is the largest of the submitted files

Incorrect

./a.out another buffered_readword.c
buffered_readword.c has 1867 bytes
buffered_readword.c has 1881 bytes
buffered_readword.c is the largest of the submitted files

Answer 1

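The problem is that there is only one args structure. After pthread_create is called, the new thread may not run immediately, and in the meantime the loop in main overwrites args with the values for the next file. By the time the threads run, they may well all see the same args values. This is consistent with the incorrect output in the question: both threads print the name of the last file, and because they read through the same file descriptor, each one counts only part of its bytes. Calling pthread_join inside the thread-creation loop "fixes" this because it ensures that each thread finishes before args is updated to the next value.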

To fix it properly, pass a different args to each thread. Illustrative code:

struct arg_struct args[arg_num];
for(int i = 0; i < arg_num; i++){
    args[i].fd = open(argv[i+1], O_RDONLY);
    memcpy(args[i].name, argv[i+1], sizeof(args[i].name));
    int thread_err = pthread_create(&threadid[i], NULL, count_bytes, &args[i]); 
    ....
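
For completeness, here is a minimal sketch of how the whole thing might look with one arg_struct per thread. It is not the asker's exact program: find_largest is a hypothetical helper standing in for the body of main, max_bytes is widened to long, the name is copied with snprintf instead of memcpy so that no more than the argument string is read, and end of file is detected from read() returning 0 rather than from a '\0' sentinel. Treat it as one possible arrangement under those assumptions.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; /* guards max_bytes and max_name */
static long max_bytes;
static char max_name[100];

struct arg_struct {     /* one instance per thread, never shared */
    int fd;
    char name[100];
};

static void *count_bytes(void *arguments)
{
    struct arg_struct *args = arguments;
    char buffer[4096];
    long bytes = 0;
    ssize_t n;

    /* Assumption: read() returning 0 marks end of file; this replaces
       the '\0' sentinel used in the question. */
    while ((n = read(args->fd, buffer, sizeof buffer)) > 0)
        bytes += n;
    if (n == -1)
        perror("reading failed");
    close(args->fd);
    if (n == -1)
        return NULL;

    printf("%s has %ld bytes\n", args->name, bytes);

    pthread_mutex_lock(&mutex);
    if (bytes > max_bytes) {
        max_bytes = bytes;
        snprintf(max_name, sizeof max_name, "%s", args->name);
    }
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Hypothetical helper: counts every file in its own thread.
   Returns 0 on success, -1 if a file or a thread could not be set up. */
int find_largest(int count, char *names[])
{
    assert(count > 0);
    pthread_t threadid[count];
    struct arg_struct args[count];  /* separate arguments for every thread */
    int started = 0;
    int status = 0;

    max_bytes = 0;
    max_name[0] = '\0';

    for (int i = 0; i < count; i++) {
        args[i].fd = open(names[i], O_RDONLY);
        if (args[i].fd == -1) {
            perror(names[i]);
            status = -1;
            break;
        }
        snprintf(args[i].name, sizeof args[i].name, "%s", names[i]);
        int err = pthread_create(&threadid[i], NULL, count_bytes, &args[i]);
        if (err != 0) {  /* pthread_create returns the error code, it does not set errno */
            fprintf(stderr, "pthread_create failed: %s\n", strerror(err));
            close(args[i].fd);
            status = -1;
            break;
        }
        started++;
    }

    /* args must stay alive until every started thread has been joined */
    for (int i = 0; i < started; i++)
        pthread_join(threadid[i], NULL);

    return status;
}
```

From main this could be called as find_largest(argc - 1, argv + 1) once the argument count has been checked, with max_name printed afterwards. Because every thread now has its own args[i], the joins can safely happen in a second loop, so the threads actually run concurrently.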


c

multithreading

pthreads

pthread-join