Memory leak in C, most likely due to realloc

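I have a program that should read input, remember the longest string, and print it at EOF. My code works, but when I run it through a debugger, a memory leak is detected. I am compiling on Windows and have no proper tool like Valgrind, so I don't get much information about the error. The only things I can think of that could cause this leak are the realloc() or free() calls. However, I'm not proficient enough in C to understand what the problem is.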

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>

int main(void)
{
    char *p;
    char *line;
    int sc;
    p = (char*) malloc ( sizeof(char) );
    line = (char*) malloc ( sizeof(char) );
    int count = 0;
    int max = 0;
    p[count] = 0;
    while ( ( sc = getchar()) != EOF ) {
        if ( p == NULL ) {
            p = (char*) realloc ( p, sizeof(char) );
        }
        if ( isalpha(sc) ) {
            p[count] = sc;
            count++;
            p = (char*) realloc( p, (count+1)*sizeof(char) );
            p[count] = 0;
        } else if ( sc == '\n' || sc == ' ' ) {
            if ( count > max ) {
                line = (char*) realloc( line, (count+1)*sizeof(char) );
                strcpy( line, p );
                max = count;
            } else if ( count == 0) {
                printf("%d characters in longest word: %s\n", max, line);
                free(line);
                free(p);
                break;
            }
            count = 0;
        }
    }
    return 0;
}

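I think that while debugging, after the free() calls, you checked the values of the pointers p and line and still saw the string each one pointed to. If so, that is not a memory leak: free() does not change the value of the pointer, nor does it write 0 or '\0' into the memory. free() only releases the memory block, so that a later call to an allocation function can hand that memory out again. That is why, after calling free(), we always assign NULL to the pointer, like p = NULL;.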

Modify the code as follows and try again:

free(line);
free(p);
line = NULL;
p = NULL;
break;

Following your coding approach, I would use calloc (which is what you need) instead of malloc to get rid of it:

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>

int main(void)
{
    char *p = NULL;
    char *line = NULL;
    int sc;
    p = calloc ( sizeof(char), 1);
    line = calloc ( sizeof(char),1 );
    size_t count = 0;
    size_t max = 0;
    p[count] = 0;
    while ( ( sc = getchar()) != EOF ) {
        if ( p == NULL ) {
            p = realloc ( p, sizeof(char) );
        }
        if ( isalpha(sc) ) {
            p[count] = (char)sc;
            count++;
            p = realloc( p, (count+1) * sizeof(char) );
            p[count] = 0;
        } else if ( sc == '\n' || sc == ' ' ) {
            if ( count > max ) {
                line = realloc( line, (count+1)*sizeof(char) );
                strcpy( line, p );
                max = count;
            } else if ( count == 0) {
                printf("%zu characters in longest word: %s\n", max, line);
                free(line);
                free(p);
                break;
            }
            count = 0;
        }
    }
    return 0;
}

Valgrind output:

==4362== Memcheck, a memory error detector
==4362== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==4362== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==4362== Command: ./program
==4362== 
Hello World

5 characters in longest word: Hello
==4362== 
==4362== HEAP SUMMARY:
==4362==     in use at exit: 0 bytes in 0 blocks
==4362==   total heap usage: 15 allocs, 15 frees, 2,096 bytes allocated
==4362== 
==4362== All heap blocks were freed -- no leaks are possible
==4362== 
==4362== For counts of detected and suppressed errors, rerun with: -v
==4362== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Now try to understand why calloc is used here rather than malloc.

Edit

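Are you sure this is all about a leak, and has nothing to do with an uninitialised value? I ask because the program works fine if you type a letter; the problem shows up with digits, hence the calloc suggestion.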

Here is the Valgrind output for your code without any modification:

==5042== Memcheck, a memory error detector
==5042== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==5042== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==5042== Command: ./program
==5042== 
1   
==5042== Conditional jump or move depends on uninitialised value(s)
==5042==    at 0x4E88CC0: vfprintf (vfprintf.c:1632)
==5042==    by 0x4E8F898: printf (printf.c:33)
==5042==    by 0x40082B: main (in /home/michael/program)
==5042==  Uninitialised value was created by a heap allocation
==5042==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5042==    by 0x400715: main (in /home/michael/program)
==5042== 
0 characters in longest word: 
==5042== 
==5042== HEAP SUMMARY:
==5042==     in use at exit: 0 bytes in 0 blocks
==5042==   total heap usage: 4 allocs, 4 frees, 2,050 bytes allocated
==5042== 
==5042== All heap blocks were freed -- no leaks are possible
==5042== 
==5042== For counts of detected and suppressed errors, rerun with: -v
==5042== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

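You say you "have tried moving both free() to the end of the program", but I don't believe you. The problem seems to be that this is a poorly designed program that does not always reach the free() statements: when sc is a space or newline, count is unlikely to be 0, and after count has been reset to 0, the next character read into sc is unlikely to be a space or newline.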

Simply moving the calls to free() to the end of the program fixes the memory leak, which Valgrind does report for the posted code:

λ> valgrind --tool=memcheck ./a.out
==2967== Memcheck, a memory error detector
==2967== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==2967== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==2967== Command: ./a.out
==2967== 
this is a test
==2967== 
==2967== HEAP SUMMARY:
==2967==     in use at exit: 10 bytes in 2 blocks
==2967==   total heap usage: 14 allocs, 12 frees, 42 bytes allocated
==2967== 
==2967== LEAK SUMMARY:
==2967==    definitely lost: 10 bytes in 2 blocks
==2967==    indirectly lost: 0 bytes in 0 blocks
==2967==      possibly lost: 0 bytes in 0 blocks
==2967==    still reachable: 0 bytes in 0 blocks
==2967==         suppressed: 0 bytes in 0 blocks
==2967== Rerun with --leak-check=full to see details of leaked memory
==2967== 
==2967== For counts of detected and suppressed errors, rerun with: -v
==2967== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

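But the code has plenty of other problems. First, never store the result of a call to realloc() in the only copy of the pointer to the memory being reallocated; a null pointer may be returned, and if it is, you have a memory leak and have lost your data. Instead, use a temporary variable to hold the result, and assign this value to the original pointer only after checking for a null pointer. If a null pointer is returned, the simplest solution may be to terminate the program with an error message, as shown below; the error can be handled in other ways, so long as it is handled.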

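The posted code is more complicated than it needs to be, with multiple calls to malloc() and realloc(). Instead, initialize p and line to NULL, and reallocate only when needed. There is no need to start by allocating space for 1 char; this can be done when the first character needs to be stored. Also, there is no need to cast the result of malloc() in C (it is different in C++); and sizeof(char) is always 1, so it is redundant and only clutters the code.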

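The fundamental problem in the posted code seems to be that count is incremented as characters are read, so when a space or newline arrives, count may well not be 0, and the condition that frees the memory and exits may never be met. Rather than patching the complicated conditions, rethink the flow of the program.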

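After a character is read (unless EOF is encountered), if the character is a letter, count should be incremented and p should be reallocated. If this step succeeds, the character should be stored in p[], which should then be null-terminated.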

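Otherwise, if the character is a \n or a space, max should be compared with count. If count is larger, line should be reallocated. If this step succeeds, the string pointed to by p should be copied to line[], and max is assigned the value of count. In either case, count is then reset to 0.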

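After the loop terminates, the result is printed only if there were words in the input. The deallocations can then take place before the program terminates.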

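The isalpha() function, and similar functions in ctype.h, expect an int value in the range of unsigned char (or EOF). Normally you need to cast the argument to unsigned char to avoid undefined behavior. In this case, however, the cast is unnecessary, because getchar() returns an int value in the range of unsigned char (or EOF).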

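You might also consider using the isspace() function in place of sc == '\n' || sc == ' '. This would allow other whitespace characters, such as '\t', to separate words in the input. As written in the OP, the input "one\tword" (where '\t' is a tab character) would result in the output: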

7 characters in longest word: oneword

Here is a modified version of the posted code:

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>

int main(void)
{
    char *p = NULL;
    char *line = NULL;
    int sc;
    int count = 0;
    int max = 0;

    while ((sc = getchar()) != EOF) {
        if (isalpha(sc)) {
            ++count;                                    // read a letter
            char *temp = realloc(p, count + 1);         // +1 for '\0'
            if (temp == NULL) {                         // check allocation
                perror("Failure to reallocate p");
                exit(EXIT_FAILURE);
            }
            p = temp;                                   // OK to reassign p
            p[count-1] = sc;                            // store character
            p[count] = 0;                               // add null-terminator
        } else if (isspace(sc)) {
            if (count > max) {
                char *temp = realloc(line, count + 1);  // +1 for '\0'
                if (temp == NULL) {                     // check allocation
                    perror("Failure to reallocate line");
                    exit(EXIT_FAILURE);
                }
                line = temp;                            // OK to reassign line
                strcpy(line, p);
                max = count;
            }
            count = 0;
        }
    }

    if (max > 0) {
        printf("%d characters in longest word: %s\n", max, line);        
    } else {
        puts("No words in input");
    }

    free(line);
    free(p);

    return 0;
}

Here is a clean bill of health from Valgrind:

λ> valgrind --tool=memcheck ./a.out
==4753== Memcheck, a memory error detector
==4753== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==4753== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==4753== Command: ./a.out
==4753== 
this is a   testrun
7 characters in longest word: testrun
==4753== 
==4753== HEAP SUMMARY:
==4753==     in use at exit: 0 bytes in 0 blocks
==4753==   total heap usage: 16 allocs, 16 frees, 69 bytes allocated
==4753== 
==4753== All heap blocks were freed -- no leaks are possible
==4753== 
==4753== For counts of detected and suppressed errors, rerun with: -v
==4753== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)