double free or corruption (fasttop) with large files

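I am developing a program that reads large CSV files. I developed and tested it with smaller CSV files for debugging, and it works. But when I use the real file (17k lines), it stops working.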

Here is the problematic function (this is the whole file):

#include "split_string.h"

void _add_to_tab(char** ret, char* string, unsigned int len)
{
    ret = realloc(ret, (len + 1)*sizeof(char*));
    ret[len] = string;
}

char** st_split(char* source, const char* delimiter)
{
    unsigned int len = 0;
    char** ret = NULL;
    char* tmp = NULL;

    ret = malloc(1 * sizeof(char**));
    if(ret==NULL)
    {
        return NULL;
    }
    else
    {
        ret[0] = source;
        tmp = strtok(source, delimiter);
        while(tmp!=NULL)
        {
            _add_to_tab(ret, tmp, len);
            len++;
            tmp = strtok(NULL, delimiter);
        }

        return ret;
    }
}

I ran a test, and the breaking point is line 1202: if my CSV has more lines than that, the program fails with the following error:

*** Error in `test.x': double free or corruption (fasttop): 0x00000000018adfc0 ***

I ran it under Valgrind:

valgrind --leak-check=yes test.x

It gives me this:

==2132== Memcheck, a memory error detector
==2132== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==2132== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==2132== Command: test.x
==2132== 
0 3 -1 -1 -1 -1 -1 -1 -1 -1 -1
==2132== Invalid free() / delete / delete[] / realloc()
==2132==    at 0x4C2AF2E: realloc (vg_replace_malloc.c:692)
==2132==    by 0x4011B3: _add_to_tab (split_string.c:5)
==2132==    by 0x40124F: st_split (split_string.c:26)
==2132==    by 0x40109E: lecture_fichier (lecture_fichier.c:108)
==2132==    by 0x40134A: main (test.c:11)
==2132==  Address 0x54e12c0 is 0 bytes inside a block of size 8 free'd
==2132==    at 0x4C2AF2E: realloc (vg_replace_malloc.c:692)
==2132==    by 0x4011B3: _add_to_tab (split_string.c:5)
==2132==    by 0x40124F: st_split (split_string.c:26)
==2132==    by 0x40109E: lecture_fichier (lecture_fichier.c:108)
==2132==    by 0x40134A: main (test.c:11)
==2132== 
==2132== Invalid write of size 8
==2132==    at 0x4011CE: _add_to_tab (split_string.c:6)
==2132==    by 0x40124F: st_split (split_string.c:26)
==2132==    by 0x40109E: lecture_fichier (lecture_fichier.c:108)
==2132==    by 0x40134A: main (test.c:11)
==2132==  Address 0x8 is not stack'd, malloc'd or (recently) free'd
==2132== 
==2132== 
==2132== Process terminating with default action of signal 11 (SIGSEGV)
==2132==  Access not within mapped region at address 0x8
==2132==    at 0x4011CE: _add_to_tab (split_string.c:6)
==2132==    by 0x40124F: st_split (split_string.c:26)
==2132==    by 0x40109E: lecture_fichier (lecture_fichier.c:108)
==2132==    by 0x40134A: main (test.c:11)
==2132==  If you believe this happened as a result of a stack
==2132==  overflow in your program's main thread (unlikely but
==2132==  possible), you can try to increase the size of the
==2132==  main thread stack using the --main-stacksize= flag.
==2132==  The main thread stack size used in this run was 8388608.
==2132== 
==2132== HEAP SUMMARY:
==2132==     in use at exit: 576 bytes in 2 blocks
==2132==   total heap usage: 4 allocs, 2 frees, 600 bytes allocated
==2132== 
==2132== 8 bytes in 1 blocks are definitely lost in loss record 1 of 2
==2132==    at 0x4C2AF2E: realloc (vg_replace_malloc.c:692)
==2132==    by 0x4011B3: _add_to_tab (split_string.c:5)
==2132==    by 0x40124F: st_split (split_string.c:26)
==2132==    by 0x40109E: lecture_fichier (lecture_fichier.c:108)
==2132==    by 0x40134A: main (test.c:11)
==2132== 
==2132== LEAK SUMMARY:
==2132==    definitely lost: 8 bytes in 1 blocks
==2132==    indirectly lost: 0 bytes in 0 blocks
==2132==      possibly lost: 0 bytes in 0 blocks
==2132==    still reachable: 568 bytes in 1 blocks
==2132==         suppressed: 0 bytes in 0 blocks
==2132== Reachable blocks (those to which a pointer was found) are not shown.
==2132== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==2132== 
==2132== For counts of detected and suppressed errors, rerun with: -v
==2132== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)

The problem is here:

void _add_to_tab(char** ret, char* string, unsigned int len)
{
    ret = realloc(ret, (len + 1)*sizeof(char*));
    ret[len] = string;
}

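When realloc returns a different pointer, only the local ret is changed. The new value is never passed back to the caller, so the caller keeps using the previous pointer, which is now invalid.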
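The same effect can be seen in isolation, without realloc. This is only a minimal sketch: the names retarget_by_value, retarget_by_address, a, and b are made up for illustration and do not come from the code above. It also shows the other common way out, which is to pass the address of the pointer.

```c
#include <assert.h>
#include <stddef.h>

static int a = 1;
static int b = 2;

/* The parameter p is a copy of the caller's pointer.
 * Assigning to it changes only that copy. */
static void retarget_by_value(int *p)
{
    p = &b;
    (void)p; /* silence "set but not used" warnings */
}

/* Here the caller passes the address of its pointer,
 * so writing through p updates the caller's variable. */
static void retarget_by_address(int **p)
{
    assert(p != NULL);
    *p = &b;
}
```

Called with a pointer q that points at a, retarget_by_value(q) leaves q pointing at a, while retarget_by_address(&q) makes it point at b.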

Change the function so that it returns the new pointer:

char **_add_to_tab(char** ret, char* string, unsigned int len)
{
    ret = realloc(ret, (len + 1)*sizeof(char*));
    ret[len] = string;
    return ret;
}

and call it like this:

ret = _add_to_tab(ret, tmp, len);

And don't forget to add some error checking (realloc may return NULL).
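One possible shape for that error checking is sketched below. It is not the only way to do it, and several things are assumptions of this sketch rather than part of the original code: the names add_to_tab and st_split_checked, the extra count output parameter, and the choice to return NULL when there are no tokens. The leading underscore is dropped because file-scope identifiers that start with an underscore are reserved in C.

```c
#include <stdlib.h>
#include <string.h>

/* Append string to the array tab, which currently holds len entries.
 * Returns the (possibly moved) array, or NULL if realloc fails.
 * On failure the original tab is untouched and still owned by the caller. */
static char **add_to_tab(char **tab, char *string, size_t len)
{
    char **grown = realloc(tab, (len + 1) * sizeof *grown);
    if (grown == NULL)
        return NULL;
    grown[len] = string;
    return grown;
}

/* Split source in place with strtok and store the token count in *count.
 * Returns a heap-allocated array of pointers into source, or NULL if an
 * allocation fails or if source contains no tokens (*count is 0 then). */
char **st_split_checked(char *source, const char *delimiter, size_t *count)
{
    char **ret = NULL; /* realloc(NULL, n) behaves like malloc(n) */
    size_t len = 0;

    *count = 0;
    for (char *tmp = strtok(source, delimiter);
         tmp != NULL;
         tmp = strtok(NULL, delimiter)) {
        /* Keep the old pointer until we know realloc succeeded. */
        char **grown = add_to_tab(ret, tmp, len);
        if (grown == NULL) {
            free(ret);
            return NULL;
        }
        ret = grown;
        len++;
    }

    *count = len;
    return ret;
}
```

The result of realloc goes into a temporary first, so the old block can still be freed if the call fails. The caller frees the returned array with free(); the tokens themselves point into source and must not be freed separately.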