C getline() memory leak: different behaviours
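I have a question about the function getline(): as reported by valgrind, it seems to behave differently in two memory usage scenarios. I am posting the code for both cases along with an explanation of the behaviour.
I hope someone can point me in the right direction.
First case
getline() is called in a while loop and reads every line of a text file into a buffer. The buffer is freed only once, after the loop ends. In this case valgrind reports no errors (no leak occurs).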
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char* argv[])
{
    char* buffer = NULL;
    size_t bufsize = 0;
    ssize_t nbytes;
    int counter = 0;
    char error = 0;
    FILE* input_fd = fopen(argv[1], "r");
    while ((nbytes = getline(&buffer, &bufsize, input_fd)) != -1)
    {
        counter += 1;
    }
    free(buffer);
    fclose(input_fd);
    return 0;
}
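Second case
The same loop calls a function that in turn calls getline(), passing the same buffer. Again, the buffer is freed only once, after the loop ends, but in this case valgrind reports a memory leak. Indeed, when I run the program and watch its RSS, I can see it grow as the loop proceeds. Note that adding a free() inside the loop (freeing the buffer on every iteration) makes the problem disappear. Here is the code.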
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int my_getline(FILE* lf_fd, char** lf_buffer)
{
    ssize_t lf_nbytes = 0;
    size_t lf_bufsiz = 0;
    lf_nbytes = getline(lf_buffer, &lf_bufsiz, lf_fd);
    if (lf_nbytes == -1)
        return 1;
    return 0;
}

int main(int argc, char* argv[])
{
    char* lf_buffer = NULL;
    size_t bufsize = 0;
    ssize_t nbytes;
    int counter = 0;
    int new_line_counter = 0;
    char error = 0;
    FILE* lf_fd = fopen(argv[1], "r");
    while ((my_getline(lf_fd, &lf_buffer)) == 0)
    {
        // Added to allow measuring the RSS
        sleep(2);
        // If I uncomment this, no memory leak occurs
        //free(lf_buffer);
    }
    free(lf_buffer);
    fclose(lf_fd);
    return 0;
}
Valgrind output
==9604== Memcheck, a memory error detector
==9604== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==9604== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==9604== Command: ./my_getline_x86 /media/sf_Scambio/processes.log
==9604== HEAP SUMMARY:
==9604== in use at exit: 1,194 bytes in 2 blocks
==9604== total heap usage: 8 allocs, 6 frees, 11,242 bytes allocated
==9604==
==9604== 1,194 bytes in 2 blocks are definitely lost in loss record 1 of 1
==9604== at 0x483DFAF: realloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==9604== by 0x48E371D: getdelim (iogetdelim.c:102)
==9604== by 0x1092B3: my_getline (my_getline.c:14)
==9604== by 0x10956A: main (my_getline.c:38)
==9604==
==9604== LEAK SUMMARY:
==9604== definitely lost: 1,194 bytes in 2 blocks
==9604== indirectly lost: 0 bytes in 0 blocks
==9604== possibly lost: 0 bytes in 0 blocks
==9604== still reachable: 0 bytes in 0 blocks
==9604== suppressed: 0 bytes in 0 blocks
==9604==
==9604== For lists of detected and suppressed errors, rerun with: -s
==9604== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
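The first program is fine.
The problem with the second one comes from the buffer-length argument to getline(). Your my_getline() always sets it to 0, which means getline() allocates a new buffer on every call (at least with the glibc implementation you are using; see below). Change it to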
int my_getline(FILE* lf_fd, char** lf_buffer, size_t* lf_bufsiz)
{
    ssize_t lf_nbytes = 0;
    lf_nbytes = getline(lf_buffer, lf_bufsiz, lf_fd);
    if (lf_nbytes == -1)
        return 1;
    return 0;
}
and, when you call it, pass a pointer to a size_t variable that is initialized to 0. The existing bufsize variable in main() looks suitable:
//...
while ((my_getline(lf_fd, &lf_buffer, &bufsize)) == 0)
// ...
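To make the fix concrete, here is a minimal sketch of the corrected wrapper together with a loop that uses it. The count_lines() helper is hypothetical (it is not part of the original program) and takes an already opened stream, so the buffer pointer and its size stay together in the caller across iterations.

```c
#define _POSIX_C_SOURCE 200809L  /* expose getline() on POSIX systems */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Wrapper from the answer: the caller owns both the buffer and its size,
 * so getline() can see the real capacity on every call. */
int my_getline(FILE *lf_fd, char **lf_buffer, size_t *lf_bufsiz)
{
    ssize_t lf_nbytes = getline(lf_buffer, lf_bufsiz, lf_fd);
    return (lf_nbytes == -1) ? 1 : 0;
}

/* Hypothetical helper: counts the lines in an open stream. */
long count_lines(FILE *lf_fd)
{
    char *lf_buffer = NULL;  /* getline() allocates on the first call */
    size_t bufsize = 0;      /* kept alive across calls, unlike the leaky version */
    long counter = 0;

    assert(lf_fd != NULL);
    while (my_getline(lf_fd, &lf_buffer, &bufsize) == 0)
        counter += 1;

    free(lf_buffer);         /* a single free after the loop is now enough */
    return counter;
}
```

A caller would open the file with fopen(), check the result for NULL, pass the stream to count_lines(), and then fclose() it.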
Although it is easy to work around, the memory leak you ran into appears to be a bug in the glibc implementation of getline().
From the POSIX specification of getline():
"If *lineptr is a null pointer or if the object pointed to by *lineptr is of insufficient size, an object shall be allocated as if by malloc() or the object shall be reallocated as if by realloc(), respectively, such that the object is large enough to hold the characters to be written to it..."
and from the Linux man page:
"Alternatively, before calling getline(), *lineptr can contain a pointer to a malloc(3)-allocated buffer *n bytes in size. If the buffer is not large enough to hold the line, getline() resizes it with realloc(3), updating *lineptr and *n as necessary."
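These suggest that in the case you ran into, where you pass a valid non-NULL pointer to memory and say its length is 0, the function should resize that memory with realloc(). However, the glibc implementation checks *lineptr == NULL || *n == 0 and, if that is true, overwrites *lineptr with a newly allocated buffer, causing the leak you saw. Compare the NetBSD implementation, which uses realloc() for all allocations (realloc(NULL, x) is equivalent to malloc(x)) and therefore would not leak with the original code. That is not ideal, because it causes a realloc() on every call rather than only when the buffer is too small to hold the current line (unlike the fixed version above), but it works.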
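The difference between the two strategies can be illustrated with a simplified model. This is only a sketch of the logic described above, not the actual source of glibc or NetBSD; the function names and MODEL_INITIAL_SIZE are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical starting capacity for this model; not taken from any library. */
#define MODEL_INITIAL_SIZE 16

/* Model of the check described above: a size of 0 is treated like "no buffer",
 * so a block already stored in *lineptr is overwritten without being freed. */
int ensure_capacity_overwrite(char **lineptr, size_t *n, size_t needed)
{
    assert(lineptr != NULL && n != NULL);
    if (*lineptr == NULL || *n == 0) {
        *n = MODEL_INITIAL_SIZE;
        *lineptr = malloc(*n);  /* the old *lineptr, if any, is lost here */
        if (*lineptr == NULL)
            return -1;
    }
    if (*n < needed) {
        char *grown = realloc(*lineptr, needed);
        if (grown == NULL)
            return -1;
        *lineptr = grown;
        *n = needed;
    }
    return 0;
}

/* Model of the realloc-only strategy: realloc(NULL, x) behaves like malloc(x),
 * and an existing block is resized rather than replaced, so nothing is lost. */
int ensure_capacity_realloc(char **lineptr, size_t *n, size_t needed)
{
    assert(lineptr != NULL && n != NULL);
    if (needed == 0)
        needed = 1;             /* avoid the implementation-defined realloc(p, 0) */
    if (*lineptr == NULL || *n < needed) {
        char *grown = realloc(*lineptr, needed);
        if (grown == NULL)
            return -1;
        *lineptr = grown;
        *n = needed;
    }
    return 0;
}
```

With the first function, a caller that passes a live buffer together with a size of 0 has to free the old block itself, which is exactly what the free() inside the loop did in the question.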