How can I understand a memory leak from valgrind output?

Question

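I'm taking the CS50 class and am in week 5, trying to figure out Pset5 Speller. For those unfamiliar, the goal is to edit one specific .c file so that five functions work correctly, allowing the main function (located in a separate file) to do the following: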

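LOAD - load the dictionary into a hash table
HASH - run a word through this function to help load it into the dictionary or to search for it later
SIZE - report how many words are in the dictionary
CHECK - check whether a word from the text is in the dictionary
UNLOAD - free the dictionary to prevent memory leaks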

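Note that the file was provided to me by the class, and I only fill in the space inside the functions. The only other thing I can change is const unsigned int N = 1000; I set it to 1000 as an arbitrary number, but it could be anything.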

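I'm only having trouble with one thing (as far as I can tell). I've done everything I can to get it working, but Check50 (the program that determines whether I did it correctly) tells me I have memory errors: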

Results for cs50/problems/2021/x/speller generated by check50 v3.3.0
:) dictionary.c exists
:) speller compiles
:) handles most basic words properly
:) handles min length (1-char) words
:) handles max length (45-char) words
:) handles words with apostrophes properly
:) spell-checking is case-insensitive
:) handles substrings properly
:( program is free of memory errors
    valgrind tests failed; see log for more information.

When I run valgrind, this is what it gives me:

==347== 
==347== HEAP SUMMARY:
==347==     in use at exit: 472 bytes in 1 blocks
==347==   total heap usage: 143,096 allocs, 143,095 frees, 8,023,256 bytes allocated
==347== 
==347== 472 bytes in 1 blocks are still reachable in loss record 1 of 1
==347==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==347==    by 0x4A29AAD: __fopen_internal (iofopen.c:65)
==347==    by 0x4A29AAD: fopen@@GLIBC_2.2.5 (iofopen.c:86)
==347==    by 0x401B6E: load (dictionary.c:83)
==347==    by 0x4012CE: main (speller.c:40)
==347== 
==347== LEAK SUMMARY:
==347==    definitely lost: 0 bytes in 0 blocks
==347==    indirectly lost: 0 bytes in 0 blocks
==347==      possibly lost: 0 bytes in 0 blocks
==347==    still reachable: 472 bytes in 1 blocks
==347==         suppressed: 0 bytes in 0 blocks
==347== 
==347== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

This seems cryptic to me, and I'm hoping someone can explain it and help me fix my problem (Help50 had no suggestions).

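Here is my actual code (keep in mind that a second file, containing the main function, is what actually calls these functions, so it doesn't matter if, for example, the functions appear to be out of order).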

// Implements a dictionary's functionality

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#include "dictionary.h"

// Represents a node in a hash table
typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
}
node;

// Number of buckets in hash table
const unsigned int N = 1000;

// Hash table
node *table[N];

// Dictionary size
int dictionary_size = 0;

// Returns true if word is in dictionary, else false
bool check(const char *word)
{
    // TODO #4!
    
    // make lowercase copy of word
    char copy[strlen(word) + 1];
    for (int i = 0; word[i]; i++)
    {
        copy[i] = tolower(word[i]);
    }
    copy[strlen(word)] = '\0';
    
    // get hash value
    int h = hash(copy);

    // use hash value to see if word is in bucket
    if (table[h] != NULL)
    {
        node *temp = table[h];
        
        while (temp != NULL)
        {
            if (strcmp(temp->word, copy) == 0)
            {
                return true;
            }
            
            temp = temp->next;
        }
    }
    
    return false;
}

// Hashes word to a number
unsigned int hash(const char *word)
{
    // TODO #2
    // source: https://www.reddit.com/r/cs50/comments/1x6vc8/pset6_trie_vs_hashtable/cf9189q/
    // I used this source because I had trouble understanding different variations - this one explained everything well.
    // I modified it slightly to fit my needs
    unsigned int h = 0;
    for (int i = 0; i < strlen(word); i++)
    {
        h = (h << 2) ^ word[i];
    }
    return h % N;
}

// Loads dictionary into memory, returning true if successful, else false
bool load(const char *dictionary)
{
    // TODO #1!
    // open dictionary file
    FILE *file = fopen(dictionary, "r");
    if (file == NULL)
    {
        return false;
    }
    
    // read strings from file one at a time
    char word[LENGTH + 1];
    while (fscanf(file, "%s", word) != EOF)
    {
        node *n = malloc(sizeof(node));
        if (n == NULL)
        {
            return false;
        }
        
        // place word into node
        strcpy(n->word, word);
        
        // use hash function to take string and return an index
        int h = hash(word);

        // make the current node point to the bucket we want
        n->next = table[h];
        
        // make the bucket start now with the current node
        table[h] = n;
        
        //count number of words loaded
        dictionary_size++;
    }

    return true;
}

// Returns number of words in dictionary if loaded, else 0 if not yet loaded
unsigned int size(void)
{
    // TODO #3!
    return dictionary_size;
}

// Unloads dictionary from memory, returning true if successful, else false
bool unload(void)
{
    // TODO #5!
    for (int i = 0; i < N; i++)
    {
        while (table[i] != NULL)
        {
            node *temp = table[i]->next;
            free(table[i]);
            table[i] = temp;
        }
    }
    return true;
}

Answer 1

Just as we must free every pointer we malloc, we must fclose every FILE* we fopen.

Your problem stems from this line:

FILE *file = fopen(dictionary, "r");

There is no corresponding fclose(file) call. Add one at the end of the load function, just before the return.
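A minimal sketch of that change, assuming the same node layout and hash function as in the question. LENGTH is assumed to be 45 here (the maximum word length named in the check50 output) and N is written as a macro so the file compiles on its own; in the real pset, LENGTH comes from dictionary.h. The "%45s" width limit and the fclose on the malloc failure path are additions that are not in the original code.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LENGTH 45   // assumption: normally defined in dictionary.h
#define N 1000      // bucket count, same value as in the question

typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
}
node;

node *table[N];
unsigned int dictionary_size = 0;

// Same hash as in the question
unsigned int hash(const char *word)
{
    unsigned int h = 0;
    for (size_t i = 0; word[i] != '\0'; i++)
    {
        h = (h << 2) ^ word[i];
    }
    return h % N;
}

bool load(const char *dictionary)
{
    FILE *file = fopen(dictionary, "r");
    if (file == NULL)
    {
        return false;
    }

    char word[LENGTH + 1];
    // Width limit added so an overlong token cannot overflow word
    while (fscanf(file, "%45s", word) == 1)
    {
        node *n = malloc(sizeof(node));
        if (n == NULL)
        {
            fclose(file);   // addition: close on the error path as well
            return false;
        }

        strcpy(n->word, word);
        unsigned int h = hash(word);
        n->next = table[h];
        table[h] = n;
        dictionary_size++;
    }

    fclose(file);           // pairs with the fopen at the top of load
    return true;
}
```

With the stream closed on every path that opened it, the buffer allocated inside fopen should no longer be outstanding when the program exits.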

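Valgrind can provide very useful debugging information (especially when your code is compiled with -g so that it includes debug info), such as this excerpt from your question: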

==347== 472 bytes in 1 blocks are still reachable in loss record 1 of 1
==347==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==347==    by 0x4A29AAD: __fopen_internal (iofopen.c:65)
==347==    by 0x4A29AAD: fopen@@GLIBC_2.2.5 (iofopen.c:86)
==347==    by 0x401B6E: load (dictionary.c:83)
==347==    by 0x4012CE: main (speller.c:40)

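Valgrind gives you the stack trace of the allocation that was never released. Reading the trace from the bottom up, main called load, and the last frame in your own code is dictionary.c:83, which is the line that calls fopen. The heap summary agrees: 143,096 allocs against 143,095 frees leaves exactly one block outstanding.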
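To see the same kind of report in isolation, here is a small sketch with two hypothetical helpers (probe_leaky and probe_clean are names made up for this illustration). The first repeats the pattern from the question, the second pairs the fopen with an fclose.

```c
#include <stdbool.h>
#include <stdio.h>

// Hypothetical helper: opens a file and never closes it.
// Under valgrind, the block allocated inside fopen would be
// reported with a trace that passes through this function.
bool probe_leaky(const char *path)
{
    FILE *file = fopen(path, "r");
    if (file == NULL)
    {
        return false;
    }
    return true;    // missing fclose(file)
}

// Hypothetical helper: same check, but the stream is closed.
bool probe_clean(const char *path)
{
    FILE *file = fopen(path, "r");
    if (file == NULL)
    {
        return false;
    }
    fclose(file);
    return true;
}
```

Calling probe_leaky from a main function, compiling with -g, and running the program under valgrind --leak-check=full --show-leak-kinds=all should produce a record whose trace ends in probe_leaky and fopen, while probe_clean should leave nothing behind. The block is most likely classified as "still reachable" rather than "definitely lost" because the C library itself keeps track of open streams, so a pointer to the block still exists at exit.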
