Can someone help me find the segfault here?

Question

EDIT: So, it turns out 'index' wasn't being reset to 0. Fixing that got rid of one segfault, but I'm still hitting a different one. Working on it.

node* new_node(void){
    node* ptr = malloc(sizeof(node));
    for (int i = 0; i<27; i++) {
        ptr->next[i] = NULL;
    }
    return ptr;
}
bool load(const char* dictionary)
{
    FILE* dict = fopen(dictionary, "r");
    node* ptr = new_node;
    char word[LENGTH+1];
    int index = 0;
    for (int c = fgetc(dict); c!=EOF; c = fgetc(dict)){
        if(c!='\n'){
            word[index]=c;
            index++;
        }
        else {
            for(int x=0; x<=index; x++){
                int ch = (word[x] == '\'') ? 26 : tolower(word[x])-'a';
                if (ptr->next[ch] == NULL){
                    ptr->next[ch] = new_node;
                }
                ptr = ptr->next[ch];
            }
            ptr->end=true;
        }
    }
    return true;
}

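I'm trying to implement a trie data structure for a dictionary, but my program seems to segfault somewhere in this function. Even with the help of GDB I can't pin it down, so can someone give me a hand?

The node is defined as follows: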

typedef struct node{
    bool end;
    struct node* next[27];
} node;

Dictionary file:

a
aaa
aaas
aachen
aalborg
aalesund
aardvark
aardvark's
aardvarks
aardwolf

(...)

Answer 1

The lines:

node* ptr = new_node;

and

ptr->next[ch] = new_node;

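do not call the function; they assign the function's address to the pointer. Call the function instead: new_node().

With compiler warnings enabled (-Wall and -Wextra), this problem could have been avoided.

There is no bounds checking on the array word. Use the value LENGTH to check that the index is in range before using it.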
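To make the difference concrete, here is a minimal sketch of the corrected call site. It reuses the node type from the question; make_root is a hypothetical helper added only for illustration, and new_node additionally sets end and checks the malloc result, which the question's version does not do.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct node {
    bool end;
    struct node *next[27];
} node;

/* Allocate one trie node with every child pointer set to NULL. */
node *new_node(void)
{
    node *ptr = malloc(sizeof(node));
    if (ptr == NULL) {
        return NULL;                /* allocation failed */
    }
    ptr->end = false;
    for (int i = 0; i < 27; i++) {
        ptr->next[i] = NULL;
    }
    return ptr;
}

/* Hypothetical helper: shows the corrected call site. */
node *make_root(void)
{
    /* node *root = new_node;    wrong: stores the function's address */
    node *root = new_node();     /* right: calls the function */
    return root;
}
```

Without the parentheses, the pointer refers to the function's machine code rather than to a node, so the later ptr->next[ch] accesses read and write memory that was never a node.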

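It is not clear what the if statement inside the for loop is meant to do. It appears that every time a newline is found, the whole array word is added to the trie, but index is never reset, so the same array is added multiple times. At some point index will point out of bounds, causing undefined behavior. index should be reset after the array word has been used.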
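One way to get both the bounds check and the reset is to read one word at a time into a caller-supplied buffer. This is only a sketch: read_word and its max_len parameter are hypothetical names, not part of the question's code, and characters beyond the limit are simply dropped.

```c
#include <stdio.h>

/* Hypothetical helper: reads one newline-terminated word into `word`,
 * which must have room for max_len + 1 chars. Returns the number of
 * characters stored, or -1 at end of input when nothing was read. */
int read_word(FILE *in, char *word, int max_len)
{
    int index = 0;                  /* starts at 0 for every word */
    int c;

    while ((c = fgetc(in)) != EOF && c != '\n') {
        if (index < max_len) {      /* bounds check before each write */
            word[index++] = (char)c;
        }
        /* extra characters are dropped in this sketch */
    }
    if (c == EOF && index == 0) {
        return -1;
    }
    word[index] = '\0';             /* index <= max_len, so this is in range */
    return index;
}
```

A loader could call read_word(dict, word, LENGTH) in a loop and insert each word until it returns -1. Because index is local to the call, it cannot carry over from one word to the next.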

Answer 2

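You forgot to reset index to 0 at the start of the loop.

You should also use calloc(1, sizeof(node)) instead of malloc(sizeof(node)) to avoid leaving memory uninitialized. I suggest using valgrind to help track down problems like this in your code.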
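As a rough sketch, a calloc-based new_node could look like this, using the node type from the question. One caveat: calloc sets all bytes to zero, and the C standard does not strictly guarantee that an all-zero-bytes pointer is a null pointer, although that holds on mainstream platforms.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct node {
    bool end;
    struct node *next[27];
} node;

/* Sketch: calloc-based replacement for the question's new_node(). */
node *new_node(void)
{
    /* calloc zero-fills the block, so end is false and, on mainstream
     * platforms, every next[i] is a null pointer. */
    node *ptr = calloc(1, sizeof *ptr);
    return ptr;     /* NULL if the allocation failed; callers should check */
}
```

This also initializes the end flag, which the malloc version in the question leaves indeterminate.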

Answer 3

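There are several problems in your code:

When you allocate memory with malloc, it is uninitialized. Initialize it right after allocating it, so that NULL pointers really are null. (calloc, a cousin of malloc, initializes all memory to zero.)
When you loop over the word, you should not include index: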
```
for (int x = 0; x < index; x++) ...
```
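When you reach the end of a word, you must reset index to 0. Otherwise, you will append to the old word and overflow the buffer. (You should probably also enforce an upper bound on index.)
Likewise, when you insert a word into the trie, you must reset the pointer you use for traversal to the root of the trie. You need two pointers here: a root node pointer and an auxiliary pointer that traverses the trie.
As it stands, your trie is local to your function. Return the root node so that other functions can use the trie, or NULL on failure.

Fix these, and you will have a function that doesn't crash. (It will still leak memory and may not construct the trie properly.)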

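    node *load(const char *dictionary)
    {
        FILE *dict = fopen(dictionary, "r");
        if (dict == NULL) {
            return NULL;                    /* could not open the dictionary */
        }

        node *head = calloc(1, sizeof(node));
        if (head == NULL) {
            fclose(dict);
            return NULL;
        }

        char word[LENGTH + 1];
        int index = 0;

        for (int c = fgetc(dict); c != EOF; c = fgetc(dict)) {
            if (c != '\n') {
                if (index < LENGTH) {       /* enforce the upper bound on index */
                    word[index] = c;
                    index++;
                }
            } else {
                node *ptr = head;           /* restart at the root for each word */

                for (int x = 0; x < index; x++) {
                    int ch = (word[x] == '\'') ? 26 : tolower(word[x]) - 'a';
                    if (ptr->next[ch] == NULL) {
                        ptr->next[ch] = calloc(1, sizeof(node));
                    }
                    ptr = ptr->next[ch];
                }
                ptr->end = true;
                index = 0;                  /* start the next word from scratch */
            }
        }

        fclose(dict);
        return head;
    }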
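Since load hands the root back to the caller, something eventually has to free the trie. A possible sketch of that cleanup follows; unload is a hypothetical name, and returning the number of freed nodes is just a convenience for checking the result, not something the answer prescribes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct node {
    bool end;
    struct node *next[27];
} node;

/* Hypothetical helper: frees every node reachable from head and
 * returns how many nodes were released. */
size_t unload(node *head)
{
    if (head == NULL) {
        return 0;
    }
    size_t freed = 1;                       /* counts head itself */
    for (int i = 0; i < 27; i++) {
        freed += unload(head->next[i]);     /* free each subtree first */
    }
    free(head);
    return freed;
}
```

The recursion depth equals the length of the longest stored word, so for a dictionary with a bounded word length the stack use stays small.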

Answer 4

You should filter out punctuation and unsupported characters more thoroughly. Any character outside [a-zA-Z\n'] can crash your program because of

int ch = (word[x] == '\'') ? 26 : tolower(word[x])-'a';
if (ptr->next[ch] == NULL){

Suppose you open a file that has a space or some other unexpected character somewhere. You need something like

    if(c!='\n'){
        int num = (c == '\'') ? 26 : tolower(c)-'a';
        if(num >=0 && num < 27)
        {
           word[index]=c;
           index++;
        }
    }
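The same filtering could be pulled into a small mapping function, sketched below. char_index is a hypothetical name, and the code assumes an ASCII-like encoding where the letters are contiguous, as the question's tolower(c) - 'a' already does. Testing the letter ranges explicitly also avoids one corner case of a pure range check on the result: in ASCII, '{' - 'a' is 26, so '{' would otherwise land in the apostrophe slot.

```c
/* Hypothetical helper: returns 0-25 for a letter (case-insensitive),
 * 26 for an apostrophe, and -1 for any other character.
 * Assumes contiguous letter codes, as in ASCII. */
int char_index(int c)
{
    if (c >= 'a' && c <= 'z') {
        return c - 'a';
    }
    if (c >= 'A' && c <= 'Z') {
        return c - 'A';
    }
    if (c == '\'') {
        return 26;
    }
    return -1;      /* space, digits, punctuation, EOF, ... */
}
```

A reader loop can then store a character only when char_index(c) is non-negative, and the insertion loop can use the same function to pick the child slot.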

c

pointers

trie

segmentation-fault