在循环中初始化一个唯一指针（用于散列table）

Question

在过去的几年里，我一直在 Python 使用字典进行数字运算。几年后我正在重述 C，我正在尝试使用散列 table 来模仿 python 字典功能。我的散列 table 工作正常，但我正在努力在循环中以编程方式添加新条目。这是我的代码：

#define TABLE_SIZE 16

typedef struct item{
    char key[64];
    int value;
    int list[5];
    struct item *next;
} item;

// Create hash table array
item *hash_table[TABLE_SIZE];

void init_hash_table(){
    for (int i = 0; i < TABLE_SIZE; i++)
        hash_table[i] = NULL;
}

int hash(char *key){
    unsigned long H = 5381;
    int c;
    while ((c = *key++))
        H = ((H << 5) + H) + c;
    H = H % TABLE_SIZE;
    int hash = (int)H;
    return hash;
}

int insert_item(item *pItem){
    if (pItem == NULL){
        printf("Invalid item");
        return -1;
    }
    else {
        int index = hash(pItem->key);
        pItem->next = hash_table[index];
        hash_table[index] = pItem;
        return index;
    }
}

void print_table(){
    printf("\n--- Table Start ---\n");
    for (int i = 0; i < TABLE_SIZE; i++){
        if (hash_table[i] == NULL)
            printf("\t %d \t --- \t\n", i);
        else {
            item *tmp = hash_table[i];
            printf("\t %d ", i);
            while(tmp != NULL){
                printf("\t --- \t %s", tmp->key);
                tmp = tmp->next;
            }
            printf("\n");
        }
    }
    printf("--- Table End ---");
}

int main(){

    // Initialise hash table
    init_hash_table();
    char letters[] = "ABCDEFGHIJ";

    // Insert data
    for (int i = 0; i < 10; i++){
        item itemN = {.key = letters[i]};
        printf("%s ", itemN.key);
        printf("%p\n", &itemN);
        insert_item(&itemN);
    }

    // Print the table
    print_table();

    return 0;
}

基本上我想将 10 个条目添加到散列 table，每个条目都是唯一的，键是不同的字母（A 到 J）。问题是 main 中的每个 itemN 在内存中都有相同的地址，因此每次都会覆盖相同的指针。因此我的输出看起来像：

A 0061FEB0
B 0061FEB0
C 0061FEB0
D 0061FEB0
E 0061FEB0
F 0061FEB0
G 0061FEB0
H 0061FEB0
I 0061FEB0
J 0061FEB0

--- Table Start ---
         0       ---
         1       ---
         2       ---
         3       ---     J
         4       ---
         5       ---
         6       ---     J
         7       ---     J
         8       ---     J
         9       ---     J
         10      ---     J
         11      ---     J
         12      ---     J
         13      ---     J
         14      ---     J
         15      ---
--- Table End ---

问题是，如何在循环中生成一个具有唯一地址的新项目变量？这似乎是实现起来非常标准的功能，因为我们不能总是手动初始化变量。对不起，如果我遗漏了一些明显的东西！

谢谢

Answer 1

代码不断插入相同的项目。 &itemN 是本地对象的地址 itemsN。该项目超出范围，在循环 { ...}

后不再有效

for (int i = 0; i < 10; i++){
  item itemN = {.key = letters[i]};
  ...
  insert_item(&itemN);
}
print_table();  // no longer valid for code to use `&itemN`.

.key = letters[i] 也是无效的，因为它试图将 char 分配给 char *。这意味着警告未完全启用。节省时间，全部启用。

也许插入 10 个不同的 item。

item items[10] = { 0 };
for (int i = 0; i < 10; i++){
    items[i].key[0] = letters[i];
    printf("%s ", items[i].key);
    printf("%p\n", &item[i]);
    insert_item(&items[i]);
}

GTG

Answer 2

基本问题是你的散列 table 实际上并不存储项目，它存储指针，而你的 insert_item 函数接受一个指针并将该指针直接插入散列 table，这意味着它只有在该对象还活着时才有效。当对象的生命周期结束时，指针变得悬空并且您的散列 table 变得无效，因此之后的几乎所有行为都是未定义的行为。

无论何时在 C 中处理指针（几乎所有时间），您都需要了解所指向对象的生命周期，并确保指针在生命周期结束后不会被使用.在您的情况下，您将 item 定义为 main 循环内的本地（自动），因此生命周期将在循环迭代结束后立即结束。因此，当您将第二个项目放入散列 table 时，您在 table 中有一个悬空指针，并且所有赌注都已关闭。

通常的处理方法是让您的散列 table“拥有”table 中的对象，并让您的 insert_item 复制项目（使用 malloc 为副本分配一些内存）：

int insert_item(item *pItem){
    if (pItem == NULL){
        printf("Invalid item");
        return -1;
    }
    else {
        int index = hash(pItem->key);
        item *copy = malloc(sizeof(item));
        if (copy == NULL) {
            printf("Ran out of memory\n");
            return -1; }
        *copy = *pItem;
        copy->next = hash_table[index];
        hash_table[index] = copy;
        return index;
    }
}

这样，实际上在散列 table 中的所有项目对象都将具有延长的生命周期——它们将一直存在，直到您明确释放它们。当你完成哈希 table 本身时，你想这样做，所以你可能有一个 destroy_table 函数来做到这一点。

在循环中初始化一个唯一指针（用于散列table）

Initialising a unique pointer in a loop (to be used in a hash table)

c

pointers

hashtable

hashmap