哈希 tables，每个哈希 table 单元格上的相同记录

Question

文本文件中有词典。我将散列所有这些词。我写了一些代码但有问题。每个散列 table 记录上出现最后一个词。

main()
{
   FILE *fp;
   char word[100];
   char *hash[569];
   int i;
   for(i=0;i<569;i++)
      hash[i]="NULL";


   int m=569;
   int z =569;
   int mm=568;
   char w;

   int key;
   int j;
   int hash1;
   int hash2;
   fp=fopen("smallDictionary.txt","r");
   int counter =0;
   while(fscanf(fp,"%s",word)!=EOF)
   {
    j=0;
    counter++;

      for(i=key=0;i<strlen(word);i++)
         key+=(word[i]-'a')*26*i;
      hash1=key%m;
      hash2=1+(key%mm);
      while(hash[(hash1+j*hash2)%m]!="NULL")
      {
             j++;          
      }
      hash[(hash1+j*hash2)%m] = word;
   }

  for(i=0;i<569;i++)
      printf("%s ",hash[i]);

   fclose(fp);
}

在控制台上结束结果。

可以看出，作为字典的最后一个词，"your"关键字重复了。

词典内容：

a about above absolutely acceptable add adjacent after algorithm all along also an analyses and any anyone are argument as assignment assume at available be been below bird body but by can cannot capitalization case center chain chaining changing characters check checker checkers checking choose class code collision collisions command compilation compile compiled complexity conform consist containing contains convenience convert correcting corrections create created cross deal debugging december decide deducted deleted departmental dictionary discover discussed divides document documentation due dynamically e each easiest encountered enough error errors etc even exactly example executable expand experience factor fair fall file files find first following font for forth found four friday from function functions g gain general generate generated generating geoff get gird given good graders grows guide guidelines had hair handle has hash hashing have head header help helped hold homework hour how i if in include including incorrect information input insert inserted insertions instructions into is ispell it keeping known kuenning last length less letter letters like line linear list load long longer longest look looked low lower maintained many match may messages method midnight might misspelled mistake mode more most must my name named names necessary need never next no note number of on once one only options or original other otherwise output overview page pair pedantic points policies possibility possible prefer primary probing problem produce produces professor program purpose quack quadratic quick read reason reasonably refer reference rehashing reinitialized report resubmit rid same save separate separated seriously should similar simple simplify single size so something specifications specified spell spelling standard statistics string strong submission submit submitted successfully suggest suggested support suppose table tech th than that the them then there these this those three through thursday times title to together tooth track traditionally transposed trial try turing udder under understand unlike up use used useful using usual variant variants version wall ward warning was way we when whenever which while whitespace who why wild will wind wire wiry with word words works write written wrong you your

Answer 1

哈希表中的所有指针都指向同一个变量：word 数组。无论您输入 word 的最后一个字符串是什么，都将是输出中打印的字符串。

您需要为每个单词分配内存，方法是制作 hash 一个数组数组并将单词复制到辅助数组中，或者使用例如strdup 动态分配字符串。

更形象地说，它可能是这样的：

+---------+
| hash[x] | -\       +------+
+---------+   >--- > | word |
| hash[y] | -/       +------+
+---------+     +--------+
| hash[z] | --> | "NULL" |
+---------+     +--------+

这里条目 x 和 y 都指向 word，而条目 z 指向字符串文字 "NULL" 您将所有条目初始化为.

Answer 2

hash[(hash1+j*hash2)%m] = word;

这只是将 word 的地址分配给哈希条目。这总是一样的。在循环结束时，word 的内容显然将成为 fscanf 读取的最后一个内容。所以你需要这样的东西：

hash[(hash1+j*hash2)%m] = strdup(word);

strdup 分配堆内存，因此当不再需要散列 table 时不要忘记释放它。

哈希 tables，每个哈希 table 单元格上的相同记录

Hash tables, same record on each hash table cell

c

hashtable