Hashtable duplicate values in serialization

Question

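I have an abstract Database class that implements Serializable and has two methods, read and write, for loading from and saving to a .ser file. One example of a Database subclass is CredentialsUsers, which extends Database and holds a HashMap<Credentials, User>. When I open my application I load the data with read, and when I'm done I save it with write. One of the user types my application supports is an Administrator, who has special privileges. Before the application runs, I make sure an admin is initialized with initAdmin() below. Admins can check statistics such as the number of users. If I rm *.ser my files and run the application, everything goes well, but every run after that adds a duplicate admin to the HashMap. I have spent two hours on this and added special checks that make no sense (only adding a key/value pair to the map if it is not already present, and so on), and I still cannot find out why the database allows duplicate values. Any ideas? Here is some of the reading/writing code (methods of the abstract Database class):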

public Object read() {
        Object obj = null;
        try {
            File temp = new File(this.filename);
            temp.createNewFile(); // create file if not present
            FileInputStream fileIn = new FileInputStream(this.filename);
            ObjectInputStream objectIn = new ObjectInputStream(fileIn);
            obj = objectIn.readObject();
            objectIn.close();
        } catch (IOException | ClassNotFoundException e) {
            e.printStackTrace();
        }
        return obj;
    }
public void write() {
        try {
            File temp = new File(this.filename);
            temp.createNewFile(); // create file if not present
            FileOutputStream fileOut = new FileOutputStream(this.filename);
            ObjectOutputStream objectOut = new ObjectOutputStream(fileOut);
            objectOut.writeObject(this);
            objectOut.close();
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }

Here is the admin initialization:

private void initAdmin() throws NoSuchAlgorithmException {
        Admin admin = new Admin(new Credentials("Edward", "password"),
                "Edward", 25, "fakemail@gmail.com", Gender.MALE, "fakephone");
        if(credentialsUserDatabase.selectUser(admin.getCredentials())==null)//entry not found
            credentialsUserDatabase.insertUser(admin.getCredentials(), admin);
    }

And finally the implementation:

public class CredentialsUser extends Database {
    @Serial
    private static final long serialVersionUID = 0;
    private HashMap<String, User> users; //Mapping of hash(Credentials) -> User

    public void insertUser(Credentials credentials, User user) throws NoSuchAlgorithmException {
        if (user == null || credentials == null)
            return;
        String hash = Encryption.SHA_512(credentials.toString()); //get credentials hash in a String format
        if (!users.containsKey(hash))
            users.put(hash, user);
    }

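You can see useless duplicate-key checks everywhere. It may also be that the way I read the database from the file has something to do with it. This last part is where the problem might be: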

private void loadData() {
    Object temp = credentialsUserDatabase.read();
    if (temp != null) //EOF returns null
        credentialsUserDatabase = (CredentialsUser) temp;
}

The order of the method calls is loadData() -> initAdmin() -> application runs -> write().

Answer 1

You expect toString to return the same result for two Credentials instances created with the same arguments, but that is not the case.

String hash = Encryption.SHA_512(credentials.toString()); //get credentials hash in a String format
        if (!users.containsKey(hash))
            users.put(hash, user);

credentials.toString() will contain the hash code of the newly created new Credentials("Edward", "password") instance.

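By default, the hashCode method returns distinct integers for distinct objects, as far as is reasonably practical. This is typically implemented by converting the internal address of the object into an integer (see the documentation for Object.hashCode).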

By default, the toString method returns the name of the object's class plus its hash code (see the documentation for Object.toString):

getClass().getName() + '@' + Integer.toHexString(hashCode())
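As a quick check of that format, here is a minimal sketch. PlainCredentials and defaultFormat are hypothetical names used only for illustration; the asker's real Credentials class is not shown in the question, so this assumes it overrides neither toString nor hashCode.

```java
class DefaultToStringDemo {
    // Hypothetical stand-in for a class that overrides neither toString nor hashCode.
    static final class PlainCredentials {
        private final String username;
        private final String password;

        PlainCredentials(String username, String password) {
            this.username = username;
            this.password = password;
        }
    }

    // Rebuilds by hand the string that Object.toString() is documented to return.
    static String defaultFormat(Object o) {
        return o.getClass().getName() + '@' + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        PlainCredentials c = new PlainCredentials("Edward", "password");
        System.out.println(c.toString());      // class name, '@', hex hash code
        System.out.println(defaultFormat(c));  // the same string, built manually
        System.out.println(c.toString().equals(defaultFormat(c)));
    }
}
```

Note that neither the user name nor the password appears in the string, so hashing it says nothing about the credentials themselves.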

So when you compute the hash, it is different for every new instance.

You can verify this by creating several new Credentials instances with the same argument values:

Credentials instance1 = new Credentials("Edward", "password");
Credentials instance2 = new Credentials("Edward", "password");
System.out.println(instance1.toString());
System.out.println(instance2.toString());

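To overcome this, override the equals and hashCode methods in the Credentials class so that two instances with the same arguments return the same hash code. Because the default toString is built from hashCode(), their string forms, and therefore the SHA-512 keys computed from them, will then be equal and you will not get duplicates.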
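A minimal sketch of that fix follows. It assumes Credentials holds just a user name and a password; the field names and the demo class are hypothetical, since the real class is not shown in the question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class CredentialsEqualityDemo {
    // Hypothetical version of Credentials; field names are assumptions.
    static final class Credentials {
        private final String username;
        private final String password;

        Credentials(String username, String password) {
            this.username = username;
            this.password = password;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Credentials)) return false;
            Credentials that = (Credentials) other;
            return Objects.equals(username, that.username)
                    && Objects.equals(password, that.password);
        }

        @Override
        public int hashCode() {
            // Derived only from the field values, so equal credentials
            // produce the same hash code in every instance and every run.
            return Objects.hash(username, password);
        }
    }

    public static void main(String[] args) {
        Credentials first = new Credentials("Edward", "password");
        Credentials second = new Credentials("Edward", "password");

        System.out.println(first.equals(second));
        // The inherited toString() now agrees too, because it uses hashCode().
        System.out.println(first.toString().equals(second.toString()));

        Map<Credentials, String> users = new HashMap<>();
        users.put(first, "admin");
        users.putIfAbsent(second, "admin again"); // treated as the same key
        System.out.println(users.size());
    }
}
```

One caveat: a 32-bit hash code can collide for different credentials, so a key derived from the default toString is not guaranteed to be unique. Overriding toString to return the field values, or using Credentials itself as the map key, may be more robust.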

How to do this, and why it is needed, is covered in related answers on overriding equals and hashCode.

java

serialization

file

hashmap