In Cloudera Manager, how do I migrate data off a decommissioned DataNode?
I have excluded the DataNode host "dn001" via "dfs_hosts_exclude.txt", and the exclusion has taken effect. How do I migrate the data from "dn001" to the other DataNodes?
You should not need to do anything. HDFS will automatically re-replicate any blocks that were stored on the excluded DataNode.
From HDFS Architecture - Data Disk Failure, Heartbeats and Re-Replication:
Each DataNode sends a Heartbeat message to the NameNode periodically. A network partition can cause a subset of DataNodes to lose connectivity with the NameNode. The NameNode detects this condition by the absence of a Heartbeat message. The NameNode marks DataNodes without recent Heartbeats as dead and does not forward any new IO requests to them. Any data that was registered to a dead DataNode is not available to HDFS any more. DataNode death may cause the replication factor of some blocks to fall below their specified value. The NameNode constantly tracks which blocks need to be replicated and initiates replication whenever necessary. The necessity for re-replication may arise due to many reasons: a DataNode may become unavailable, a replica may become corrupted, a hard disk on a DataNode may fail, or the replication factor of a file may be increased.
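On a live cluster you can watch this re-replication happen from the command line. A minimal sketch, assuming shell access to a host with the HDFS client configured; note that if you edited the exclude file by hand rather than decommissioning the host through the Cloudera Manager UI, the NameNode must be told to re-read it:

```shell
# Ask the NameNode to re-read dfs_hosts_exclude.txt
# (Cloudera Manager normally triggers this itself when you
# decommission a host through its UI)
hdfs dfsadmin -refreshNodes

# List nodes currently in the "Decommission In Progress" state;
# dn001 should appear here until all of its blocks have been
# copied to other DataNodes
hdfs dfsadmin -report -decommissioning

# Once dn001 reports "Decommissioned", confirm no blocks are
# still below their target replication factor
hdfs fsck / | grep -i 'under-replicated'
```

When `hdfs dfsadmin -report` shows dn001 as Decommissioned and fsck reports zero under-replicated blocks, every block that lived on dn001 has a full set of replicas elsewhere and the host can be safely removed.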