Is it ideal for the Name Node to also be the Secondary Name Node?

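I am practicing with a Hadoop cluster built on Raspberry Pis. Following this tutorial, I noticed that the author put node1 in the Hadoop masters file. This is confusing, because he also uses the same node to start the Hadoop daemons. I would also like to know the reason for his configuration.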

P.S. Just Ctrl+F for "masters" in the tutorial.

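No, it is not ideal. How you configure the cluster is up to you. In this tutorial, the author decided to use node1 as both the Primary NameNode (P-NN) and the Secondary NameNode (S-NN). Keep in mind that an RPi Hadoop cluster is only suitable for development and testing, not for production.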
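As background for the masters file: in Hadoop 1.x, `conf/masters` does not select the NameNode. It lists the hosts on which `start-dfs.sh` launches the SecondaryNameNode, while the NameNode itself starts on the machine where the script is run. That is why node1 can appear in both roles. A minimal sketch of moving the S-NN to another machine, assuming a Hadoop 1.x layout and a second host called `node2` (a hypothetical name, not taken from the tutorial):

```
# conf/masters on node1 (node2 is a hypothetical hostname)
node2
```

On Hadoop 2.x the masters file is no longer used for this; the equivalent, again assuming the hypothetical `node2` and the 2.x default port, would be a property in `hdfs-site.xml`:

```xml
<!-- hdfs-site.xml: node2 is a hypothetical hostname; 50090 is the Hadoop 2.x default port -->
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>node2:50090</value>
</property>
```

Check which Hadoop version the tutorial installs before picking one of the two.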

Reasons to run the Primary NameNode and Secondary NameNode on separate machines (based on this article from Cloudera):

1. Scalability. Creating the system snapshot requires about as much memory as the NameNode itself occupies. Since the memory available to the NameNode process is a primary limit on the size of the distributed filesystem, a large-scale cluster will require most or all of the available memory for the NameNode.

2. Durability. When the SecondaryNameNode creates a checkpoint, it does so in a separate copy of the filesystem metadata. Moving this process to another machine also creates a copy of the metadata file on an independent machine, increasing its durability.