Is it ideal for the Name Node to also be the Secondary Name Node?

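I am practicing with a Hadoop cluster on Raspberry Pis, following this tutorial (http://www.widriksson.com/raspberry-pi-hadoop-cluster/). The author sets node1 in the Hadoop masters file, which is confusing because he also uses that node to start the Hadoop daemons. I would also like to know the reasoning behind his configuration.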

P.S. Just Ctrl+F "masters" in the tutorial.

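No, it is not ideal. How you configure the cluster is up to you. In this tutorial, the author decided to use node1 as both the primary NameNode (P-NN) and the Secondary NameNode (S-NN). Keep in mind that an RPi Hadoop cluster is only suitable for development and testing, not for production.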
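To make the role of the masters file more concrete, here is a rough Python sketch of how the HDFS start script chooses hosts. It assumes Hadoop 1.x-style scripts, where the NameNode starts on the machine you launch the script from, `conf/masters` lists the Secondary NameNode hosts, and `conf/slaves` lists the DataNode hosts. The function name `plan_hdfs_daemons` and the slave hostnames are made up for illustration, not taken from the tutorial.

```python
# Sketch only: models the host selection of Hadoop 1.x-style HDFS start scripts
# (assumed behavior; plan_hdfs_daemons is a hypothetical helper, not a Hadoop API).

def plan_hdfs_daemons(local_host, masters, slaves):
    """Return {host: [daemon, ...]} for an HDFS start launched from local_host."""
    plan = {}
    # The NameNode runs on the machine where the start script is launched.
    plan.setdefault(local_host, []).append("NameNode")
    # Despite its name, conf/masters lists the Secondary NameNode hosts.
    for host in masters:
        plan.setdefault(host, []).append("SecondaryNameNode")
    # conf/slaves lists the DataNode hosts.
    for host in slaves:
        plan.setdefault(host, []).append("DataNode")
    return plan


# masters = node1, as in the question; the slave hostnames are hypothetical.
same_machine = plan_hdfs_daemons("node1", masters=["node1"], slaves=["node2", "node3"])
# Alternative: list a different host in masters to separate the two roles.
separate_machines = plan_hdfs_daemons("node1", masters=["node2"], slaves=["node2", "node3"])

print(same_machine)
print(separate_machines)
```

Under those assumptions, putting node1 in masters while also starting the daemons from node1 is exactly what places both roles on the same machine.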

Reasons to run the primary NameNode and the Secondary NameNode on separate machines (based on this article from Cloudera):

1. Scalability. Creating the system snapshot requires about as much memory as the NameNode itself occupies. Since the memory available to the NameNode process is a primary limit on the size of the distributed filesystem, a large-scale cluster will require most or all of the available memory for the NameNode.

2. Durability. When the SecondaryNameNode creates a checkpoint, it does so in a separate copy of the filesystem metadata. Moving this process to another machine also creates a copy of the metadata file on an independent machine, increasing its durability.
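The scalability point comes down to simple arithmetic, which can be sketched in Python. This assumes, as stated above, that a checkpoint needs roughly as much memory as the NameNode itself. The helper name `fits_on_one_machine` and all the numbers are hypothetical, chosen only to show the effect.

```python
# Sketch only: back-of-the-envelope memory check for co-locating both roles.
# fits_on_one_machine and every number below are illustrative assumptions.

def fits_on_one_machine(namenode_mb, machine_ram_mb, reserved_mb=0):
    """True if the NameNode and a checkpoint fit in one machine's RAM."""
    # Assumption from the list above: the checkpoint needs about as much
    # memory as the NameNode occupies.
    checkpoint_mb = namenode_mb
    needed_mb = namenode_mb + checkpoint_mb + reserved_mb
    return needed_mb <= machine_ram_mb


# Small test cluster: 100 MB of metadata on a 512 MB board fits (200 <= 512).
print(fits_on_one_machine(100, 512))
# Larger namespace: 300 MB of metadata no longer fits (600 > 512).
print(fits_on_one_machine(300, 512))
```

On a small test cluster the doubling is harmless, which is probably why sharing one node works in the tutorial; as the namespace grows, the same machine effectively needs twice the NameNode's memory.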