HDFS NameNode: edit files growing to a huge total size, and how to limit the size of edit files
We have an HDP cluster with 7 DataNode machines.
Under /hadoop/hdfs/namenode/current/ we can see more than 1500 edit files.
Each file is roughly 7M to 20M, as follows:
7.8M /hadoop/hdfs/namenode/current/edits_0000000002331008695-0000000002331071883
7.0M /hadoop/hdfs/namenode/current/edits_0000000002331071884-0000000002331128452
7.8M /hadoop/hdfs/namenode/current/edits_0000000002331128453-0000000002331189702
7.1M /hadoop/hdfs/namenode/current/edits_0000000002331189703-0000000002331246584
11M /hadoop/hdfs/namenode/current/edits_0000000002331246585-0000000002331323246
8.0M /hadoop/hdfs/namenode/current/edits_0000000002331323247-0000000002331385595
7.7M /hadoop/hdfs/namenode/current/edits_0000000002331385596-0000000002331445237
7.9M /hadoop/hdfs/namenode/current/edits_0000000002331445238-0000000002331506718
9.1M /hadoop/hdfs/namenode/current/edits_0000000002331506719-0000000002331573154
9.0M /hadoop/hdfs/namenode/current/edits_0000000002331573155-0000000002331638086
7.8M /hadoop/hdfs/namenode/current/edits_0000000002331638087-0000000002331697435
7.8M /hadoop/hdfs/namenode/current/edits_0000000002331697436-0000000002331755881
8.0M /hadoop/hdfs/namenode/current/edits_0000000002331755882-0000000002331814933
9.8M /hadoop/hdfs/namenode/current/edits_0000000002331814934-0000000002331884369
11M /hadoop/hdfs/namenode/current/edits_0000000002331884370-0000000002331955341
8.7M /hadoop/hdfs/namenode/current/edits_0000000002331955342-0000000002332019335
7.8M /hadoop/hdfs/namenode/current/edits_0000000002332019336-0000000002332074498
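Is it possible to minimize the size of these files (or the number of edit files) through some HDFS configuration?
We have small disks and the disk is now 100% full: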
/dev/sdb 100G 100G 0 100% /hadoop/hdfs
You can configure the dfs.namenode.num.checkpoints.retained and dfs.namenode.num.extra.edits.retained properties to control the size of the directory that holds the NameNode edits.
dfs.namenode.num.checkpoints.retained: The number of image checkpoint files that are retained in storage directories. All edit logs necessary to recover an up-to-date namespace from the oldest retained checkpoint are also retained.

dfs.namenode.num.extra.edits.retained: The number of extra transactions that should be retained beyond what is minimally necessary for a NameNode restart. This can be useful for audit purposes, or for an HA setup where a remote Standby Node may have been offline for some time and require a longer backlog of retained edits in order to start again.
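
Since dfs.namenode.num.extra.edits.retained is counted in transactions rather than in files or bytes, it can help to translate it into a rough number of edit segments. The following is only a sketch in Python, under these assumptions: the three file names are copied from the listing in the question, finalized segments are named edits_<first txid>-<last txid>, and the value 1000000 is an illustrative setting (check the value actually in effect in your own hdfs-site.xml). The helper names transactions_in and segments_needed are made up for this example.

```python
import re

# Segment names copied from the listing in the question.
SEGMENTS = [
    "edits_0000000002331008695-0000000002331071883",
    "edits_0000000002331071884-0000000002331128452",
    "edits_0000000002331128453-0000000002331189702",
]

# Finalized segments are named edits_<first txid>-<last txid>.
SEGMENT_RE = re.compile(r"^edits_(\d+)-(\d+)$")


def transactions_in(segment_name):
    """Return how many transactions one finalized segment covers."""
    match = SEGMENT_RE.match(segment_name)
    if match is None:
        raise ValueError(f"not a finalized edits segment: {segment_name!r}")
    first_txid, last_txid = (int(group) for group in match.groups())
    return last_txid - first_txid + 1


def segments_needed(extra_edits_retained, avg_transactions_per_segment):
    """Estimate the segments needed to hold the extra transactions (rounded up)."""
    return -(-extra_edits_retained // avg_transactions_per_segment)


counts = [transactions_in(name) for name in SEGMENTS]
average = sum(counts) // len(counts)

# Illustrative setting only; substitute the value configured on your cluster.
extra_edits_retained = 1_000_000

print("transactions per segment:", counts)
print("average per segment:", average)
print("approx. segments kept for extra edits:",
      segments_needed(extra_edits_retained, average))
```

If the estimate comes out far below the 1500+ files actually on disk, the extra-edits setting alone probably does not explain the usage, and the edits retained back to the oldest checkpoint (the first property above) are the more likely place to look.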