What is in the ZooKeeper dataDir and how do I clean it up?

I found that my ZooKeeper dataDir is huge. I would like to understand:

  1. What is in the dataDir?
  2. How do I clean it up? Is it cleaned up automatically after a certain time?

Thanks

According to the ZooKeeper Administrator's Guide:

The ZooKeeper Data Directory contains files which are a persistent copy of the znodes stored by a particular serving ensemble. These are the snapshot and transactional log files. As changes are made to the znodes these changes are appended to a transaction log. Occasionally, when a log grows large, a snapshot of the current state of all znodes will be written to the filesystem. This snapshot supersedes all previous logs.

In short, to answer your first question: the dataDir is where ZooKeeper stores its state, as snapshots and transaction logs.
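To see which kind of file is taking up the space, a minimal sketch along these lines may help. It assumes the common on-disk layout where the files sit in a `version-2` subdirectory of the dataDir and are named `snapshot.<zxid>` and `log.<zxid>`. The function name `summarize_datadir` and the example path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count snapshot and transaction log files in a ZooKeeper dataDir.
# Assumes the files live under <dataDir>/version-2; if dataLogDir is configured
# separately, the log.* files will be under that directory instead.
summarize_datadir() {
    dir="$1/version-2"
    if [ ! -d "$dir" ]; then
        echo "no version-2 directory under $1" >&2
        return 1
    fi
    # Count only regular files directly inside version-2.
    snapshots=$(find "$dir" -maxdepth 1 -type f -name 'snapshot.*' | wc -l | tr -d ' ')
    logs=$(find "$dir" -maxdepth 1 -type f -name 'log.*' | wc -l | tr -d ' ')
    echo "snapshots=$snapshots logs=$logs"
}
```

For example, `summarize_datadir /var/lib/zookeeper` prints the two counts, and `du -sh /var/lib/zookeeper/version-2` shows the total size; substitute your own dataDir for the path.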

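As for your second question, there is no automatic cleanup by default. (ZooKeeper 3.4.0 and later can purge old files automatically if autopurge.snapRetainCount and autopurge.purgeInterval are set in the configuration.) From the documentation: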

A ZooKeeper server will not remove old snapshots and log files, this is the responsibility of the operator. Every serving environment is different and therefore the requirements of managing these files may differ from install to install (backup for example).

The PurgeTxnLog utility implements a simple retention policy that administrators can use. The API docs contains details on calling conventions (arguments, etc...).

In the following example the last <count> snapshots and their corresponding logs are retained and the others are deleted. The value of <count> should typically be greater than 3 (although not required, this provides 3 backups in the unlikely event a recent log has become corrupted). This can be run as a cron job on the ZooKeeper server machines to clean up the logs daily.

java -cp zookeeper.jar:log4j.jar:conf org.apache.zookeeper.server.PurgeTxnLog <dataDir> <snapDir> -n <count>
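As a rough sketch of how the command above might be wrapped for a daily cron job: the function name `purge_zk_logs`, the `ZK_CLASSPATH` and `DRY_RUN` variables, and the default classpath are all assumptions for illustration. The check on `count` is this sketch's own choice, following the guide's advice to keep several backups.

```shell
#!/bin/sh
# Hypothetical wrapper around the PurgeTxnLog command shown above.
# ZK_CLASSPATH and DRY_RUN are illustrative variables, not ZooKeeper settings.
purge_zk_logs() {
    data_dir="$1"
    snap_dir="$2"
    count="${3:-3}"
    # This sketch refuses to keep fewer than 3 snapshots.
    if [ "$count" -lt 3 ]; then
        echo "count must be at least 3" >&2
        return 1
    fi
    # With DRY_RUN set to a non-empty value, print the command instead of running it.
    ${DRY_RUN:+echo} java -cp "${ZK_CLASSPATH:-zookeeper.jar:log4j.jar:conf}" \
        org.apache.zookeeper.server.PurgeTxnLog "$data_dir" "$snap_dir" -n "$count"
}
```

Running it once with `DRY_RUN=1` shows the exact command before anything is deleted. A script that calls this function could then be scheduled with a crontab entry such as `0 3 * * * /path/to/your-script.sh`, where the path is a placeholder.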

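If this is a development instance, I guess you could purge the folder almost completely (except for a few files such as myid, if it is there). For a production instance, however, you should follow the cleanup procedure shown above.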
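For the development case, a minimal sketch of such a wipe might look like the following. It assumes the server is stopped first and that losing all stored znodes is acceptable. The function name `reset_dev_datadir` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper for a development instance only: empty a dataDir but keep myid.
# Stop the ZooKeeper server before running this; all stored znodes are lost.
reset_dev_datadir() {
    data_dir="$1"
    if [ -z "$data_dir" ] || [ ! -d "$data_dir" ]; then
        echo "usage: reset_dev_datadir <dataDir>" >&2
        return 1
    fi
    # Delete every top-level entry except myid (this includes the version-2 directory).
    find "$data_dir" -mindepth 1 -maxdepth 1 ! -name myid -exec rm -rf {} +
}
```

Keeping myid matters in a multi-server ensemble, since it identifies the server to its peers.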