geth 不会在不正常的系统重启时将 trie 节点数据从内存保存到磁盘

geth does not persist trie node data from memory to disk on ungraceful system restart

问题:geth 1.8.22 在系统重启时从第一个区块而不是最后一个区块开始挖掘。

我们有什么

我们有 3 个同步的私有 geth 节点使用 PoA(clique)。

发生了什么

有一天(一周前)我们的托管服务提供商出现问题,因此我们不得不重新启动 3 个节点中的 2 个(每个节点都在单独的 VPS 上)。当前块是 4 000 000。当节点 1 和节点 2 重新启动时,它们从块 372 开始挖掘而不是最后一个 4 000 000。

为什么会这样(我的猜测)

Geth 1.8.22 将一些数据与 trie 节点数据保存在 RAM 而不是磁盘中。在优雅的节点关闭(例如从控制台)时,这个 trie 节点数据从 RAM 保存到硬盘驱动器。在强制系统关闭(例如从托管管理面板)时,trie 节点数据没有时间保存在硬盘上。我们的节点 运行ning 已经 6 个月没有任何重启,所以我认为这个 trie 节点数据一直保存在 RAM 中,并且在系统重启时消失了(尽管我们仍然有节点 3 正在运行并且运行宁).

日志

这是我尝试运行其中一个节点的备份版本时的日志:

vladimir@comp:~/Public/projects/ethereum/repro-geth-bug/geth-linux-amd64-1.8.22-7fa3509e$ ./geth --datadir ../opt/ethereum/data/ --networkid 1515 --unlock 0xd6ee38421e1713dd50e888c6d689b82953946bc3 --password ../opt/ethereum/unlock_password --port 30306 --mine
INFO [11-21|17:06:25.374] Maximum peer count                       ETH=25 LES=0 total=25
INFO [11-21|17:06:25.374] Starting peer-to-peer node               instance=Geth/v1.8.22-stable-7fa3509e/linux-amd64/go1.11.5
INFO [11-21|17:06:25.374] Allocated cache and file handles         database=/home/vladimir/Public/projects/ethereum/repro-geth-bug/opt/ethereum/data/geth/chaindata cache=512 handles=2048
INFO [11-21|17:06:26.550] Initialised chain configuration          config="{ChainID: 1515 Homestead: 1 DAO: <nil> DAOSupport: false EIP150: 2 EIP155: 3 EIP158: 3 Byzantium: 4 Constantinople: 5  ConstantinopleFix: <nil> Engine: clique}"
INFO [11-21|17:06:26.550] Initialising Ethereum protocol           versions="[63 62]" network=1515
WARN [11-21|17:06:26.579] Head state missing, repairing chain      number=4073749 hash=9bfb53…56d503
INFO [11-21|17:07:45.179] Rewound blockchain to past state         number=371     hash=102018…d91947
INFO [11-21|17:07:45.180] Loaded most recent local header          number=4073749 hash=9bfb53…56d503 td=8147499 age=2d5h43m
INFO [11-21|17:07:45.180] Loaded most recent local full block      number=371     hash=102018…d91947 td=743     age=7mo3w6d
INFO [11-21|17:07:45.180] Loaded most recent local fast block      number=4073749 hash=9bfb53…56d503 td=8147499 age=2d5h43m
INFO [11-21|17:07:45.180] Loaded local transaction journal         transactions=3 dropped=3
INFO [11-21|17:07:45.180] Regenerated local transaction journal    transactions=0 accounts=0
WARN [11-21|17:07:45.180] Blockchain not empty, fast sync disabled 
INFO [11-21|17:07:45.623] New local node record                    seq=6 id=e8c5a9e8848d4e30 ip=127.0.0.1 udp=30306 tcp=30306
INFO [11-21|17:07:45.623] Started P2P networking                   self=enode://9647000ba2579dd529574b49f472f029839a09257c1bc3ade5135cbbb5f3ceaf1237aff5b6b947d2fa4f218fa24858dc2767bd4b78e082b04c9d013c1482cfa6@127.0.0.1:30306
INFO [11-21|17:07:45.624] IPC endpoint opened                      url=/home/vladimir/Public/projects/ethereum/repro-geth-bug/opt/ethereum/data/geth.ipc
INFO [11-21|17:07:46.192] Unlocked account                         address=0xd6ee38421e1713dD50E888c6D689B82953946bC3
INFO [11-21|17:07:46.192] Transaction pool price threshold updated price=1000000000
INFO [11-21|17:07:46.192] Transaction pool price threshold updated price=1000000000
INFO [11-21|17:07:46.192] Etherbase automatically configured       address=0xd6ee38421e1713dD50E888c6D689B82953946bC3
INFO [11-21|17:07:46.192] Commit new mining work                   number=372     sealhash=685e15…2c52df uncles=0 txs=0 gas=0 fees=0 elapsed=75.951µs
INFO [11-21|17:07:46.192] Successfully sealed new block            number=372     sealhash=685e15…2c52df hash=0c60ef…f29e6b elapsed=385.27µs
INFO [11-21|17:07:46.192]  mined potential block                  number=372     hash=0c60ef…f29e6b
INFO [11-21|17:07:46.193] Commit new mining work                   number=373     sealhash=337ae5…2b4704 uncles=0 txs=0 gas=0 fees=0 elapsed=222.362µs
INFO [11-21|17:07:47.962] Mapped network port                      proto=tcp extport=30306 intport=30306 interface="UPNP IGDv1-IP1"
INFO [11-21|17:07:48.391] Mapped network port                      proto=udp extport=30306 intport=30306 interface="UPNP IGDv1-IP1"
INFO [11-21|17:07:49.625] New local node record                    seq=7 id=e8c5a9e8848d4e30 ip=128.71.103.50 udp=30306 tcp=30306
INFO [11-21|17:07:51.001] Successfully sealed new block            number=373     sealhash=337ae5…2b4704 hash=b67668…81f164 elapsed=4.807s
INFO [11-21|17:07:51.001]  mined potential block                  number=373     hash=b67668…81f164
INFO [11-21|17:07:51.002] Commit new mining work                   number=374     sealhash=c0e9f6…628d51 uncles=0 txs=0 gas=0 fees=0 elapsed=1.434ms
INFO [11-21|17:07:56.001] Successfully sealed new block            number=374     sealhash=c0e9f6…628d51 hash=77aae2…9c44e8 elapsed=4.998s
INFO [11-21|17:07:56.001]  mined potential block                  number=374     hash=77aae2…9c44e8
INFO [11-21|17:07:56.003] Commit new mining work                   number=375     sealhash=6f7db7…adca12 uncles=0 txs=0 gas=0 fees=0 elapsed=1.305ms
^CINFO [11-21|17:07:58.483] Got interrupt, shutting down... 
INFO [11-21|17:07:58.483] IPC endpoint closed                      url=/home/vladimir/Public/projects/ethereum/repro-geth-bug/opt/ethereum/data/geth.ipc
INFO [11-21|17:07:58.483] Writing cached state to disk             block=374 hash=77aae2…9c44e8 root=e16e04…e93be1
INFO [11-21|17:07:58.483] Persisted trie from memory database      nodes=0 size=0.00B time=7.185µs gcnodes=0 gcsize=0.00B gctime=0s livenodes=1 livesize=0.00B
INFO [11-21|17:07:58.483] Writing cached state to disk             block=373 hash=b67668…81f164 root=e16e04…e93be1
INFO [11-21|17:07:58.483] Persisted trie from memory database      nodes=0 size=0.00B time=2.571µs gcnodes=0 gcsize=0.00B gctime=0s livenodes=1 livesize=0.00B
INFO [11-21|17:07:58.484] Writing cached state to disk             block=247 hash=7b422a…5f9a62 root=e16e04…e93be1
INFO [11-21|17:07:58.484] Persisted trie from memory database      nodes=0 size=0.00B time=2.784µs gcnodes=0 gcsize=0.00B gctime=0s livenodes=1 livesize=0.00B
INFO [11-21|17:07:58.484] Blockchain manager stopped 
INFO [11-21|17:07:58.484] Stopping Ethereum protocol 
INFO [11-21|17:07:58.484] Ethereum protocol stopped 
INFO [11-21|17:07:58.484] Transaction pool stopped 
INFO [11-21|17:07:58.497] Database closed                          database=/home/vladimir/Public/projects/ethereum/repro-geth-bug/opt/ethereum/data/geth/chaindata

如何修复

首先想到的是每天通过 cron 重新启动 geth 节点(优雅地),以便节点将 trie 节点数据保存在磁盘上。

所以问题是:如何处理UN优雅的系统关闭,以便geth节点持久化数据并保持从最新区块开始挖掘重新开始?可能我遗漏了什么或者对 geth 的工作原理有错误的理解。

希望你明白我的意思,提前致谢

请查看完整答案:https://github.com/ethereum/go-ethereum/issues/20383#issuecomment-558107815

简而言之:

  • geth 在块处理 1 小时后保留数据
  • 如果您的网络超轻(即大部分是空块),块从内存刷新到硬盘需要很长时间
  • 目前无法在geth中配置持久化周期

解决方案:定期重启 geth,以便将数据从 RAM 保存到硬盘