ELK on Docker Swarm and GlusterFS crash
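I am trying to deploy the ELK stack on a Docker Swarm.
Everything works if I bind the Elasticsearch data directory to a Docker volume.
The problem starts when I bind the Elasticsearch data directory to a GlusterFS volume.
I use GlusterFS to sync data between all swarm nodes in the cluster.
I deploy ELK with the following configuration: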
elasticsearch:
  image: docker.elastic.co/elasticsearch/elasticsearch:6.2.3
  # container_name: elasticsearch
  environment:
    - "http.host=0.0.0.0"
    - "transport.host=127.0.0.1"
    - "ELASTIC_PASSWORD=changeme"
    - "TAKE_FILE_OWNERSHIP=1"
  ports: ['127.0.0.1:9200:9200']
  volumes:
    - /opt/dockershared/stack-elk/elk:/usr/share/elasticsearch/data
  networks: ['stack']
The directory "/opt/dockershared/" is a GlusterFS volume:
myhost:/gvol0 on /opt/dockershared type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072,_netdev)
The ELK stack starts without problems, but after 30 to 60 minutes shard allocation fails.
In the ELK logs I see the following exception:
[2018-04-13T08:58:16,749][WARN ][o.e.i.e.Engine ] [MPxFOvC] [metricbeat-6.2.3-2018.04.13][0] failed engine [refresh failed source[schedule]]
org.apache.lucene.index.CorruptIndexException: Problem reading index from store(MMapDirectory@/usr/share/elasticsearch/data/nodes/0/indices/fRcersH4RjecZ8AKb3WZTQ/0/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@73620ce7) (resource=store(MMapDirectory@/usr/share/elasticsearch/data/nodes/0/indices/fRcersH4RjecZ8AKb3WZTQ/0/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@73620ce7))
at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:140) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
......
Caused by: java.io.EOFException: read past EOF: MMapIndexInput(path="/usr/share/elasticsearch/data/nodes/0/indices/fRcersH4RjecZ8AKb3WZTQ/0/index/_47.cfe")
at org.apache.lucene.store.ByteBufferIndexInput.readByte(ByteBufferIndexInput.java:75) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
......
Suppressed: org.apache.lucene.index.CorruptIndexException: checksum status indeterminate: remaining=0, please run checkindex for more details (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/usr/share/elasticsearch/data/nodes/0/indices/fRcersH4RjecZ8AKb3WZTQ/0/index/_47.cfe")))
.....
What could be the problem?
What is the best way to share the Elasticsearch data directory between all swarm nodes?
Thanks
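I asked on the ELK forum, and this is the answer:
elk forum
Basically, ELK supports only local disk or block storage, so a GlusterFS mount is not a supported location for the Elasticsearch data directory.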
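For anyone who wants something concrete, here is a minimal sketch of how the service from the question might look when its data stays on a node's local disk. It is only an illustration under assumptions, not something taken from the forum answer: the volume name `esdata`, the hostname `es-node-1`, and the stack file version are hypothetical, and the placement constraint is just one way to keep the task on the node that holds the volume.

```yaml
version: "3.4"

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.2.3
    environment:
      - "http.host=0.0.0.0"
      - "transport.host=127.0.0.1"
      - "ELASTIC_PASSWORD=changeme"
      - "TAKE_FILE_OWNERSHIP=1"
    volumes:
      # Named volume stored on the node's local disk,
      # replacing the bind mount under /opt/dockershared (GlusterFS).
      - esdata:/usr/share/elasticsearch/data
    networks: ['stack']
    deploy:
      replicas: 1
      placement:
        constraints:
          # Hypothetical hostname: pins the task to the node that owns the
          # local volume, so a rescheduled task finds the same data.
          - node.hostname == es-node-1

volumes:
  esdata:
    driver: local

networks:
  stack:
```

Note that this gives up sharing the data directory between swarm nodes. If the data has to survive the loss of a node, that would probably need to come from Elasticsearch's own replication across several nodes or from block storage, not from a shared filesystem.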