Cassandra driver exception "All host(s) tried for query failed" occurs every few hours without explanation
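I have a problem with my Cassandra cluster (4 nodes). The Cassandra version is 2.2.9 and the driver version is 3.0.3.
After a few hours (~3 hours), I see the following problems in the driver log:
- OutOfDirectMemoryError (occurs occasionally, most of the time without any effect)
- No protocol version matching integer version
- Unknown response opcode
- Heartbeat query timed out
- All host(s) tried for query failed --> Cassandra can no longer be queried
The Cassandra cluster itself is healthy, and when I restart the application everything works again for a few hours.
Log snippet: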
First Time Count Message
2017-11-11 19:03:03 +0100 51 [/??.???.??.??:????] preparing to open ? new connections, total = ???
2017-11-11 19:03:03 +0100 49 [/??.???.??.??:????] Connection[/??.???.??.??:????-???, inFlight=?, closed=false] Transport initialized, connection ready
2017-11-11 19:03:03 +0100 24 [/??.???.??.??:????] Connection[/??.???.??.??:????-???, inFlight=?, closed=true] closed, remaining = ???
2017-11-11 19:03:29 +0100 1 Defuncting Connection[/??.???.??.??:????-???, inFlight=???, closed=false] because: [/??.???.??.??:????] Unexpected exception triggered (io.netty.handler.codec.DecoderException: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate ??????? byte(s) of direct memory (used: ???????, max: ????????))
2017-11-11 19:03:29 +0100 14 [/??.???.??.??:????] Connection[/??.???.??.??:????-???, inFlight=???, closed=false] failed, remaining = ???
2017-11-11 19:03:29 +0100 7 [/??.???.??.??:????] Connection[/??.???.??.??:????-???, inFlight=??, closed=false] failed, remaining = ???
2017-11-11 19:03:29 +0100 1 Defuncting Connection[/??.???.??.??:????-???, inFlight=??, closed=false] because: [/??.???.??.??:????] Unexpected exception triggered (io.netty.handler.codec.DecoderException: java.lang.IllegalArgumentException: No protocol version matching integer version ?)
2017-11-11 19:03:29 +0100 5 Defuncting Connection[/??.???.??.??:????-???, inFlight=??, closed=false] because: [/??.???.??.??:????] Unexpected exception triggered (io.netty.handler.codec.DecoderException: com.datastax.driver.core.exceptions.DriverInternalError: Unknown response opcode ??)
2017-11-11 19:03:29 +0100 4 Defuncting Connection[/??.???.??.??:????-???, inFlight=???, closed=false] because: [/??.???.??.??:????] Unexpected exception triggered (io.netty.handler.codec.DecoderException: com.datastax.driver.core.exceptions.DriverInternalError: Unknown response opcode ?)
2017-11-11 19:03:29 +0100 3 Defuncting Connection[/??.???.??.??:????-???, inFlight=???, closed=false] because: [/??.???.??.??:????] Unexpected exception triggered (io.netty.handler.codec.DecoderException: com.datastax.driver.core.exceptions.DriverInternalError: Unknown response opcode -???)
2017-11-11 19:03:30 +0100 3 Defuncting Connection[/??.???.??.??:????-???, inFlight=???, closed=false] because: [/??.???.??.??:????] Unexpected exception triggered (io.netty.handler.codec.DecoderException: com.datastax.driver.core.exceptions.DriverInternalError: Unknown response opcode ??)
2017-11-11 19:03:30 +0100 2 Defuncting Connection[/??.???.??.??:????-???, inFlight=?, closed=false] because: [/??.???.??.??:????] Unexpected exception triggered (io.netty.handler.codec.DecoderException: com.datastax.driver.core.exceptions.DriverInternalError: Unknown response opcode ??)
2017-11-11 19:03:30 +0100 401 [/??.???.??.??:????] Connection[/??.???.??.??:????-???, inFlight=?, closed=false] failed, remaining = ???
2017-11-11 19:03:33 +0100 1 Defuncting Connection[/??.???.??.??:????-???, inFlight=???, closed=false] because: [/??.???.??.??:????] Unexpected exception triggered (io.netty.handler.codec.DecoderException: com.datastax.driver.core.exceptions.DriverInternalError: Unknown response opcode ???)
2017-11-11 19:03:41 +0100 722 Defuncting Connection[/??.???.??.??:????-???, inFlight=?, closed=false] because: [/??.???.??.??:????] Heartbeat query timed out
2017-11-11 19:03:41 +0100 8 [/??.???.??.??:????] Connection[/??.???.??.??:????-?, inFlight=?, closed=false] failed, remaining = ???
2017-11-11 19:03:41 +0100 11 Defuncting Connection[/??.???.??.??:????-?, inFlight=?, closed=false] because: [/??.???.??.??:????] Heartbeat query timed out
2017-11-11 19:03:41 +0100 67 [/??.???.??.??:????] Connection[/??.???.??.??:????-??, inFlight=?, closed=false] failed, remaining = ???
2017-11-11 19:03:41 +0100 115 Defuncting Connection[/??.???.??.??:????-??, inFlight=?, closed=false] because: [/??.???.??.??:????] Heartbeat query timed out
2017-11-11 19:03:44 +0100 2 Defuncting Connection[/??.???.??.??:????-???, inFlight=??, closed=false] because: [/??.???.??.??:????] Unexpected exception triggered (io.netty.handler.codec.DecoderException: com.datastax.driver.core.exceptions.DriverInternalError: Unknown response opcode ?)
2017-11-11 19:03:51 +0100 2 Defuncting Connection[/??.???.??.??:????-???, inFlight=???, closed=false] because: [/??.???.??.??:????] Heartbeat query timed out
2017-11-11 19:03:51 +0100 265 Failed to post timeseries data Error Returned -
2017-11-11 19:03:57 +0100 3 Defuncting Connection[/??.???.??.??:????-???, inFlight=??, closed=false] because: [/??.???.??.??:????] Heartbeat query timed out
2017-11-11 19:04:01 +0100 39 Defuncting Connection[/??.???.??.??:????-???, inFlight=?, closed=false] because: [/??.???.??.??:????] Operation timed out
2017-11-11 19:04:01 +0100 12 Error processing jobs: execution of statement failed:All host(s) tried for query failed (tried: /??.???.??.??:???? (com.datastax.driver.core.exceptions.DriverException: Timeout while trying to acquire available connection (you may want to increase the driver number of per-host connections)), /??.???.??.??:???? (com.datastax.driver.core.exceptions.DriverException: Timeout while trying to acquire available connection (you may want to increase the driver number of per-host connections)), /??.???.??.??:???? (com.datastax.driver.core.exceptions.DriverException: Timeout while trying to acquire available connection (you may want to increase the driver number of per-host connections)), /??.???.??.??:???? [only showing errors of first ? hosts, use getErrors() for more details])
Does anyone know what the root cause is?
Defuncting Connection[/??.???.??.??:????-???, inFlight=???, closed=false] because: [/??.???.??.??:????] Unexpected exception triggered (io.netty.handler.codec.DecoderException: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate ??????? byte(s) of direct memory (used: ???????, max: ????????))
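You have memory problems. As long as they persist, you cannot expect the driver to work properly. You also say that your application stops working after a few hours, which looks to me like a memory leak in your application.
Please check the direct memory settings your application is running with, and make sure there is enough memory available for the driver to allocate. The driver's networking layer (Netty, as the log line above shows) needs to allocate direct memory. When that allocation fails, I have seen similar issues where the failure is reported as a NoHostAvailableException even though the underlying cause is memory.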
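One way to check the direct memory settings mentioned above is to ask the running JVM itself. The following is a minimal sketch, not something taken from the original post: the class name `DirectMemoryCheck` and its two helper methods are hypothetical. It uses only the standard `java.lang.management` API to report whether the JVM was started with `-XX:MaxDirectMemorySize` and how many bytes the JVM's "direct" buffer pool currently holds. Depending on the Netty version and configuration, Netty may track some of its allocations with its own counter, so the pool figure may under-report what the driver actually uses; treat it as a rough indicator.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class for illustration; not part of the Cassandra driver.
public class DirectMemoryCheck {

    /** Bytes used by the JVM's "direct" buffer pool, or -1 if the pool is not exposed. */
    static long directMemoryUsed() {
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    /** The -XX:MaxDirectMemorySize=... startup argument, or null if it was not set. */
    static String maxDirectMemoryFlag() {
        for (String arg : ManagementFactory.getRuntimeMXBean().getInputArguments()) {
            if (arg.startsWith("-XX:MaxDirectMemorySize")) {
                return arg;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println("MaxDirectMemorySize flag: " + maxDirectMemoryFlag());
        System.out.println("Direct pool bytes used:   " + directMemoryUsed());
    }
}
```

Logging `directMemoryUsed()` periodically from inside the application and comparing it with the configured limit would show whether usage climbs steadily over the roughly three hours before the failures, which is the pattern a leak would produce.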