Cassandra DataStax driver connections getting terminated abruptly

I am using com.datastax.cassandra:cassandra-driver-core:2.1.7.1 with Cassandra 2.1.11. The following exception is thrown. It seems to point to a protocol version problem, but an NPE is thrown instead of a ProtocolException.

2016-01-26 17:46:29.426 TRACE - [launch worker-1] [74120143-3dc5-466d-8a71-68edbe03620d] com.datastax.driver.core.Connection      : Connection[/192.172.2.51:9042-1, inFlight=1, closed=false] writing request
 PREPARE SELECT * FROM loops WHERE venue_id = ? AND loop_state = ?  AND covers_year = ?  AND covers_month = ?  AND covers_day = 0 
2016-01-26 17:46:29.427 DEBUG - [r2-nio-worker-1] [74120143-3dc5-466d-8a71-68edbe03620d] com.datastax.driver.core.Connection      : Connection[/192.172.1.51:9042-3, inFlight=0, closed=false] connection error
java.lang.NullPointerException
        at com.datastax.driver.core.ProtocolOptions.getProtocolVersionEnum(ProtocolOptions.java:178)
        at com.datastax.driver.core.QueryLogger.protocolVersion(QueryLogger.java:753)
        at com.datastax.driver.core.QueryLogger.parameterValueAsString(QueryLogger.java:738)
        at com.datastax.driver.core.QueryLogger.appendParameters(QueryLogger.java:709)
        at com.datastax.driver.core.QueryLogger.logQuery(QueryLogger.java:647)
        at com.datastax.driver.core.QueryLogger.maybeLogNormalQuery(QueryLogger.java:631)
        at com.datastax.driver.core.QueryLogger$ConstantThresholdQueryLogger.maybeLogNormalOrSlowQuery(QueryLogger.java:278)
        at com.datastax.driver.core.QueryLogger.update(QueryLogger.java:620)
        at com.datastax.driver.core.Cluster$Manager.reportLatency(Cluster.java:1422)
        at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:607)
        at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:991)
        at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:913)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
        at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:722)
        at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326)
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264)
        at io.netty.util.concurrent.SingleThreadEventExecutor.run(SingleThreadEventExecutor.java:116)
        at java.lang.Thread.run(Thread.java:745)
2016-01-26 17:46:29.427 TRACE - [r2-nio-worker-4] [74120143-3dc5-466d-8a71-68edbe03620d] com.datastax.driver.core.Connection      : Connection[/192.172.2.51:9042-1, inFlight=1, closed=false] request sent successfully
2016-01-26 17:46:29.430 DEBUG - [r2-nio-worker-1] [74120143-3dc5-466d-8a71-68edbe03620d] com.datastax.driver.core.Connection      : Defuncting connection to /192.172.1.51:9042
com.datastax.driver.core.TransportException: [/192.172.1.51:9042] Unexpected exception triggered (java.lang.NullPointerException)
        at com.datastax.driver.core.Connection$Dispatcher.exceptionCaught(Connection.java:1028)
        at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:271)
        at io.netty.channel.AbstractChannelHandlerContext.notifyHandlerException(AbstractChannelHandlerContext.java:768)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:335)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
        at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:722)
        at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326)
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264)
        at io.netty.util.concurrent.SingleThreadEventExecutor.run(SingleThreadEventExecutor.java:116)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
        at com.datastax.driver.core.ProtocolOptions.getProtocolVersionEnum(ProtocolOptions.java:178)
        at com.datastax.driver.core.QueryLogger.protocolVersion(QueryLogger.java:753)
        at com.datastax.driver.core.QueryLogger.parameterValueAsString(QueryLogger.java:738)
        at com.datastax.driver.core.QueryLogger.appendParameters(QueryLogger.java:709)
        at com.datastax.driver.core.QueryLogger.logQuery(QueryLogger.java:647)
        at com.datastax.driver.core.QueryLogger.maybeLogNormalQuery(QueryLogger.java:631)
        at com.datastax.driver.core.QueryLogger$ConstantThresholdQueryLogger.maybeLogNormalOrSlowQuery(QueryLogger.java:278)
        at com.datastax.driver.core.QueryLogger.update(QueryLogger.java:620)
        at com.datastax.driver.core.Cluster$Manager.reportLatency(Cluster.java:1422)
        at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:607)
        at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:991)
        at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:913)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        ... 16 more
2016-01-26 17:46:29.431 TRACE - [r2-nio-worker-4] [74120143-3dc5-466d-8a71-68edbe03620d] com.datastax.driver.core.Connection      : Connection[/192.172.2.51:9042-1, inFlight=1, closed=false] received: RESULT PREPARED 0xc66a551ecfc5ac34839e19fdfa0c5705 [venue_id (uuid)][loop_state (varchar)][covers_year (int)][covers_month (int)] (resultMetadata=[venue_id (uuid)][loop_state (varchar)][covers_year (int)][covers_month (int)][covers_day (int)][start (timestamp)][created_on (timestamp)][end (timestamp)][id (uuid)][iterations (int)][playlist (list<frozen<ums_qa."LoopMediaAsset">>)][slots (list<frozen<ums_qa."Slot">>)][updated_on (timestamp)])

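Both Cassandra nodes are working fine and accepting connections. Somehow the driver seems to be closing connections arbitrarily. I could not find a similar issue reported anywhere. Any help would be much appreciated. Thanks.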

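The native protocol version was bumped slightly between 2.1.8 and 2.1.9. By using a 2.1.7 driver with a 2.1.11 server, the server is offering a native protocol version number that the client does not recognize.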

Given the stack trace (and the driver code at those lines: https://github.com/datastax/java-driver/blob/f4240267b3a3b829fa51242441dd219424a91347/driver-core/src/main/java/com/datastax/driver/core/ProtocolOptions.java#L168-L179 ), I would probably start by upgrading to the latest 2.1.11+ cassandra-driver version to rule that out.

We are connecting to Cassandra from the nodes of a Spark cluster, and we use a helper class that initializes the Cassandra Cluster and Session.

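As @OlivierMichallat pointed out, the problem was in this initialization. It appears that multiple Cluster instances were being created, probably because the helper class was loaded multiple times by the Spark workers. This violates one of the rules DataStax recommends: use one Cluster instance per physical cluster. Synchronizing this initialization solved the problem for us. Thanks for the pointer.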
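For anyone hitting the same thing, here is a minimal sketch of what "synchronizing the initialization" can look like. It is not our actual helper class: the name `SharedInstance` is hypothetical, it assumes Java 8 for `Supplier`, and it is kept generic so that it compiles without the driver on the classpath. The idea is only that the factory runs at most once per holder, no matter how many threads ask for the instance.

```java
import java.util.function.Supplier;

/**
 * Hypothetical holder (illustration only): lazily creates one shared
 * instance and hands the same object to every caller.
 */
final class SharedInstance<T> {
    private final Supplier<T> factory;
    private T instance; // guarded by "this"

    SharedInstance(Supplier<T> factory) {
        this.factory = factory;
    }

    /**
     * Synchronized so that concurrent callers cannot both see null
     * and both invoke the factory.
     */
    synchronized T get() {
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }
}
```

With the driver, the factory would be something like `() -> Cluster.builder().addContactPoint("192.172.1.51").build()`, and the holder itself would sit in a `static final` field of the helper class. Note that this only guarantees one Cluster per holder within a JVM. Each Spark executor JVM still gets its own Cluster, and a helper class loaded by several class loaders would still produce several holders.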