java.sql.SQLException: other error: request outdated while connecting to TiDB from Spark using mysql-connector-java 5.1.6 connector

I am getting the following error while connecting to TiDB from Spark using the mysql-connector-java 5.1.6 connector.

Note that I create the JDBC connection with the parallel-read options, where we specify the partition column name, the lower bound, the upper bound, and the number of partitions.

Spark then splits the read into numPartitions queries by dividing the range between the column's lower and upper bounds into equal-sized slices, as sketched below.
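A minimal sketch of what such a parallel JDBC read looks like; the host, database, table, column, and bound values are hypothetical placeholders, not the actual job:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("tidb-jdbc-read").getOrCreate()

// Parallel JDBC read: Spark generates numPartitions SELECT queries, each
// scanning an equal-sized slice of [lowerBound, upperBound] on partitionColumn.
val df = spark.read
  .format("jdbc")
  .option("driver", "com.mysql.jdbc.Driver")            // mysql-connector-java 5.1.x driver class
  .option("url", "jdbc:mysql://tidb-host:4000/mydb")    // hypothetical TiDB endpoint
  .option("dbtable", "my_table")                        // hypothetical table
  .option("user", "root")
  .option("password", "")
  .option("partitionColumn", "id")                      // hypothetical numeric column
  .option("lowerBound", "1")
  .option("upperBound", "10000000")
  .option("numPartitions", "32")
  .load()

df.count()  // triggers the per-partition SELECT queries against TiDB
```

The job then fails with the following stack trace: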

java.sql.SQLException: other error: request outdated.
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1055)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:956)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3536)
at com.mysql.jdbc.MysqlIO.nextRowFast(MysqlIO.java:1551)
at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1407)
at com.mysql.jdbc.MysqlIO.readSingleRowSet(MysqlIO.java:2861)
at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:474)
at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:2554)
at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:1755)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2165)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2648)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2086)
at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:2237)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD.compute(JDBCRDD.scala:301)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD$$anonfun.apply(RDD.scala:337)
at org.apache.spark.rdd.RDD$$anonfun.apply(RDD.scala:335)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator.apply(BlockManager.scala:1092)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator.apply(BlockManager.scala:1083)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1018)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1083)
at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:809)
at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

other error: request outdated. is an error thrown by TiKV. It means the query waited longer than the timeout limit end-point-request-max-handle-duration before it could be executed, so the coprocessor cancelled it to avoid processing an outdated request. You can configure this limit in TiKV's config; its default value is 60 seconds.
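For reference, a sketch of raising that limit in the TiKV configuration file; the section holding this key may differ across TiKV versions, so check the docs for your release:

```toml
# tikv.toml -- the section containing this key may vary by TiKV version
[server]
# Maximum duration a coprocessor request may wait before being handled;
# requests older than this are cancelled with "request outdated". Default: "60s".
end-point-request-max-handle-duration = "120s"
```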

Since Spark received this error through JDBC, it means the requests are too heavy for TiDB to handle in time, so some requests wait too long. This is mainly because Spark splits the read into one query per partition, which puts a heavy workload on TiDB, and it gets even worse with parallel connections.
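One possible mitigation on the Spark side, using the same hypothetical options as above, is to reduce the number of concurrent partition queries so each one finishes within TiKV's handling window:

```scala
// Fewer, smaller concurrent queries: lower numPartitions and pass a
// row-fetch hint to the JDBC driver instead of buffering whole partitions.
val df = spark.read
  .format("jdbc")
  .option("driver", "com.mysql.jdbc.Driver")
  .option("url", "jdbc:mysql://tidb-host:4000/mydb")    // hypothetical endpoint
  .option("dbtable", "my_table")                        // hypothetical table
  .option("user", "root")
  .option("password", "")
  .option("partitionColumn", "id")
  .option("lowerBound", "1")
  .option("upperBound", "10000000")
  .option("numPartitions", "8")     // fewer concurrent queries hitting TiDB at once
  .option("fetchsize", "1000")      // fetch-size hint passed to the JDBC driver
  .load()
```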

In fact, TiSpark was developed precisely as the solution for querying TiDB from Spark. It currently supports Spark 2.1 and will support Spark 2.3 in a few days. Give it a try!
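A minimal usage sketch based on the TiSpark 1.x quick start; the TiContext and tidbMapDatabase names are from that era's API and may differ in later releases:

```scala
import org.apache.spark.sql.{SparkSession, TiContext}

val spark = SparkSession.builder().appName("tispark-demo").getOrCreate()

// TiSpark issues coprocessor requests to TiKV directly instead of funnelling
// everything through TiDB over the MySQL wire protocol via JDBC.
val ti = new TiContext(spark)
ti.tidbMapDatabase("mydb")          // hypothetical database name

spark.sql("SELECT COUNT(*) FROM my_table").show()
```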