Slick 3.0 批量插入或更新(upsert)

Slick 3.0 bulk insert or update (upsert)

在 Slick 3.0 中执行批量 insertOrUpdate 的正确方法是什么?

我正在使用 MySQL,其中适当的查询是

INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)
ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);

MySQL bulk INSERT or UPDATE

这是我当前的代码,速度很慢:-(

// FIXME -- this is slow but will stop repeats, an insertOrUpdate
// functions for a list would be much better
val rowsInserted = rows.map {
  row => await(run(TableQuery[FooTable].insertOrUpdate(row)))
}.sum

我要找的相当于

def insertOrUpdate(values: Iterable[U]): DriverAction[MultiInsertResult, NoStream, Effect.Write]

正如您在 Slick examples 看到的那样,您可以使用 ++= 函数使用 JDBC 批量插入功能进行插入。每个实例:

val foos = TableQuery[FooTable]
val rows: Seq[Foo] = ...
foos ++= rows // here slick will use batch insert

您还可以 "size" 按 "grouping" 行顺序进行批处理:

val batchSize = 1000
rows.grouped(batchSize).foreach { group => foos ++= group }

有几种方法可以使此代码更快(每种方法 应该 比前面的更快,但它会逐渐变慢 idiomatic-slick):

  • 运行 insertOrUpdateAll 而不是 insertOrUpdate 如果 slick-pg 0.16.1+

    await(run(TableQuery[FooTable].insertOrUpdateAll rows)).sum
    
  • 运行 您的 DBIO 事件一次全部完成,而不是等待每个事件都提交,然后再 运行 下一个:

    val toBeInserted = rows.map { row => TableQuery[FooTable].insertOrUpdate(row) }
    val inOneGo = DBIO.sequence(toBeInserted)
    val dbioFuture = run(inOneGo)
    // Optionally, you can add a `.transactionally`
    // and / or `.withPinnedSession` here to pin all of these upserts
    // to the same transaction / connection
    // which *may* get you a little more speed:
    // val dbioFuture = run(inOneGo.transactionally)
    val rowsInserted = await(dbioFuture).sum
    
  • 下降到 JDBC 级别,然后 运行 一次性完成所有更新 (idea via this answer):

    val SQL = """INSERT INTO table (a,b,c) VALUES (?, ?, ?)
    ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);"""
    
    SimpleDBIO[List[Int]] { session =>
      val statement = session.connection.prepareStatement(SQL)
      rows.map { row =>
        statement.setInt(1, row.a)
        statement.setInt(2, row.b)
        statement.setInt(3, row.c)
        statement.addBatch()
      }
      statement.executeBatch()
    }
    

使用 sqlu

此演示作品

case ("insertOnDuplicateKey",answers:List[Answer])=>{
  def buildInsert(r: Answer): DBIO[Int] =
    sqlu"insert into answer (aid,bid,sbid,qid,ups,author,uid,nick,pub_time,content,good,hot,id,reply,pic,spider_time) values (${r.aid},${r.bid},${r.sbid},${r.qid},${r.ups},${r.author},${r.uid},${r.nick},${r.pub_time},${r.content},${r.good},${r.hot},${r.id},${r.reply},${r.pic},${r.spider_time}) ON DUPLICATE KEY UPDATE `aid`=values(aid),`bid`=values(bid),`sbid`=values(sbid),`qid`=values(qid),`ups`=values(ups),`author`=values(author),`uid`=values(uid),`nick`=values(nick),`pub_time`=values(pub_time),`content`=values(content),`good`=values(good),`hot`=values(hot),`id`=values(id),`reply`=values(reply),`pic`=values(pic),`spider_time`=values(spider_time)"
  val inserts: Seq[DBIO[Int]] = answers.map(buildInsert)
  val combined: DBIO[Seq[Int]] = DBIO.sequence(inserts)
  DEST_DB.run(combined).onComplete(data=>{
    println("insertOnDuplicateKey data result",data.get.mkString)
    if (data.isSuccess){
      println(data.get)
      val lastid=answers.last.id
      Sync.lastActor !("upsert",tablename,lastid)
    }else{
      //retry
      self !("insertOnDuplicateKey",answers)
    }
  })
}

我尝试在单个 sql 中使用 sqlu 但可能出错 sqlu 不提供字符串插值

这个演示不工作

case ("insertOnDuplicateKeyError",answers:List[Answer])=>{
  def buildSql(execpre:String,values: String,execafter:String): DBIO[Int] = sqlu"$execpre $values $execafter"
  val execpre="insert into answer (aid,bid,sbid,qid,ups,author,uid,nick,pub_time,content,good,hot,id,reply,pic,spider_time)  values "
  val execafter=" ON DUPLICATE KEY UPDATE  `aid`=values(aid),`bid`=values(bid),`sbid`=values(sbid),`qid`=values(qid),`ups`=values(ups),`author`=values(author),`uid`=values(uid),`nick`=values(nick),`pub_time`=values(pub_time),`content`=values(content),`good`=values(good),`hot`=values(hot),`id`=values(id),`reply`=values(reply),`pic`=values(pic),`spider_time`=values(spider_time)"
  val valuesstr=answers.map(row=>("("+List(row.aid,row.bid,row.sbid,row.qid,row.ups,"'"+row.author+"'","'"+row.uid+"'","'"+row.nick+"'","'"+row.pub_time+"'","'"+row.content+"'",row.good,row.hot,row.id,row.reply,row.pic,"'"+row.spider_time+"'").mkString(",")+")")).mkString(",\n")
  val insertOrUpdateAction=DBIO.seq(
    buildSql(execpre,valuesstr,execafter)
  )
  DEST_DB.run(insertOrUpdateAction).onComplete(data=>{
    if (data.isSuccess){
      println("insertOnDuplicateKey data result",data)
      //retry
      val lastid=answers.last.id
      Sync.lastActor !("upsert",tablename,lastid)
    }else{
      self !("insertOnDuplicateKey2",answers)
    }
  })
}

a mysql 带有 scala slick 的同步工具 https://github.com/cclient/ScalaMysqlSync