ForkJoinPool for parallel processing

I'm trying to run some code a million times. I originally wrote it using Threads, but that feels clumsy. I started doing more reading and came across ForkJoin. It seems like exactly what I need, but I can't figure out how to translate the code below into "Scala style". Can someone explain the best way to use ForkJoin in my code?

val l = (1 to 1000000) map {_.toLong}
println("running......be patient")
l.foreach { x =>
    if (x % 10000 == 0) println("got to: " + x)
    val thread = new Thread {
        override def run {
            //my code (API calls) here. writes to file if the call succeeds
        }
    }
    thread.start() //start the thread for this element
}

The simplest way is to use par (it automatically uses a ForkJoinPool under the hood):

 val l = (1 to 1000000).map(_.toLong).toList

 l.par.foreach { x =>
    if (x % 10000 == 0) println("got to: " + x) //executed in parallel
    //your code (API calls) here. Also executed in parallel, though for a given x it runs on the same thread as the println above
 }
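
If you want to control how many worker threads par uses, you can plug in your own pool via tasksupport. A minimal sketch, assuming Scala 2.12-style parallel collections (on 2.13+ the same API lives in the separate scala-parallel-collections module):

import scala.collection.parallel.ForkJoinTaskSupport
import java.util.concurrent.ForkJoinPool

val l = (1 to 1000000).map(_.toLong).toList

val parList = l.par
//use a dedicated ForkJoinPool with 8 worker threads instead of the default
parList.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(8))

parList.foreach { x =>
  if (x % 10000 == 0) println("got to: " + x)
  //your code (API calls) here
}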

Another way is to use Future:

import scala.concurrent._
import ExecutionContext.Implicits.global //the default global ExecutionContext is backed by a ForkJoinPool

val l = (1 to 1000000) map {_.toLong}

println("running......be patient")

l.foreach { x =>
    if(x % 10000 == 0) println("got to: "+x)
    Future {
       //your code (API calls) here. writes to file if the call succeeds
    }
}
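
Note that the snippet above fires the futures and forgets them, so a main method may exit before they all finish. A minimal sketch of collecting and awaiting them, assuming you are fine blocking the main thread once at the very end:

import scala.concurrent._
import scala.concurrent.duration._
import ExecutionContext.Implicits.global

val l = (1 to 1000000).map(_.toLong)

val futures = l.map { x =>
  Future {
    //your code (API calls) here
    x
  }
}

//combine all futures into one and block until every task has completed
Await.result(Future.sequence(futures), Duration.Inf)
println("all done")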

If the code inside the Future blocks, you should mark the blocking section with scala.concurrent.blocking:
Future {
   scala.concurrent.blocking {
      //blocking API call here
   }
}

It tells the ForkJoinPool to compensate for the blocked thread with a new one, so you can avoid thread starvation (though with some overhead).
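
Put together with the original loop, a rough sketch could look like the following. Here callApi and the output file name are hypothetical placeholders for your own API call and result file:

import scala.concurrent._
import ExecutionContext.Implicits.global
import java.nio.file.{Files, Paths, StandardOpenOption}
import java.nio.charset.StandardCharsets

//hypothetical stand-in for your real (blocking) API call
def callApi(x: Long): Boolean = x % 2 == 0

(1 to 1000000).map(_.toLong).foreach { x =>
  Future {
    val success = blocking { callApi(x) } //blocking API call, marked for the pool
    if (success)
      //append the id to a results file on success (sketch only; concurrent appends may interleave)
      Files.write(
        Paths.get("results.txt"),
        s"$x\n".getBytes(StandardCharsets.UTF_8),
        StandardOpenOption.CREATE, StandardOpenOption.APPEND)
  }
}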

In Scala, you can use Future and Promise:

import scala.concurrent._
import ExecutionContext.Implicits.global

val l = (1 to 1000000).map(_.toLong)
println("running......be patient")
l.foreach { x =>
  if (x % 10000 == 0) println("got to: " + x)
  Future {
    println(x) //your code (API calls) would go here
  }
}
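
The snippet above only uses Future; as a rough sketch of the Promise side mentioned here, a Promise is a handle you complete yourself, and it exposes a read-only Future for consumers:

import scala.concurrent._
import ExecutionContext.Implicits.global

val p = Promise[Long]()          //completed explicitly by the producer
val f: Future[Long] = p.future   //read-only view handed to consumers

Future {
  //do the work, then fulfil the promise with the result
  p.success(42L)
}

f.foreach(result => println("completed with: " + result))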