Combine delimiters to create a stream of words from a text file using Akka Streams

I have the following code, which counts word frequencies in a text file:

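import java.nio.file.Paths

import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{FileIO, Framing, Sink}
import akka.util.ByteString

import scala.concurrent.ExecutionContextExecutor

implicit val system: ActorSystem = ActorSystem("words-count")
implicit val mat = ActorMaterializer()
implicit val ec: ExecutionContextExecutor = system.dispatcher

// Accumulate a word -> occurrences map over the whole stream
val sink = Sink.fold[Map[String, Int], String](Map.empty)({
  case (count, word) => count + (word -> (count.getOrElse(word, 0) + 1))
})

FileIO.fromPath(Paths.get("/file.txt"))
  // Cut the byte stream at every space; a frame may be at most 256 bytes
  .via(Framing.delimiter(ByteString(" "), 256, true).map(_.utf8String))
  .toMat(sink)((_, right) => right)
  .run()
  .map(println(_))
  .onComplete(_ => system.terminate())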

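Currently it uses a space as the delimiter but ignores newline characters ("\n"). Can I use both a space and a newline as delimiters in the same stream, i.e. is there a way to combine them?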

You can set the delimiter to \n and then use flatMapConcat to split each line by spaces:
FileIO
    .fromPath(Paths.get("file.txt"))
    .via(Framing.delimiter(ByteString("\n"), 256, true).map(_.utf8String))
    .flatMapConcat(s => Source(s.split(" ").toList)) // split each line by space
    .toMat(sink)((_, right) => right)
    .run()
    .map(println(_))
    .onComplete(_ => system.terminate())
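
One thing to watch for with the approach above: splitting on a single space yields empty strings when a line contains consecutive spaces, and a "\r" left over from Windows line endings stays attached to the last word. A minimal sketch of one possible way to handle this, using only the Scala standard library, is shown below. The object name WordSplit and the helpers tokenize and countWord are hypothetical names introduced for illustration; they are not part of Akka.

```scala
// Sketch only: WordSplit, tokenize and countWord are hypothetical names, not Akka API.
object WordSplit {
  // Split one line on any run of whitespace (spaces, tabs, a trailing "\r")
  // and drop the empty token produced when the line starts with whitespace.
  def tokenize(line: String): List[String] =
    line.split("\\s+").filter(_.nonEmpty).toList

  // Same accumulation step as the Sink.fold in the question:
  // increment the counter stored for `word`.
  def countWord(count: Map[String, Int], word: String): Map[String, Int] =
    count + (word -> (count.getOrElse(word, 0) + 1))

  def main(args: Array[String]): Unit = {
    // Stand-in for the lines that Framing.delimiter(ByteString("\n"), ...) would emit.
    val lines = List("to be  or", "not to be\r")
    val counts = lines.flatMap(tokenize).foldLeft(Map.empty[String, Int])(countWord)
    println(counts)
  }
}
```

In the stream itself, these helpers could presumably be plugged in by replacing the flatMapConcat stage with .mapConcat(WordSplit.tokenize) and passing WordSplit.countWord to Sink.fold. Note also that the 256 passed to Framing.delimiter is the maximum frame length in bytes, so once the delimiter is "\n" it limits the length of a whole line rather than of a single word and may need to be raised for files with long lines.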