How to write the content of a Flink var to screen in Zeppelin?
I am trying to run the following simple commands in Apache Zeppelin.
%flink
var rabbit = env.fromElements(
"ARTHUR: What, behind the rabbit?",
"TIM: It is the rabbit!",
"ARTHUR: You silly sod! You got us all worked up!",
"TIM: Well, that's no ordinary rabbit. That's the most foul, cruel, and bad-tempered rodent you ever set eyes on.",
"ROBIN: You tit! I soiled my armor I was so scared!",
"TIM: Look, that rabbit's got a vicious streak a mile wide, it's a killer!")
var counts = rabbit.flatMap { _.toLowerCase.split("\\W+") }.map { (_, 1) }.groupBy(0).sum(1)
counts.print()
I tried to print the results in the notebook, but unfortunately I only get the following output.
rabbit: org.apache.flink.api.scala.DataSet[String] = org.apache.flink.api.scala.DataSet@37fdb65c
counts: org.apache.flink.api.scala.AggregateDataSet[(String, Int)] = org.apache.flink.api.scala.AggregateDataSet@1efc7158
res103: org.apache.flink.api.java.operators.DataSink[(String, Int)] = DataSink '<unnamed>' (Print to System.out)
How can I get the content of counts printed to the notebook in Zeppelin?
The way to print the result of such a computation in Zeppelin is:
%flink
counts.collect().foreach(println(_))
//or one might prefer
//counts.collect foreach println
Output:
(a,3)
(all,1)
(and,1)
(armor,1)
...
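The reason for the observed behaviour lies in the interplay between Apache Zeppelin and Apache Flink. Zeppelin captures all standard output written through Console. However, Flink prints its output to System.out, which is exactly what happens when you call counts.print(). bzz's solution works because it prints the results via Console.
I opened a JIRA issue [1] and a pull request [2] to correct this behaviour so that you can also use counts.print().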
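The Console versus System.out distinction described above can be reproduced in plain Scala, without Flink or Zeppelin. The following is only a minimal sketch: `ConsoleVsSystemOut` and `captureConsole` are hypothetical names made up for illustration, and this is not Zeppelin's actual capture code. It assumes that the capture works roughly like `Console.withOut`, which redirects Scala's `println` but leaves `System.out` untouched.

```scala
import java.io.ByteArrayOutputStream

object ConsoleVsSystemOut {
  // Hypothetical helper: runs `body` while scala.Console's output is
  // redirected into an in-memory buffer, then returns the captured text.
  def captureConsole(body: => Unit): String = {
    val buffer = new ByteArrayOutputStream()
    Console.withOut(buffer) { body }
    buffer.toString("UTF-8")
  }

  def main(args: Array[String]): Unit = {
    // Scala's println goes through Console, so the redirect sees it.
    val viaConsole = captureConsole { println("(rabbit,4)") }

    // System.out.println bypasses Console, so the buffer stays empty
    // and the text goes to the real standard output instead.
    val viaSystemOut = captureConsole { System.out.println("(rabbit,4)") }

    println(s"captured via Console:    '${viaConsole.trim}'")
    println(s"captured via System.out: '${viaSystemOut.trim}'")
  }
}
```

Under that assumption, this is why `counts.collect().foreach(println(_))` shows up in the notebook: the elements are brought back to the interpreter and printed with Scala's `println`, whereas `counts.print()` writes to `System.out`.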