Plotly visualization not working in the PySpark kernel on an EMR JupyterHub notebook
I am trying to plot graphs with plotly on an EMR JupyterHub notebook, but the figures are not rendered in the PySpark kernel. (Note: the Python kernel renders the figures fine.)
Sample code I am trying:
import plotly.express as px
data_canada = px.data.gapminder().query("country == 'Canada'")
fig = px.bar(data_canada, x='year', y='pop')
fig.show()
I am able to plot a graph with the %%display sparkmagic; however, I have not been able to figure out whether plotly can be made to work with %%display:
import random
data = [('Person:%s' % i, i, random.randint(1, 5)) for i in range(1, 50)]
columns = ['Name', 'Age', 'Random']
spark_df = spark.createDataFrame(data, columns)
%%display
spark_df
Has anyone gotten this to work? Please advise.
This is a limitation of sparkmagic. You will have to resort to the %%local magic. From the sparkmagic docs:
Since all code is run on a remote driver through Livy, all structured data must
be serialized to JSON and parsed by the Sparkmagic library so that it can be
manipulated and visualized on the client side. In practice this means that you
must use Python for client-side data manipulation in %%local mode.
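In practice that suggests a two-cell pattern. What follows is a minimal sketch, not a tested recipe for every EMR release. It assumes that sparkmagic's `-o` option (shown here on `%%spark`) copies the Spark DataFrame into the local kernel as a pandas DataFrame, and that plotly is installed on the notebook side, which seems to be the case here since the Python kernel renders figures. The helper name `build_bar_figure` and the stand-in `local_data` are made up for illustration. The magics appear as comments so the block also runs as plain Python.

```python
# --- Cell 1: PySpark kernel, executed on the cluster through Livy ----------
# %%spark -o spark_df      <- assumption: -o copies spark_df to the local
#                             kernel as a pandas DataFrame
# spark_df = spark.createDataFrame(data, columns)

# --- Cell 2: starts with %%local, executed in the notebook's own Python ----
import random


def build_bar_figure(frame, x, y):
    """Build a plotly bar chart from a pandas DataFrame or a dict of columns."""
    # Imported lazily: plotly has to exist on the notebook side, not the cluster.
    import plotly.express as px
    return px.bar(frame, x=x, y=y)


# Hypothetical stand-in for the DataFrame that -o would hand over, so the
# sketch can be tried outside a notebook. Same shape as the question's data.
local_data = {
    "Name": ["Person:%s" % i for i in range(1, 50)],
    "Age": list(range(1, 50)),
    "Random": [random.randint(1, 5) for _ in range(1, 50)],
}

# In the notebook, inside the %%local cell:
#   build_bar_figure(spark_df, "Name", "Random").show()
```

The point is only that the plotly call happens in the `%%local` cell, after the data has been brought back from the cluster. Large DataFrames would need to be aggregated or sampled on the Spark side first, since everything transferred goes through JSON.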