Getting an error while trying to create a Spark DataFrame in pandas

I am new to Python and I am using pandas.

I have a problem creating a DataFrame:
import pandas as pd

df = spark.createDataFrame([(66, "a", "4"), 
                            (67, "a", "0"), 
                            (70, "b", "4"), 
                            (71, "d", "4")],
                            ("id", "code", "amt"))


dfa = pd.DataFrame(data=df)

This is the error I get:

ValueError: DataFrame constructor not properly called!

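What is the error message? Did you import Spark, or only pandas? And why use Spark rather than pandas to create the DataFrame? You could build it directly, like this: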

pd.DataFrame([(66, "a", "4"), 
              (67, "a", "0"), 
              (70, "b", "4"), 
              (71, "d", "4")], columns=("id", "code", "amt"))

This gives you a DataFrame:

    id      code    amt
0   66      a       4
1   67      a       0
2   70      b       4
3   71      d       4
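
As for why the original call fails: pd.DataFrame accepts dicts, lists, NumPy arrays and other pandas objects. A Spark DataFrame is none of these, so the constructor gives up with the ValueError quoted in the question. Here is a minimal sketch that reproduces this without Spark. The class Opaque and the helper try_construct are hypothetical names made up for illustration; Opaque stands in for any object pandas cannot interpret and is not part of pyspark.

```python
import pandas as pd


class Opaque:
    """Hypothetical stand-in for an object pandas cannot read as tabular data."""


def try_construct(data):
    """Return the DataFrame, or the ValueError message if pandas rejects `data`."""
    try:
        return pd.DataFrame(data=data)
    except ValueError as exc:
        return str(exc)


rows = [(66, "a", "4"), (67, "a", "0")]

# pandas rejects the opaque object and we get the error text back.
print(try_construct(Opaque()))

# A plain list of tuples is accepted and becomes a 2 x 3 frame.
print(try_construct(rows).shape)
```

So the conversion has to start from the Spark side rather than from the pandas constructor.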
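
To convert the Spark DataFrame to pandas, call toPandas() on it (select("*") just selects every column, so df.toPandas() is equivalent):

dfa = df.select("*").toPandas()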

https://spark.apache.org/docs/latest/sql-pyspark-pandas-with-arrow.html#enabling-for-conversion-tofrom-pandas
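
Putting it together, here is a minimal sketch of the conversion as a helper. The names spark_to_pandas, use_arrow and demo are made up for illustration. The config key is the Spark 3.x name from the page linked above (older releases used a different key), and enabling Arrow assumes pyarrow is installed. The demo function needs pyspark and a Java runtime, so it is defined here but not called.

```python
import pandas as pd


def spark_to_pandas(spark, sdf, use_arrow=False):
    """Convert a Spark DataFrame `sdf` to a pandas DataFrame.

    `spark` is the active SparkSession. `use_arrow` is an illustrative flag
    that turns on Arrow-based transfer before converting.
    """
    if use_arrow:
        # Spark 3.x config key; assumes pyarrow is available on the driver.
        spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")
    # toPandas() collects every row to the driver, so the data must fit in memory.
    return sdf.toPandas()


def demo():
    """Round trip with the question's data; requires pyspark and Java."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").appName("to-pandas").getOrCreate()
    sdf = spark.createDataFrame(
        [(66, "a", "4"), (67, "a", "0"), (70, "b", "4"), (71, "d", "4")],
        ("id", "code", "amt"),
    )
    dfa = spark_to_pandas(spark, sdf)
    spark.stop()
    return dfa
```

With the session and DataFrame from the question, the call would be dfa = spark_to_pandas(spark, df). Because toPandas() pulls all rows onto the driver, this only suits data small enough to fit in local memory.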