Using dataframe.withColumn and a variable does not seem to work
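I am trying to write a statement like this:
profileId = "some value"
df.withColumn("ProfileId", col(profileId))
and I get an AnalysisException on Databricks. As far as I can tell this should work, so I would like to know what is going wrong. Any help would be appreciated.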
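Try using lit with the variable. col(profileId) treats the string as a column name, so Spark looks for a column called "some value" and raises an AnalysisException when it cannot find one; lit(profileId) adds the value itself as a constant column. For example: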
%py
from pyspark.sql.functions import lit

# Small example DataFrame (sc is the SparkContext that Databricks provides)
df = sc.parallelize([
    ("orange", "apple"), ("kiwi", None), (None, "banana"),
    ("mango", "mango"), (None, None)
]).toDF(["fruit1", "fruit2"])

profileId = "some value"

# lit() turns the Python value into a constant column
display(df.withColumn("ProfileId", lit(profileId)))