Coalesce columns in PySpark DataFrames
res=to.join(tc, to.id1 == tc.id,how='left').select(to.id1.alias('Employee_id'), tc.name.alias('Employee_Name'), to.dept.alias('Employee_Dept'))
res.show()
+-----------+-------------+-------------+
|Employee_id|Employee_Name|Employee_Dept|
+-----------+-------------+-------------+
|         12|         Prad|      Physics|
|         13|         null|         Chem|
|         14|         null|        Maths|
+-----------+-------------+-------------+
I want to replace null with NONAME. Please advise on the select syntax.
Try something like this:
from pyspark.sql.functions import coalesce, lit
res.withColumn("EmployeeNameNoNull", coalesce(res.Employee_Name, lit('NONAME'))).show()