How to compare a dataframe column to another dataframe column in place in PySpark?
# DataframeA and DataframeB match:
DataframeA:
col: Name "Ali", "Bilal", "Ahsan"
DataframeB:
col: Name "Ali", "Bilal", "Ahsan"
# DataframeC and DataframeD DO NOT match:
DataframeC:
col: Name "Ali", "Ahsan", "Bilal"
DataframeD:
col: Name "Ali", "Bilal", "Ahsan"
I want to match the column values in place, row by row. Any help would be appreciated.
Use the following Scala code as a reference and translate it to Python. Update the `val check` line to use your own dataframe names.
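scala> import org.apache.spark.sql.expressions.Window
scala> import org.apache.spark.sql.functions._
scala> // Number the rows of each dataframe, join on the row number, and count rows whose names differ
scala> val w = Window.orderBy(lit(1))
scala> val check = dfA.withColumn("rn", row_number.over(w)).alias("A").join(dfB.withColumn("rn", row_number.over(w)).alias("B"), List("rn"), "full").withColumn("check", when(col("A.name") === col("B.name"), lit("match")).otherwise(lit("not match"))).filter(col("check") === "not match").count
scala> if (check == 0) {
| println("matched") } else { println("not matched") }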
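A rough PySpark sketch of the same idea, for readers who want the Python side. The function name `columns_match_in_order` and the default column name `"Name"` are made up for illustration. It checks the row counts first, then numbers the rows, joins on the row number, and counts rows whose values differ. It relies on the same assumption as the Scala code: that a window ordered by a constant keeps the rows in their incoming order, which Spark does not guarantee.

```python
def columns_match_in_order(df_a, df_b, col_name="Name"):
    """Return True if df_a[col_name] and df_b[col_name] agree row by row.

    Hypothetical helper, not part of PySpark.
    """
    # Imported inside the function so it can be defined without Spark installed.
    from pyspark.sql import Window
    from pyspark.sql import functions as F

    # Different lengths can never match position by position.
    if df_a.count() != df_b.count():
        return False

    # Assumption: ordering by a constant keeps the incoming row order.
    # Spark makes no such promise; prefer a real ordering column if you have one.
    w = Window.orderBy(F.lit(1))
    a = df_a.select(F.col(col_name).alias("a_val")).withColumn("rn", F.row_number().over(w))
    b = df_b.select(F.col(col_name).alias("b_val")).withColumn("rn", F.row_number().over(w))

    # Pair rows by position and count the pairs whose values differ.
    # eqNullSafe treats two nulls as equal instead of yielding null.
    joined = a.join(b, on="rn")
    mismatches = joined.filter(~F.col("a_val").eqNullSafe(F.col("b_val"))).count()
    return mismatches == 0


# Usage, assuming an existing SparkSession called `spark`:
# df_c = spark.createDataFrame([("Ali",), ("Ahsan",), ("Bilal",)], ["Name"])
# df_d = spark.createDataFrame([("Ali",), ("Bilal",), ("Ahsan",)], ["Name"])
# columns_match_in_order(df_c, df_d)
```

If the dataframes already carry an id or timestamp column that defines the row order, ordering the window by that column is safer than ordering by a constant.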
Use `set` in Python for the comparison.
DataframeC.columns
-> ["Ali", "Ahsan", "Bilal"]
DataframeD.columns
-> ["Ali", "Bilal", "Ahsan"]
DataframeC.columns == DataframeD.columns
-> False
set(DataframeC.columns) == set(DataframeD.columns)
-> True
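Note that `DataFrame.columns` returns the column names. If the names in the question are row values of a single `Name` column instead, the same list-versus-set distinction applies once the values are collected to the driver. A minimal sketch, assuming the data is small enough to collect; the helper names `column_values`, `same_in_order` and `same_ignoring_order` are invented for this example:

```python
def column_values(df, col_name="Name"):
    """Collect one column of a PySpark DataFrame into a Python list (small data only)."""
    return [row[col_name] for row in df.select(col_name).collect()]


def same_in_order(values_a, values_b):
    # List equality is position-sensitive: same values in the same order.
    return list(values_a) == list(values_b)


def same_ignoring_order(values_a, values_b):
    # Set equality ignores order and also ignores duplicates.
    return set(values_a) == set(values_b)


# The values of DataframeC and DataframeD from the question, as plain lists.
c = ["Ali", "Ahsan", "Bilal"]
d = ["Ali", "Bilal", "Ahsan"]
print(same_in_order(c, d))        # False
print(same_ignoring_order(c, d))  # True
```

Since the question wants DataframeC and DataframeD to count as not matching, the order-sensitive list comparison is the one that fits; the set comparison reports a match for them.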