仅删除负值的前导零

Remove leading zeros for negative values only

我有一个数据框,只需要为负值类型删除前导零,其余值相同。

例如

+-----------+-----------------+
| Input     |output           |
+-----------+-----------------+
| 0000-12.45|          -12.45 |
| 000012.45 |       000012.45 |
|    000$.00|          000$.00| 
|      0$   |            0$   |
|    0.     |            0.   |
|   51.46   |          51.46  | 
|   -123.67 |         -123.67 |
|  00012.45 |         00012.45| 
|  012.45   |         012.45  | 

下面的方法我试过了

spark.sql("""select regexp_replace("0000-12.45","^0+-(?!$)",'') as d,regexp_replace("000012.45","^0+-(?!$)",'') as d1,regexp_replace("0000.45","^0+-(?!$)",'') as d2,regexp_replace("0000$.00","^0+-(?!$)",'') as d3,regexp_replace("0.","^0+-(?!$)",'') as d4,regexp_replace("0$","^0+-(?!$)",'') as d5,regexp_replace("00","^0+-(?!$)",'') as d6,regexp_replace("51.46","^0+-(?!$)",'') as d7,regexp_replace("-12234.45","^0+-(?!$)",'') as d8, regexp_replace("0000-12234.45","^0+-(?!$)",'') as d9""").show()

+-----+---------+-------+--------+---+---+---+-----+---------+--------+
| d| d1| d2| d3| d4| d5| d6| d7| d8| d9|
+-----+---------+-------+--------+---+---+---+-----+---------+--------+
|12.45|000012.45|0000.45|0000$.00| 0.| 0$| 00|51.46|-12234.45|12234.45|
+-----+---------+-------+--------+---+---+---+-----+---------+--------+

您可以添加条件以仅在包含负号时删除零

(df
    .withColumn('ouput', F
        .when(F.col('input').contains('-'), F.regexp_replace('input', '^0+', ''))
        .otherwise(F.col('input'))
    )
    .show()
)

# +----------+---------+
# |     input|    ouput|
# +----------+---------+
# |0000-12.45|   -12.45|
# | 000012.45|000012.45|
# |   000$.00|  000$.00|
# |        0$|       0$|
# |        0.|       0.|
# |     51.46|    51.46|
# |   -123.67|  -123.67|
# |  00012.45| 00012.45|
# |    012.45|   012.45|
# +----------+---------+