Spark SQL equivalent for Split_Part()

I'm trying to get the equivalent of

split_part(split_part(to_id, '_', 1), '|', 3)

in Spark SQL.

Can anyone help?

SELECT
to_id
,split(to_id,'_')[1] AS marketplace_id
,from_id
,split(split(to_id, '_')[0], '|')[2] AS asin
--,split(to_id, '|')[2] AS asin
FROM DDD

Context:

to_id = ASIN|CATALOG|B0896YZABL_7 
expected = B0896YZABL 
current output = |
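The unexpected output comes from split treating its delimiter as a regular expression: a bare | is an empty alternation, so the string is split between every character instead of on the pipes. A quick JVM sketch of the difference (Java's String.split has the same regex semantics as Spark's split; the sample value is taken from the question):

```java
public class SplitEscapeDemo {
    public static void main(String[] args) {
        // to_id before its '_' suffix, as in the question
        String head = "ASIN|CATALOG|B0896YZABL";

        // Unescaped '|' is regex alternation of two empty patterns,
        // so every character becomes its own token.
        String[] bad = head.split("|");
        System.out.println(bad.length);   // one token per character

        // Escaping the pipe makes it a literal delimiter.
        String[] good = head.split("\\|");
        System.out.println(good.length);  // 3
        System.out.println(good[2]);      // B0896YZABL
    }
}
```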

You have to escape the pipe character |, because split treats its delimiter as a regular expression and a bare | is regex alternation. The escaped regex is \|, which you write as "\\|" in a Scala string literal (and likewise '\\|' in a Spark SQL string literal).

Here is a simple Scala example; you can try it in the interactive Scala shell:

val s = "ASIN|CATALOG|B0896YZABL_7"
val result = s.split("\\|") // Will be Array(ASIN, CATALOG, B0896YZABL_7)
print(result.last) // Prints 'B0896YZABL_7'

In your case, it should be:

SELECT
to_id
,split(to_id,'_')[1] AS marketplace_id
,from_id
,split(split(to_id, '_')[0], '\\|')[2] AS asin
--,split(to_id, '\\|')[2] AS asin
FROM DDD
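The full extraction chain in that query maps onto plain JVM string operations the same way; a small sketch (not Spark itself, with the to_id value taken from the question):

```java
public class AsinExtract {
    public static void main(String[] args) {
        String toId = "ASIN|CATALOG|B0896YZABL_7";

        // split(to_id, '_')[0] -> everything before the '_' suffix
        String head = toId.split("_")[0];

        // split(head, '\\|')[2] -> the third pipe-delimited field
        String asin = head.split("\\|")[2];

        // split(to_id, '_')[1] -> the '_' suffix
        String marketplaceId = toId.split("_")[1];

        System.out.println(asin);          // B0896YZABL
        System.out.println(marketplaceId); // 7
    }
}
```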