Spark SQL array_contains on regex doesn't work

I have a DataFrame as follows:

val data = Seq(
    """{"Data": [{ "name": "FName", "value": "Alex" }, { "name": "LName",   "value": "Johnson"  }]}""",
    """{"Data": [{ "name": "FName", "value": "Alexis" }, { "name": "LName",   "value": "Paul"  }]}""",
    """{"Data": [{ "name": "FName", "value": "Alexander" }, { "name": "LName",   "value": "Strong"  }]}""",
    """{"Data": [{ "name": "FName", "value": "Baron" }, { "name": "LName",   "value": "Corbin"  }]}""",
)
val df = spark.read.json(spark.sparkContext.parallelize(data))
df.createOrReplaceTempView("df")

The schema is as follows:

 root
 |-- Data: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- name: string (nullable = true)
 |    |    |-- value: string (nullable = true)

The data in df looks like this:

Data
[{"name":"FName","value":"Alex"},{"name":"LName","value":"Johnson"}]
[{"name":"FName","value":"Alexis"},{"name":"LName","value":"Paul"}]
[{"name":"FName","value":"Alexander"},{"name":"LName","value":"Strong"}]
[{"name":"FName","value":"Baron"},{"name":"LName","value":"Corbin"}]

I need all records where FName starts with 'Alex'.

Expected output:

Data
[{"name":"FName","value":"Alex"},{"name":"LName","value":"Johnson"}]
[{"name":"FName","value":"Alexis"},{"name":"LName","value":"Paul"}]
[{"name":"FName","value":"Alexander"},{"name":"LName","value":"Strong"}]

Spark SQL query 1:

select * from df where array_contains (Data.value, "Al%")

Spark SQL query 2:

select * from df where array_contains (Data.value, "Al*")

Both of these queries return an empty result.

Spark SQL query 3:

select * from df where array_contains (Data.value, "Alex")

Result:

Data
[{"name":"FName","value":"Alex"},{"name":"LName","value":"Johnson"}]

How can I use LIKE or a regular expression with array_contains?

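array_contains only tests elements for exact equality, so "Al%" and "Al*" are compared as literal strings. Use the exists function instead, which takes a predicate: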

select * from df where exists(Data.value, x -> x like 'Al%')
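
Note that the query above matches any value in the array, so a last name starting with "Al" would also qualify. If only the FName entry should be checked, the predicate can test both struct fields. Below is a minimal sketch of that idea, once through the Scala DataFrame API and once through SQL with RLIKE. It assumes Spark 3.0 or later (where functions.exists is available in Scala) and a local SparkSession; the names matched and viaSql are only illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, exists}

// Assumption: a local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("exists-demo")
  .getOrCreate()
import spark.implicits._

// Same sample data as in the question.
val data = Seq(
  """{"Data": [{"name": "FName", "value": "Alex"}, {"name": "LName", "value": "Johnson"}]}""",
  """{"Data": [{"name": "FName", "value": "Alexis"}, {"name": "LName", "value": "Paul"}]}""",
  """{"Data": [{"name": "FName", "value": "Alexander"}, {"name": "LName", "value": "Strong"}]}""",
  """{"Data": [{"name": "FName", "value": "Baron"}, {"name": "LName", "value": "Corbin"}]}"""
)
val df = spark.read.json(data.toDS())
df.createOrReplaceTempView("df")

// DataFrame API: keep rows where some element is the FName entry
// and its value starts with "Alex".
val matched = df.filter(
  exists(col("Data"), x => x("name") === "FName" && x("value").startsWith("Alex"))
)

// SQL: the same condition, using a regular expression instead of LIKE.
val viaSql = spark.sql(
  """SELECT * FROM df
    |WHERE exists(Data, x -> x.name = 'FName' AND x.value RLIKE '^Alex')""".stripMargin
)

matched.show(truncate = false)
viaSql.show(truncate = false)
```

On the sample data both variants should keep the three rows for Alex, Alexis and Alexander and drop the Baron row.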