Oracle INSTR equivalent in Spark SQL

Question

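I am trying to replicate the Oracle INSTR function, but it looks like Spark does not accept all of the arguments that Oracle does. I want to apply this transformation to the "plataforma" field of the table, but the query below fails with an error: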

SELECT
SUBSTR(a.SOURCE, 0, INSTR(a.SOURCE, '-', 1, 2) - 1) AS plataforma,
COUNT(*) AS qtd
FROM db1.table AS a
LEFT JOIN db1.table2 AS b ON a.ID=b.id
GROUP BY SUBSTR(a.SOURCE, 0, INSTR(a.SOURCE, '-', 1, 2) - 1)
ORDER BY qtd

The Apache Spark 2.0 database encountered an error while running this query. Error running query: org.apache.spark.sql.AnalysisException: Invalid number of arguments for function instr. Expected: 2; Found: 4; line 8 pos 45

This is how I am transforming the field, but I don't know whether it is correct:

How can I replicate the same Oracle function in Spark? This is what I need:

Source:

apache-spark-sql
sql-server-dw

Result:

apache-spark
sql-server
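
To pin down what the Oracle expression in the query is meant to compute, here is a small pure-Python model of it. This is only a sketch under stated assumptions: the function names oracle_instr, oracle_substr and before_second_dash are hypothetical, and only positive position and occurrence arguments are modelled.

```python
def oracle_instr(s, sub, position=1, occurrence=1):
    """Model of Oracle INSTR: 1-based index of the occurrence-th match of
    sub found at or after position, or 0 when there is no such match."""
    if position < 1 or occurrence < 1:
        raise ValueError("this sketch only models positive arguments")
    start = position - 1          # convert to Python's 0-based indexing
    idx = -1
    for _ in range(occurrence):
        idx = s.find(sub, start)
        if idx == -1:
            return 0              # Oracle reports "not found" as 0
        start = idx + 1           # continue searching after this match
    return idx + 1                # convert back to a 1-based position


def oracle_substr(s, position, length):
    """Model of Oracle SUBSTR for position >= 0: position 0 is treated
    as 1, and a length below 1 yields NULL (None here)."""
    if position < 0:
        raise ValueError("this sketch only models non-negative positions")
    if length < 1:
        return None
    begin = max(position, 1) - 1
    return s[begin:begin + length]


def before_second_dash(source):
    # Same shape as SUBSTR(SOURCE, 0, INSTR(SOURCE, '-', 1, 2) - 1)
    return oracle_substr(source, 0, oracle_instr(source, "-", 1, 2) - 1)
```

With the two sample inputs above, before_second_dash returns the two expected results; an input with fewer than two dashes yields None, mirroring the NULL that Oracle would produce.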

Answer 1

The function you are looking for is substring_index:

substring_index('apache-spark-sql', '-', 2)

It returns the substring that precedes the 2nd occurrence of -.
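
As a rough illustration of that behavior, here is a pure-Python model of what substring_index computes. It is a sketch, not Spark's implementation: the Python function is a stand-in written for this example, and it assumes a simple non-overlapping delimiter such as '-'.

```python
def substring_index(s, delim, count):
    """Model of Spark SQL substring_index(str, delim, count).

    count > 0: everything to the left of the count-th delimiter,
               counting from the left.
    count < 0: everything to the right of the |count|-th delimiter,
               counting from the right.
    If the delimiter occurs fewer times than |count|, the whole
    string is returned.
    """
    if count == 0 or delim == "":
        return ""
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])
    return delim.join(parts[count:])
```

For example, substring_index('apache-spark-sql', '-', 2) evaluates to 'apache-spark' in this model, and a count of -1 gives the last segment, 'sql'.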

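I assume you want the substring that precedes the last occurrence of -. In that case, you can count the number of - characters in the input string and pass that count to substring_index, like this: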

substring_index(col, '-', size(split(col, '-')) - 1)

Here size(split(col, '-')) - 1 gives the number of occurrences of -.
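
To sanity-check that counting trick outside Spark, the sketch below mimics the expression in plain Python. The helper name before_last_delimiter is made up for this illustration, and Python's str.split stands in for Spark's split (which takes a regular expression; a bare '-' has no special meaning there, so the two agree for this delimiter).

```python
def before_last_delimiter(col, delim="-"):
    """Model of substring_index(col, '-', size(split(col, '-')) - 1)."""
    parts = col.split(delim)
    occurrences = len(parts) - 1      # size(split(col, '-')) - 1
    if occurrences == 0:
        return ""                     # a count of 0 yields an empty string
    # Keep everything to the left of the last delimiter.
    return delim.join(parts[:occurrences])


# The two sample values from the question.
for source in ["apache-spark-sql", "sql-server-dw"]:
    print(source, "->", before_last_delimiter(source))
```

Both sample inputs contain exactly two dashes, so for this data the simpler substring_index(a.SOURCE, '-', 2) gives the same result. Whichever form is used has to replace the SUBSTR(...) expression in both the SELECT list and the GROUP BY of the original query.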
