Teradata conversion to Hive SQL using CASE WHEN

I am trying to convert the following Teradata SQL to Hive SQL, but I get an empty table.

TERADATA SQL:

 SELECT              
 TRIM(CAST(BOOK_ID AS BIGINT)) || '_' ||TRIM(CAST(REF_ID AS BIGINT)) 
 AS BOOK_REF,
 CASE WHEN (PHOTO_COUNT > 0) AND (INDEX(PICTURE_URL , ';')>0) THEN 
 SUBSTRING(PICTURE_URL FROM 1 FOR POSITION(';' IN PICTURE_URL)-1)
 ELSE PICTURE_URL
 END AS MAIN_IMAGE 
                     
 FROM GENERIC_BOOKS;

Hive:

 SELECT 
 CASE WHEN (PHOTO_COUNT > 0) AND (instr(A.PICTURE_URL, ';') > 0) THEN 
 SUBSTRING(A.PICTURE_URL, 1, FIND_IN_SET(';', A.PICTURE_URL))-1
    ELSE A.PICTURE_URL
    END AS ITEM_MAIN_IMAGE
 FROM GENERIC_BOOKS;

Example PICTURE_URL: https://booking.com/00/s/OTAwWDE2A=/z/wKEAAOSwfURc~gng/$_57.JPG?set_id=8800005007;https://booking.com/00/s/OTAwW2MDA=/z/LQcAAOSwrzxc~gni/$_57.JPG?set_id=8800005007;https://booking.com/00/s/OTAwW2MDA=/z/XAIAAOSw7J1c~gnl/$_57.JPG?set_id=8800005007;https://booking.com/00/s/OTAwW2MDA=/z/aA8AAOSwYT1c~gnv/$_57.JPG?set_id=8800005007

For this example, the expected MAIN_IMAGE is: https://booking.com/00/s/OTAwWDE2A=/z/wKEAAOSwfURc~gng/$_57.JPG?set_id=8800005007

Hive SQL supports neither the Standard SQL POSITION nor Teradata's INDEX (I don't know why the original query uses both; they are different syntax for the same result).

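Both can be replaced by LOCATE. FIND_IN_SET is not a substitute: it looks up a value in a comma-separated list, so in your query it returns 0 and the SUBSTRING comes back empty. The -1 also sits outside the SUBSTRING parentheses, so it is applied to the resulting string instead of to the length.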
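As a quick sanity check, the position functions can be tried on a literal before touching the real table. This is only a sketch: the value 'AAA;BBB' is made up for illustration, and it assumes a client such as Beeline that accepts an unescaped ';' inside a quoted string (some older Hive CLI versions reportedly split statements on it, which is presumably why '\;' appears elsewhere in this thread).

```sql
-- locate(substr, str) and instr(str, substr) take their arguments in opposite order
SELECT
  locate(';', 'AAA;BBB') AS pos_locate,  -- 4: 1-based position of the first ';'
  instr('AAA;BBB', ';')  AS pos_instr,   -- 4: same result, arguments swapped
  locate(';', 'AAA')     AS pos_missing; -- 0: no ';' in the string
```

Both functions are 1-based and return 0 when nothing is found, which is why the `> 0` test carries over from the Teradata query unchanged.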

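Also, the Standard SQL SUBSTRING(... FROM ... FOR ...) syntax is not implemented in Hive; use the comma-separated SUBSTR(string, start, length) form instead: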

CASE WHEN (PHOTO_COUNT >0) AND (locate(';', PICTURE_URL)>0) THEN 
 SUBSTR(PICTURE_URL, 1, locate(';', PICTURE_URL)-1)
 ELSE PICTURE_URL
 END AS MAIN_IMAGE
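Put together with the BOOK_REF column, the whole statement might look roughly like the sketch below. It assumes BOOK_ID and REF_ID hold numeric values (or numeric strings); the nested casts are my guess at how to mirror Teradata's TRIM(CAST(... AS BIGINT)), and a non-numeric value would make the cast, and therefore the whole concat, NULL.

```sql
-- Sketch of a Hive equivalent of the Teradata query, under the assumptions stated above
SELECT
  -- CAST to BIGINT, then back to STRING so concat() receives strings
  concat(cast(cast(BOOK_ID AS BIGINT) AS STRING), '_',
         cast(cast(REF_ID  AS BIGINT) AS STRING)) AS BOOK_REF,
  CASE
    WHEN PHOTO_COUNT > 0 AND locate(';', PICTURE_URL) > 0
      -- keep everything before the first ';'
      THEN substr(PICTURE_URL, 1, locate(';', PICTURE_URL) - 1)
    ELSE PICTURE_URL
  END AS MAIN_IMAGE
FROM GENERIC_BOOKS;
```

If the ID columns are already strings that need no normalization, the casts can be dropped and concat(trim(BOOK_ID), '_', trim(REF_ID)) used instead.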

The following query may be equivalent to the Teradata query you mentioned.

SELECT              
 concat(TRIM(BOOK_ID), '_', TRIM(REF_ID)) AS BOOK_REF,
 CASE WHEN (PHOTO_COUNT >0) AND (INSTR(PICTURE_URL,'\;')>0) THEN 
 -- element [0] of the split is everything before the first ';'
 split(PICTURE_URL,'\;')[0]
 ELSE PICTURE_URL
 END AS MAIN_IMAGE FROM GENERIC_BOOKS;

Tested with the following sample code:

create table if not exists generic_books(book_id string, ref_id string, photo_count string, picture_url string) row format delimited fields terminated by ',' stored as textfile;

insert into table generic_books values ("1","1","1","AAA\;BBB"); 

insert into table generic_books values ("1","1","CDB","AAA\;BBB");

insert into table generic_books values ("a","b","CDB","AAA\;BBB");

/*hive> select * from generic_books;
OK
1       1       CDB     AAA;BBB
a       b       CDB     AAA;BBB
1       1       1       AAA;BBB
*/

hive> SELECT
    >  concat(TRIM(BOOK_ID), '_', TRIM(REF_ID)) AS BOOK_REF,
    >  CASE WHEN (PHOTO_COUNT >0) AND (INSTR(PICTURE_URL,'\;')>0) THEN
    >  split(PICTURE_URL,'\;')[0]
    >  ELSE PICTURE_URL
    >  END AS MAIN_IMAGE FROM GENERIC_BOOKS;

/*OK
1_1     AAA;BBB
a_b     AAA;BBB
1_1     AAA
*/

Note: if you can share the Teradata input and output of the query in your question, I can update the answer accordingly.