Unpivot multiple columns in Snowflake

Question

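I have a table that looks like this:

I need to unpivot the rating and comments columns as follows:

What is the best way to do this in Snowflake?

Note: some cells in the comments columns are NULL.

Adding details: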

create or replace table reviews(name varchar(50), acting_rating int, acting_comments text, comedy_rating int, comedy_comments text);

insert into reviews values
    ('abc', 4, NULL, 1, 'NO'),
    ('xyz', 3, 'some', 1, 'haha'),
    ('lmn', 1, 'what', 4, NULL);

select * from reviews;

select name, skill, skill_rating, comments
    from reviews
    unpivot(skill_rating for skill in (acting_rating,  comedy_rating)) 
    unpivot(comments for skill_comments in (acting_comments,comedy_comments)) 

--Following where clause is added to filter the irrelevant comments due to multiple unpivots

where substr(skill,1,position('_',skill)-1) = substr(skill_comments,1,position('_',skill_comments)-1) 
     order by name;

This produces the desired result, but when the data contains NULLs, the unpivoted rows whose comment is NULL are missing from the output:

NAME  SKILL          SKILL_RATING  COMMENTS
abc   COMEDY_RATING  1             NO
lmn   ACTING_RATING  1             what
xyz   ACTING_RATING  3             some
xyz   COMEDY_RATING  1             haha

Answer 1

If you only need to solve this for the table specified in the question, you can do it manually with a set of UNION ALLs:

select NAME
  , 'ACTING_RATING' as SKILL, ACTING_RATING as SKILL_RATING, ACTING_COMMENTS as SKILL_COMMENTS
from DATA
union all
select NAME
  , 'COMEDY_RATING', COMEDY_RATING, COMEDY_COMMENTS
from DATA
union all
select NAME
  , 'MUSICAL_PERFORMANCE_RATING', MUSICAL_PERFORMANCE_RATING, MUSICAL_PERFORMANCE_COMMENTS
from DATA
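
Applied to the reviews table from the question's setup (which has only the acting and comedy columns), the same pattern might look like the sketch below. Each branch selects the rating and its comment side by side, so rows whose comment is NULL should be kept. The ORDER BY is an addition here, only to make the output order stable.

```sql
-- Sketch: UNION ALL unpivot against the question's reviews table.
-- One branch per skill; NULL comments pass through unchanged.
select name
  , 'ACTING_RATING' as skill, acting_rating as skill_rating, acting_comments as skill_comments
from reviews
union all
select name
  , 'COMEDY_RATING', comedy_rating, comedy_comments
from reviews
order by name, skill;
```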

Answer 2

Here is a basic script that should give the desired output:

create or replace table reviews(name varchar(50), acting_rating int, acting_comments text, comedy_rating int, comedy_comments text);

insert into reviews values
    ('abc', 4, 'something', 1, 'NO'),
    ('xyz', 3, 'some', 1, 'haha'),
    ('lmn', 1, 'what', 4, 'hahaha');

select * from reviews;

select name, skill, skill_rating, comments
    from reviews
    unpivot(skill_rating for skill in (acting_rating,  comedy_rating)) 
    unpivot(comments for skill_comments in (acting_comments,comedy_comments)) 

--Following where clause is added to filter the irrelevant comments due to multiple unpivots

where substr(skill,1,position('_',skill)-1) = substr(skill_comments,1,position('_',skill_comments)-1) 
     order by name;

Answer 3

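I had the same problem. Here is my solution for unpivoting by two categories while keeping the NULLs:

First, replace the NULLs with some placeholder string, e.g. 'NULL'.

Then split the two unpivots into two separate CTEs, and create a common category column in each so they can be joined again later ('skill' in your case).

Finally, join the two CTEs on name and skill category, and replace the 'NULL' string with an actual NULL.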

create or replace table reviews(name varchar(50), acting_rating int, acting_comments text, comedy_rating int, comedy_comments text);

insert into reviews values
    ('abc', 4, NULL, 1, 'NO'),
    ('xyz', 3, 'some', 1, 'haha'),
    ('lmn', 1, 'what', 4, NULL);

  WITH base AS (SELECT name
                     , acting_rating
                     , IFNULL(acting_comments, 'NULL') AS acting_comments
                     , comedy_rating
                     , IFNULL(comedy_comments, 'NULL') AS comedy_comments
                  FROM reviews
               )
     , skill_rating AS (SELECT name
                             , REPLACE(skill, '_RATING', '') AS skill
                             , skill_rating
                          FROM base
                              UNPIVOT (skill_rating FOR skill IN (acting_rating, comedy_rating))
                       )
     , comments AS (SELECT name
                         , REPLACE(skill_comments, '_COMMENTS', '') AS skill
                         , comments
                      FROM base
                          UNPIVOT (comments FOR skill_comments IN (acting_comments,comedy_comments))
                   )

SELECT s.name
     , s.skill
     , s.skill_rating
     , NULLIF(c.comments, 'NULL') AS comments
  FROM skill_rating AS s
  JOIN comments AS c
       ON s.name = c.name
           AND s.skill = c.skill
 ORDER BY s.name, s.skill;

Result:

name  skill   skill_rating  comments
abc   ACTING  4             <null>
abc   COMEDY  1             NO
lmn   ACTING  1             what
lmn   COMEDY  4             <null>
xyz   ACTING  3             some
xyz   COMEDY  1             haha
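
As a possible alternative to the placeholder string, Snowflake's UNPIVOT documents an INCLUDE NULLS option. The sketch below assumes that option is available in your account and uses hypothetical column names rating_col and comment_col for the two name columns; it is an untested variation on the double unpivot from the question, not part of the original answer.

```sql
-- Sketch, assuming UNPIVOT ... INCLUDE NULLS is supported in your account.
-- rating_col and comment_col are illustrative names, not from the original posts.
select name
     , replace(rating_col, '_RATING', '') as skill
     , skill_rating
     , comments
  from reviews
      unpivot (skill_rating for rating_col in (acting_rating, comedy_rating))
      unpivot include nulls (comments for comment_col in (acting_comments, comedy_comments))
 -- chaining two unpivots pairs every rating with every comment;
 -- keep only the pairs that belong to the same skill
 where replace(rating_col, '_RATING', '') = replace(comment_col, '_COMMENTS', '')
 order by name, skill;
```

The ratings in the sample data are never NULL, so only the comments unpivot needs the option here.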

Answer 4

If the goal is to store the unpivoted result as a table, INSERT ALL can be used to unpivot multiple columns at once:

Setup:

create or replace table reviews(
     name varchar(50), acting_rating int,
     acting_comments text, comedy_rating int, comedy_comments text);

insert into reviews values
    ('abc', 4, NULL, 1, 'NO'),
    ('xyz', 3, 'some', 1, 'haha'),
    ('lmn', 1, 'what', 4, NULL);
    
select * from reviews;

Query:

CREATE OR REPLACE TABLE reviews_transposed(
    name VARCHAR(50)
    ,skill TEXT
    ,skill_rating INT
    ,skill_comments TEXT
);

INSERT ALL 
    INTO reviews_transposed(name, skill, skill_rating, skill_comments)
         VALUES (name, 'ACTING_RATING', acting_rating, acting_comments)
    INTO reviews_transposed(name, skill, skill_rating, skill_comments)
         VALUES (name, 'COMEDY_RATING', comedy_rating, comedy_comments)
SELECT *
FROM reviews;

SELECT *
FROM reviews_transposed;

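This approach has one significant advantage over the UNION ALL approach proposed by Felippe when saving into a table: the number of table scans, and therefore partition reads, grows with each UNION ALL branch, whereas INSERT ALL scans the source table only once.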

INSERT INTO reviews_transposed
select NAME
  , 'ACTING_RATING' as SKILL, ACTING_RATING as SKILL_RATING, ACTING_COMMENTS as SKILL_COMMENTS
from reviews
union all
select NAME
  , 'COMEDY_RATING', COMEDY_RATING, COMEDY_COMMENTS
from reviews;
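
To check the scan counts yourself, one option is to compare query plans. The sketch below assumes Snowflake's EXPLAIN command with text output. The exact plan shape may vary, but the UNION ALL form would be expected to show one scan of reviews per branch.

```sql
-- Sketch: inspect the plan of the UNION ALL select and count the scans of reviews.
explain using text
select name
  , 'ACTING_RATING' as skill, acting_rating as skill_rating, acting_comments as skill_comments
from reviews
union all
select name
  , 'COMEDY_RATING', comedy_rating, comedy_comments
from reviews;
```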
