MySQL Pivot data with fixed number of columns

Below is my SELECT statement, which pivots my data nicely.

My data looks like this:

col_a | col_b | col_c | col_d   | Score
-------------------------------------
stuff | stuff | stuff | null    |  5
stuff | stuff | stuff | title_a |  3
stuff | stuff | stuff | title_x |  4

My current pivot statement looks like this:

SELECT `col_a`, `col_b`, `col_c`,
    MAX(CASE `col_d` WHEN 'title_a' THEN `col_d` end) AS 'Title',
    MAX(CASE `col_d` WHEN 'title_a' THEN `score` end) AS 'Score',
    MAX(CASE `col_d` WHEN 'title_x' THEN `col_d` end) AS 'Title',
    MAX(CASE `col_d` WHEN 'title_x' THEN `score` end) AS 'Score'
    .....

This gives me the following result:

col_a | col_b | col_c | Title   | Score | Title   | Score
---------------------------------------------------------
stuff | stuff | stuff | title_a |   3   | title_x |   4

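What I would like to do is check for more titles, but I only want four pivoted columns in the result. There will only ever be at most 2 rows that need to be pivoted onto the record above, but col_d could contain any title.

For example, I tried the following.

My data now looks like this: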

col_a | col_b | col_c | col_d    | Score
-------------------------------------
stuff | stuff | stuff | null     |  5
stuff | stuff | stuff | title_a  |  3
stuff | stuff | stuff | title_x  |  4
stuff | stuff | stuff | null     |  5
stuff | stuff | stuff | title_a  |  3
stuff | stuff | stuff | title_bx |  4

My pivot statement now looks like this:

SELECT `col_a`, `col_b`, `col_c`,
    MAX(CASE `col_d` WHEN 'title_a' THEN `col_d` end) AS 'Title',
    MAX(CASE `col_d` WHEN 'title_a' THEN `score` end) AS 'Score',
    MAX(CASE `col_d` WHEN 'title_x' THEN `col_d` end) AS 'Title',
    MAX(CASE `col_d` WHEN 'title_x' THEN `score` end) AS 'Score',
    MAX(CASE `col_d` WHEN 'title_bx' THEN `col_d` end) AS 'Second Title',
    MAX(CASE `col_d` WHEN 'title_bx' THEN `score` end) AS 'Score'
    .....

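As you can see, I tried checking for another title, but this just gives me six columns, two of which are empty: in this case the two rows contain title_a and title_bx, so the middle two columns are filled with null.

The output I would like from the data above is: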

col_a | col_b | col_c | Title   | Score | Title    | Score
---------------------------------------------------------
stuff | stuff | stuff | title_a |   3   | title_x  |   4
stuff | stuff | stuff | title_a |   3   | title_bx |   4

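So my question is: how can I check col_d for multiple possible titles and still end up with only 4 pivoted columns?

If I understand you correctly, you could do this: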

SELECT `col_a`, `col_b`, `col_c`,
MAX(CASE WHEN `col_d` IN('title_a','title_x','title_bx') THEN `col_d` end) AS 'Title',
MAX(CASE WHEN `col_d` IN('title_a','title_x','title_bx') THEN `score` end) AS 'Score'
...

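This is a bit messy because MySQL (before version 8.0) does not have windowing functions and you want very specific values in the first set of Title/Score columns. You can get the final result by using user variables to create a row number for the rows where col_d is not equal to title_a, and then joining that back to your table.

The syntax will be similar to the following: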

select a.col_a, a.col_b, a.col_c,
  max(case when a.col_d = 'title_a' then a.col_d end) title1,
  max(case when a.col_d = 'title_a' then a.score end) score1,
  max(case when na.col_d <> 'title_a' then na.col_d end) title2,
  max(case when na.col_d <> 'title_a' then na.score end) score2
from yourtable a
left join
(
  -- need to generate a row number value for the col_d rows
  -- that aren't equal to title_a
  select n.col_a, n.col_b, n.col_c, n.col_d,
    n.score,
    @num:=@num+1 rownum
  from yourtable n
  cross join
  (
    select @num:=0
  ) d
  where n.col_d <> 'title_a'
  order by  n.col_a, n.col_b, n.col_c, n.col_d
) na
  on a.col_a = na.col_a
  and a.col_b = na.col_b
  and a.col_c = na.col_c
  -- if there are more than 2 rows, only return 2
  and na.rownum <= 2  
where a.col_d = 'title_a'  
group by a.col_a, a.col_b, a.col_c, na.rownum;

See SQL Fiddle with Demo. This gives the result:

| COL_A | COL_B | COL_C |  TITLE1 | SCORE1 |   TITLE2 | SCORE2 |
|-------|-------|-------|---------|--------|----------|--------|
| stuff | stuff | stuff | title_a |      3 | title_bx |      4 |
| stuff | stuff | stuff | title_a |      3 |  title_x |      4 |
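For readers on MySQL 8.0 or later, which added window functions, the row numbering could be sketched with ROW_NUMBER() instead of user variables. This is only an illustration under a few assumptions: the table is still the placeholder `yourtable`, the row number restarts for each (col_a, col_b, col_c) group rather than running across the whole table, and the repeated title_a rows are collapsed to one per group so no DISTINCT is needed.

```sql
-- Sketch only: requires MySQL 8.0+ (window functions).
-- `yourtable` is the same placeholder table name used above.
select a.col_a, a.col_b, a.col_c,
  a.col_d as title1,
  a.score as score1,
  na.col_d as title2,
  na.score as score2
from
(
  -- collapse the title_a rows to one row per group
  select col_a, col_b, col_c, col_d, max(score) as score
  from yourtable
  where col_d = 'title_a'
  group by col_a, col_b, col_c, col_d
) a
left join
(
  -- number the non-title_a rows within each group
  select n.col_a, n.col_b, n.col_c, n.col_d, n.score,
    row_number() over (
      partition by n.col_a, n.col_b, n.col_c
      order by n.col_d
    ) as rownum
  from yourtable n
  where n.col_d <> 'title_a'
) na
  on a.col_a = na.col_a
  and a.col_b = na.col_b
  and a.col_c = na.col_c
  -- if there are more than 2 other rows, only keep 2
  and na.rownum <= 2;
```

Because the numbering is partitioned per group, the `rownum <= 2` cap applies to each group separately, which may or may not be what you want if the user-variable version was relied on to count across the whole table.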

It was pointed out to me that if you only have 2 other values, you can simply JOIN the data without using user variables:

select distinct a.col_a, a.col_b, a.col_c,
  a.col_d title1,
  a.score score1,
  na.col_d title2,
  na.score score2
from yourtable a
left join
(
  select n.col_a, n.col_b, n.col_c, n.col_d,
    n.score
  from yourtable n
  where n.col_d <> 'title_a'
) na
  on a.col_a = na.col_a
  and a.col_b = na.col_b
  and a.col_c = na.col_c
where a.col_d = 'title_a';

See SQL Fiddle with Demo. This gives the same result:

| COL_A | COL_B | COL_C |  TITLE1 | SCORE1 |   TITLE2 | SCORE2 |
|-------|-------|-------|---------|--------|----------|--------|
| stuff | stuff | stuff | title_a |      3 |  title_x |      4 |
| stuff | stuff | stuff | title_a |      3 | title_bx |      4 |

Depending on the data you actually have in col_a, col_b, and col_c, you may need to alter this, but it should give you the result you want.

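Update: Based on your comment that you don't know the values in the col_d column but only need the data split into two pivoted sets of columns, the process becomes more complicated because MySQL (before version 8.0) does not have windowing functions. If an NTILE function were available, this would be very easy. The NTILE function distributes rows into a specified number of groups. In this case, your data is split into two groups.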
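To make the NTILE behaviour concrete, here is a minimal sketch of what the split would look like where the function exists. It assumes MySQL 8.0 or later and the placeholder table `yourtable`; the ordering by col_d is an arbitrary choice made for the illustration.

```sql
-- Sketch only: requires MySQL 8.0+ (window functions).
-- NTILE(2) assigns each non-null row in a (col_a, col_b, col_c) group
-- to bucket 1 or bucket 2, following the ORDER BY inside OVER().
select col_a, col_b, col_c, col_d, score,
  ntile(2) over (
    partition by col_a, col_b, col_c
    order by col_d
  ) as splitgroup
from yourtable
where col_d is not null;
```

With four non-null rows in a group, the first two rows in col_d order receive splitgroup 1 and the last two receive splitgroup 2.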

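I modified the code from this blog by SO User, Quassnoi to replicate the NTILE function using user variables. The variables are used to create 2 things: a row number (used during the pivot) and the ntile value.

The modified code is: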

select 
  x.col_a,
  x.col_b,
  x.col_c,
  max(case when x.splitgroup = 1 then x.col_d end) as Title1,
  max(case when x.splitgroup = 1 then x.Score end) as Score1,
  max(case when x.splitgroup = 2 then x.col_d end) as Title2,
  max(case when x.splitgroup = 2 then x.Score end) as Score2
from
(
  select src.col_a, src.col_b, src.col_c, src.col_d, src.score,
    src.splitGroup,
    @row:=case when @prev=src.splitGroup then @row else 0 end +1 rownum,
    @prev:=src.splitGroup
  from
  (
    -- mimic NTILE function by splitting the total count of rows
    -- over the number of columns we want (2)
    select d.col_a, d.col_b, d.col_c, d.col_d, d.score, 
      FLOOR((@r * @n) / cnt) + 1 AS splitGroup
    from
    (
      select a.col_a, a.col_b, a.col_c, a.col_d, a.score, grp.cnt
      from yourtable a
      inner join 
      (
        select col_a, col_b, col_c, count(*) as cnt
        from yourtable
        where col_d is not null
        group by col_a, col_b, col_c
      ) grp
        on a.col_a = grp.col_a
        and a.col_b = grp.col_b
        and a.col_c = grp.col_c
      where a.col_d is not null
      order by a.col_a, a.col_b, a.col_c
    ) d
    cross join
    (
      -- @n is equal to the number of new pivoted columns we want
      select @n:=2, @group1:='N', @group2:='N', @group3:='N'
    ) v
    WHERE 
      CASE 
        WHEN @group1 <> col_a AND @group2<> col_b AND @group3 <> col_c 
          THEN @r := -1 
          ELSE 0 END IS NOT NULL
      AND (@r := @r + 1) IS NOT NULL
  ) src
  cross join
  (
    -- these vars are used to get the row number once the data is split
    -- this will be needed for the aggregate/group by on the final select
    select @row:=0, @prev:=1
  ) v2
  order by src.splitGroup
) x
group by x.col_a, x.col_b, x.col_c, x.rowNum;

See SQL Fiddle with Demo. This gives the result:

| COL_A | COL_B | COL_C |   TITLE1 | SCORE1 |   TITLE2 | SCORE2 |
|-------|-------|-------|----------|--------|----------|--------|
| stuff | stuff | stuff |  title_a |      3 | title_tt |      1 |
| stuff | stuff | stuff | title_bx |      0 | title_qq |      1 |
| stuff | stuff | stuff |  title_x |      4 |  title_a |      8 |
| stuff | stuff | stuff | title_yy |      3 |  title_h |      4 |
| stuff | stuff | stuff |  title_a |      2 |  title_o |      6 |
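As a point of comparison, on MySQL 8.0 or later the same two-step idea (split the rows into two groups, then number the rows inside each group for the GROUP BY) could be sketched with NTILE() and ROW_NUMBER() directly. This is an assumption-laden illustration rather than a drop-in replacement: it uses the placeholder table `yourtable`, and the `order by col_d, score` is a choice made here, so titles may be paired differently than in the output above.

```sql
-- Sketch only: requires MySQL 8.0+ (window functions).
select x.col_a, x.col_b, x.col_c,
  max(case when x.splitgroup = 1 then x.col_d end) as Title1,
  max(case when x.splitgroup = 1 then x.score end) as Score1,
  max(case when x.splitgroup = 2 then x.col_d end) as Title2,
  max(case when x.splitgroup = 2 then x.score end) as Score2
from
(
  -- number the rows inside each split group; rows sharing a rownum
  -- across the two groups end up on the same output row
  select g.col_a, g.col_b, g.col_c, g.col_d, g.score, g.splitgroup,
    row_number() over (
      partition by g.col_a, g.col_b, g.col_c, g.splitgroup
      order by g.col_d, g.score
    ) as rownum
  from
  (
    -- split each group's non-null rows into 2 buckets
    select col_a, col_b, col_c, col_d, score,
      ntile(2) over (
        partition by col_a, col_b, col_c
        order by col_d, score
      ) as splitgroup
    from yourtable
    where col_d is not null
  ) g
) x
group by x.col_a, x.col_b, x.col_c, x.rownum;
```

When a group has an odd number of non-null rows, NTILE puts the extra row in the first bucket, so the last output row for that group would have null in Title2 and Score2.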