MySQL 5.6 - DENSE_RANK like functionality without Order By

I have a table like this:

+------+-----------+
|caseID|groupVarian|
+------+-----------+
|1     |A,B,C,D,E  |
+------+-----------+
|2     |A,B,N,O,P  |
+------+-----------+
|3     |A,B,N,O,P  |
+------+-----------+
|4     |A,B,C,D,F  |
+------+-----------+
|5     |A,B,C,D,E  |
+------+-----------+
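To reproduce the example, the sample data could be loaded with something like the script below. This is only a sketch: the table name t and the column types are assumptions, since the question shows just the column names and values.

```sql
-- Assumed table name and column types; adjust them to the real schema.
CREATE TABLE t (
    caseID      INT PRIMARY KEY,
    groupVarian VARCHAR(50) NOT NULL
);

-- The five sample rows shown above.
INSERT INTO t (caseID, groupVarian) VALUES
    (1, 'A,B,C,D,E'),
    (2, 'A,B,N,O,P'),
    (3, 'A,B,N,O,P'),
    (4, 'A,B,C,D,F'),
    (5, 'A,B,C,D,E');
```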

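I want a new column, nameVarian, such that rows with the same groupVarian value share the same rank, expressed as nameVarian (for example: v1, v2, and so on). However, the nameVarian value assigned to a particular groupVarian should follow the order of caseID (the order in which the values first appear in the table).

The output should look like this: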

+------+-----------+----------+
|caseID|groupVarian|nameVarian|
+------+-----------+----------+
|1     |A,B,C,D,E  |v1        |
+------+-----------+----------+
|2     |A,B,N,O,P  |v2        |
+------+-----------+----------+
|3     |A,B,N,O,P  |v2        |
+------+-----------+----------+
|4     |A,B,C,D,F  |v3        |
+------+-----------+----------+
|5     |A,B,C,D,E  |v1        |
+------+-----------+----------+

You can use DENSE_RANK (MySQL 8.0):

SELECT *, CONCAT('v', DENSE_RANK() OVER(ORDER BY groupVarian)) AS namevarian
FROM tab
ORDER BY CaseID;
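
Note that ordering the window by groupVarian ranks the groups alphabetically, so with the sample data A,B,C,D,F would get v2 and A,B,N,O,P would get v3. If the numbering has to follow first appearance instead, one possible MySQL 8.0 variation is to compute the smallest caseID per group first and rank on that. This is a sketch, not part of the original answer; first_id and the alias s are names introduced here for illustration.

```sql
-- Sketch for MySQL 8.0+: rank groups by the caseID at which each first appears.
SELECT s.caseID,
       s.groupVarian,
       CONCAT('v', DENSE_RANK() OVER (ORDER BY s.first_id)) AS nameVarian
FROM (
    -- first_id (illustrative name) is the smallest caseID within each groupVarian
    SELECT caseID,
           groupVarian,
           MIN(caseID) OVER (PARTITION BY groupVarian) AS first_id
    FROM tab
) AS s
ORDER BY s.caseID;
```

The derived table is needed because a window function cannot be nested directly inside another window function's ORDER BY. For the sample rows this should give v1, v2, v2, v3, v1.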


For MySQL versions < 8.0:

The problem statement looks as if it needs DENSE_RANK functionality over groupVarian; however, it does not:

You appear to want them enumerated by the order they appear in the data.

Assuming your table is named t (change the table and field names to match your code), here is an approach using session variables (for older versions of MySQL) that gives the desired result:

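SET @row_number = 0;
SELECT t3.caseID,
       t3.groupVarian,
       CONCAT('v', t2.num) AS nameVarian
FROM
  (
   -- Number each distinct groupVarian in order of first appearance
   SELECT
     (@row_number:=@row_number + 1) AS num,
     t1.groupVarian
   FROM
     (
      -- GROUP BY with MIN(caseID) makes the ordering deterministic;
      -- DISTINCT ... ORDER BY caseID sorts on a column that is not selected
      SELECT groupVarian
      FROM t
      GROUP BY groupVarian
      ORDER BY MIN(caseID) ASC
     ) AS t1
  ) AS t2
INNER JOIN t AS t3
  ON t3.groupVarian = t2.groupVarian
ORDER BY t3.caseID ASC;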

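Additionally: my earlier attempt emulated DENSE_RANK functionality, and it works well. The previous query could also be tweaked slightly to achieve DENSE_RANK behavior. However, the following query is more efficient, because it creates fewer derived tables and avoids the JOIN on groupVarian: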

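SET @row_number = 0;  -- start at 0 so the first group becomes v1
SET @group_varian = '';

SELECT inner_nest.caseID,
       inner_nest.groupVarian,
       CONCAT('v', inner_nest.num) AS nameVarian
FROM (
        -- Rows are visited in groupVarian order and the counter increases
        -- whenever the value changes, so groups are numbered by sorted value
        -- (as DENSE_RANK would), not by first appearance. This relies on the
        -- SELECT expressions being evaluated left to right, which MySQL does
        -- not guarantee.
        SELECT
            caseID,
            @row_number:=CASE
                           WHEN @group_varian = groupVarian THEN @row_number
                           ELSE @row_number + 1
                         END AS num,
            @group_varian:=groupVarian AS groupVarian
        FROM
            t
        ORDER BY groupVarian
     ) AS inner_nest
ORDER BY inner_nest.caseID ASC;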

Basically, you want to enumerate the variants. If you only want a number, then you can use the minimum id:

select t.*, g.min_caseID as groupVariantId
from t join
     (select groupVarian, min(caseID) as min_caseID
      from t
      group by groupVarian
     ) g
     on t.groupVarian = g.groupVarian;

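But that is not what you want. You appear to want them enumerated by the order they appear in the data. For that, you need variables. This is a bit tricky, but: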

select t.*, g.rn as groupVariantId
from t join
     (select g.*,
             -- assign and read both variables inside a single expression
             (@rn := if(@gv = groupVarian, @rn,
                        if(@gv := groupVarian, @rn + 1, @rn + 1)
                       )
             ) as rn
      from (select groupVarian, min(caseID) as min_caseID
            from t
            group by groupVarian
            order by min(caseID)
           ) g cross join
           (select @gv := '', @rn := 0) params
     ) g
     on t.groupVarian = g.groupVarian;

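Using variables is tricky. One important consideration: MySQL does not guarantee the order of evaluation of expressions in a SELECT. That means you should not assign a variable in one expression and then use it in another, because they could be evaluated in the wrong order (another answer has this bug).

Also, the ORDER BY needs to happen in a subquery. When the sort and the variable assignment share one query block, MySQL does not guarantee that the rows are sorted before the variables are assigned.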
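
Given these caveats, one way to sidestep variables entirely on MySQL 5.6 is to count, for each group, how many groups first appear at or before it. The following is a sketch under the assumption that the table is named t with columns caseID and groupVarian; first_id and the aliases g and f are names introduced here for illustration.

```sql
-- Variable-free sketch: a group's number is the count of groups whose first
-- caseID is less than or equal to its own first caseID.
SELECT t.caseID,
       t.groupVarian,
       CONCAT('v',
              (SELECT COUNT(*)
               FROM (SELECT MIN(caseID) AS first_id
                     FROM t
                     GROUP BY groupVarian) AS f
               WHERE f.first_id <= g.first_id)) AS nameVarian
FROM t
JOIN (SELECT groupVarian, MIN(caseID) AS first_id
      FROM t
      GROUP BY groupVarian) AS g
  ON g.groupVarian = t.groupVarian
ORDER BY t.caseID;
```

This does not depend on evaluation order, at the cost of re-running the counting subquery per row, so it is probably best suited to tables with a modest number of distinct groupVarian values.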