Special LEFT JOIN

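I have the following SQL (Impala) pseudo-query. It is pseudo-code because it does not compile as written. The interesting part is the end, which reads exactly like what I want to do.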

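I want to do a LEFT JOIN, but if there is no matching ProductId, I want to fall back to one specific row: the one whose ProductId is NULL (assume there is only one, with LIMIT 1 guaranteeing it). That row should then be joined as if it were a regular match, so that the conditions in the CASE-WHEN expressions above work as intended.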

So the question is basically: is there a way to turn this syntactically invalid query into a valid one?

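I have tried different things, such as ISNULL() and WITH, but the subquery in the ELSE part needs two tables to work, so none of my attempts compile, even though I think the logic itself is sound.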

SELECT 
    cd.CycleDataId AS CycleDataId,
    CASE   
        WHEN cd.CycleTime >= op.IdealValue AND cd.CycleTime <= op.MaxValue THEN NVL(dcl.ProductionLossTypeId, -1)
        WHEN cd.CycleTime >= op.IdealValue AND cd.CycleTime >= op.MaxValue THEN dcl.ProductionLossTypeId
    END AS Verdikt,
    CASE   
        WHEN cd.CycleTime >= op.IdealValue AND cd.CycleTime <= op.MaxValue THEN NVL(dcl.Time, cd.CycleTime - op.IdealValue)
        WHEN cd.CycleTime >= op.IdealValue AND cd.CycleTime >= op.MaxValue THEN dcl.Time  
    END AS Time
FROM CycleData cd
LEFT JOIN DistributedCycleLosses dcl ON dcl.CycleDataId = cd.CycleDataId
CASE   
WHEN IF EXISTS(SELECT * FROM Operation_parameter WHERE ProductId = cd.ProductId AND cd.Timestamp_ BETWEEN ValidFrom AND ValidTo) THEN LEFT JOIN Operation_parameter op ON op.ProductId = cd.ProductId AND cd.Timestamp_ BETWEEN op.ValidFrom AND op.ValidTo
ELSE (SELECT * FROM Operation_parameter WHERE ProductId IS NULL AND cd.Timestamp_ BETWEEN ValidFrom AND ValidTo LIMIT 1) AS op
END;

Setting a default for a whole entity (an entire row) is a bit tricky, but for a list of fields it could be done like this:

SELECT 
    cd.CycleDataId AS CycleDataId,
    ISNULL(op.parameterName, 'default parameter value') AS parameterName,  -- parameterName is a placeholder column
    CASE   
        WHEN cd.CycleTime >= op.IdealValue AND cd.CycleTime <= op.MaxValue THEN NVL(dcl.ProductionLossTypeId, -1)
        WHEN cd.CycleTime >= op.IdealValue AND cd.CycleTime >= op.MaxValue THEN dcl.ProductionLossTypeId
    END AS Verdikt,
    CASE   
        WHEN cd.CycleTime >= op.IdealValue AND cd.CycleTime <= op.MaxValue THEN NVL(dcl.Time, cd.CycleTime - op.IdealValue)
        WHEN cd.CycleTime >= op.IdealValue AND cd.CycleTime >= op.MaxValue THEN dcl.Time  
    END AS Time
FROM CycleData cd
LEFT JOIN DistributedCycleLosses dcl ON dcl.CycleDataId = cd.CycleDataId
LEFT JOIN Operation_parameter op ON op.ProductId = cd.ProductId AND cd.Timestamp_ BETWEEN op.ValidFrom AND op.ValidTo

This is essentially using default values. I think this does what you want:

SELECT cd.CycleDataId AS CycleDataId,
        (CASE WHEN cd.CycleTime >= COALESCE(op.IdealValue, opnull.IdealValue) AND 
                   cd.CycleTime <= COALESCE(op.MaxValue, opnull.MaxValue) 
              THEN COALESCE(dcl.ProductionLossTypeId, -1)
              WHEN cd.CycleTime >= COALESCE(op.IdealValue, opnull.IdealValue) AND
                   cd.CycleTime >= COALESCE(op.MaxValue, opnull.MaxValue)
              THEN dcl.ProductionLossTypeId
         END) AS Verdikt,
        (CASE WHEN cd.CycleTime >= COALESCE(op.IdealValue, opnull.IdealValue) AND
                   cd.CycleTime <= COALESCE(op.MaxValue, opnull.MaxValue)
              THEN COALESCE(dcl.Time, cd.CycleTime - COALESCE(op.IdealValue, opnull.IdealValue))
              WHEN cd.CycleTime >= COALESCE(op.IdealValue, opnull.IdealValue) AND
                   cd.CycleTime >= COALESCE(op.MaxValue, opnull.MaxValue)
              THEN dcl.Time  
         END) AS Time
FROM CycleData cd LEFT JOIN 
     DistributedCycleLosses dcl
     ON dcl.CycleDataId = cd.CycleDataId LEFT JOIN
     Operation_parameter op
     ON op.ProductId = cd.ProductId AND
        cd.Timestamp_ BETWEEN op.ValidFrom AND op.ValidTo LEFT JOIN
     Operation_parameter opnull
     ON op.ProductId IS NULL AND  -- no previous match
        opnull.ProductId IS NULL AND
        cd.Timestamp_ BETWEEN opnull.ValidFrom AND opnull.ValidTo ;

Note that all references to op have been replaced with COALESCE() expressions.
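To see the fallback behave as described, here is a minimal sketch that runs the same double LEFT JOIN against an in-memory SQLite database from Python. SQLite only stands in for Impala here, and the sample rows and the reduced column set are assumptions made up for the illustration. Product 10 has its own parameter row, while product 20 has none and should pick up the row whose ProductId is NULL.

```python
import sqlite3

# SQLite stands in for Impala; all rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CycleData (CycleDataId INTEGER, ProductId INTEGER,
                        Timestamp_ INTEGER, CycleTime REAL);
CREATE TABLE Operation_parameter (ProductId INTEGER, ValidFrom INTEGER,
                                  ValidTo INTEGER, IdealValue REAL, MaxValue REAL);

INSERT INTO CycleData VALUES (1, 10, 5, 12.0);  -- product 10 has its own parameters
INSERT INTO CycleData VALUES (2, 20, 5, 12.0);  -- product 20 has none

INSERT INTO Operation_parameter VALUES (10,   0, 100, 10.0, 15.0);
INSERT INTO Operation_parameter VALUES (NULL, 0, 100,  8.0, 11.0);  -- default row
""")

rows = conn.execute("""
SELECT cd.CycleDataId,
       COALESCE(op.IdealValue, opnull.IdealValue) AS IdealValue,
       COALESCE(op.MaxValue,   opnull.MaxValue)   AS MaxValue
FROM CycleData cd
LEFT JOIN Operation_parameter op
       ON op.ProductId = cd.ProductId
      AND cd.Timestamp_ BETWEEN op.ValidFrom AND op.ValidTo
LEFT JOIN Operation_parameter opnull
       ON op.ProductId IS NULL           -- only when the first join found nothing
      AND opnull.ProductId IS NULL       -- pick the default row
      AND cd.Timestamp_ BETWEEN opnull.ValidFrom AND opnull.ValidTo
ORDER BY cd.CycleDataId
""").fetchall()

print(rows)
```

Cycle 1 keeps the product-specific values (10.0 and 15.0), and cycle 2 falls back to the default row (8.0 and 11.0), because the second join can only match when the first one produced no row.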

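If it is really necessary, you could modify this to handle multiple rows matching the NULL value. I think the more important part of the logic is the LEFT JOINs.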
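One possible way to handle several default rows is sketched below, again using SQLite from Python as a stand-in for Impala. It ranks the joined rows with ROW_NUMBER() and keeps one per cycle. The tie-break rule (prefer the default row with the latest ValidFrom), the sample data, and the assumption that at most one product-specific row matches each cycle are all choices made for this illustration, not something the question specifies. It also assumes a SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# SQLite stands in for Impala; all rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CycleData (CycleDataId INTEGER, ProductId INTEGER, Timestamp_ INTEGER);
CREATE TABLE Operation_parameter (ProductId INTEGER, ValidFrom INTEGER,
                                  ValidTo INTEGER, IdealValue REAL);

INSERT INTO CycleData VALUES (1, 20, 5);  -- no product-specific parameters
INSERT INTO CycleData VALUES (2, 10, 5);  -- has product-specific parameters

INSERT INTO Operation_parameter VALUES (10,   0, 100, 10.0);
INSERT INTO Operation_parameter VALUES (NULL, 0, 100,  8.0);  -- older default row
INSERT INTO Operation_parameter VALUES (NULL, 3, 100,  9.0);  -- newer default row
""")

rows = conn.execute("""
SELECT CycleDataId, IdealValue
FROM (
    SELECT cd.CycleDataId,
           COALESCE(op.IdealValue, opnull.IdealValue) AS IdealValue,
           -- Assumed tie-break: keep the default row with the latest ValidFrom.
           ROW_NUMBER() OVER (PARTITION BY cd.CycleDataId
                              ORDER BY opnull.ValidFrom DESC) AS rn
    FROM CycleData cd
    LEFT JOIN Operation_parameter op
           ON op.ProductId = cd.ProductId
          AND cd.Timestamp_ BETWEEN op.ValidFrom AND op.ValidTo
    LEFT JOIN Operation_parameter opnull
           ON op.ProductId IS NULL
          AND opnull.ProductId IS NULL
          AND cd.Timestamp_ BETWEEN opnull.ValidFrom AND opnull.ValidTo
) ranked
WHERE rn = 1
ORDER BY CycleDataId
""").fetchall()

print(rows)
```

Without the rn = 1 filter, cycle 1 would appear twice, once per default row. If the query also joins DistributedCycleLosses and that table can hold several rows per cycle, the PARTITION BY would need to include the loss row's key so that legitimate rows are not collapsed.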