Select data between 2 defined rows

Question

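I have an H2 database. I want to calculate the average fuel usage from the data I have been given. The problem is that the data is really messy. It is the fuel consumption data for a single car.

Here is some sample data: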

| Amount   | Date       | Start (km) | End (km) |
+----------+------------+------------+----------+
| 35.5     | 2012-02-02 | 65000      | null     |
| 36.7     | 2012-02-15 | null       | 66520    |
| 44.5     | 2012-02-18 | null       | null     |
| 33.8     | 2012-02-22 | 67000      | null     |
| 44.5     | 2013-01-22 | null       | null     |

To calculate the average fuel usage, I first compute the difference between MIN(distance) and MAX(distance), for which I have the following query:

SELECT 
   CASEWHEN((MAX(start)-MAX(end))>0, MAX(start), MAX(end)) 
    - 
   IFNULL(MIN(start),0) 
FROM fuel;

For the next step I need SUM(Amount), but how do I make it sum only the rows between 67000 and 65000?

Any help is greatly appreciated.

Answer 1

I would approach it like this:

SELECT SUM([amount]) / SUM([end] - [start]) AS AverageFuelUsage
FROM [fuel]
WHERE [amount] IS NOT NULL
AND [start] IS NOT NULL
AND [end] IS NOT NULL

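Note: this excludes a lot of data (in your sample data, all of it) - but that is important.

If you don't know how much fuel was used on a trip, that doesn't mean no fuel was used, so defaulting to 0 is a bad idea; it is better to ignore the row and rely on complete data.
If you don't know the start or the end, you don't know the distance; again, you can't assume 0, so ignore this bad data.

If every record is missing at least one field, you could use the code below - but if even 1% of your records had complete data to work with, I would not resort to it.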
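
To make the effect of the strict filter concrete, here is a minimal sketch that runs the query above against the sample data. It uses Python's built-in sqlite3 module as a stand-in for the real database (an assumption for illustration; SQLite happens to accept the [bracketed] identifiers), and the extra "complete" row is hypothetical.

```python
import sqlite3

# SQLite is only a stand-in for the real database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fuel ([amount] REAL, [date] TEXT, [start] INTEGER, [end] INTEGER)"
)
conn.executemany(
    "INSERT INTO fuel VALUES (?, ?, ?, ?)",
    [
        (35.5, "2012-02-02", 65000, None),
        (36.7, "2012-02-15", None, 66520),
        (44.5, "2012-02-18", None, None),
        (33.8, "2012-02-22", 67000, None),
        (44.5, "2013-01-22", None, None),
    ],
)

QUERY = """
SELECT SUM([amount]) / SUM([end] - [start]) AS AverageFuelUsage
FROM fuel
WHERE [amount] IS NOT NULL
  AND [start] IS NOT NULL
  AND [end] IS NOT NULL
"""

# No sample row has both odometer readings, so the aggregates run over
# zero rows and the result is NULL (None in Python).
sample_result = conn.execute(QUERY).fetchone()[0]
print(sample_result)

# Hypothetical complete row: 40 litres used over 800 km.
conn.execute("INSERT INTO fuel VALUES (40.0, '2013-02-01', 68000, 68800)")
complete_result = conn.execute(QUERY).fetchone()[0]
print(complete_result)  # litres per km, based on the one complete row only
```

With only the sample data the query returns NULL rather than a misleading number, which is the point of ignoring incomplete rows.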

SELECT AVG([amount]) / ( AVG([end]) - AVG([start]) ) AS AverageFuelUsage
FROM [fuel]

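The idea here is that if we assume the data averages out over a large data set (i.e. most people travel similar distances, and the start and end readings also tend towards some average), we can work from the average of each column. I am not a statistician and would treat any result this gives with a good deal of scepticism, but if you only have bad data to work with and need a result, this may be the best you can get.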
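
As a rough illustration of why this estimate deserves scepticism, the sketch below runs the AVG-based query on the sample data, again using Python's sqlite3 module as a stand-in database (an assumption for illustration). Note that AVG skips NULLs column by column, so each average is taken over a different set of rows.

```python
import sqlite3

# SQLite is only a stand-in for the real database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fuel ([amount] REAL, [date] TEXT, [start] INTEGER, [end] INTEGER)"
)
conn.executemany(
    "INSERT INTO fuel VALUES (?, ?, ?, ?)",
    [
        (35.5, "2012-02-02", 65000, None),
        (36.7, "2012-02-15", None, 66520),
        (44.5, "2012-02-18", None, None),
        (33.8, "2012-02-22", 67000, None),
        (44.5, "2013-01-22", None, None),
    ],
)

# AVG ignores NULLs per column: [amount] averages five rows,
# [start] averages two rows and [end] averages a single row.
avg_amount, avg_start, avg_end, usage = conn.execute(
    """
    SELECT AVG([amount]),
           AVG([start]),
           AVG([end]),
           AVG([amount]) / (AVG([end]) - AVG([start])) AS AverageFuelUsage
    FROM fuel
    """
).fetchone()

print(avg_amount, avg_start, avg_end)
print(usage)  # litres per km according to this rough estimate
```

On this sample the "average journey" comes out at 520 km (66520 minus 66000), a figure that does not correspond to any real trip, so the resulting ratio should be read as a very loose estimate.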

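Update

Following the discussion in the comments: if you have recorded every journey and all readings are for the same car, you can find the first value with a [start] and the last value with an [end], calculate the total distance travelled across all those journeys, and then sum all the fuel used along the way.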

--ideally date is unique
--if not this tries to work out the sequence of journeys based on start/end odometer readings
--if they're both null and fall on the same day as the final [end] reading, assumes the null reading journey was prior to the [end] one
declare @fuel table ([amount] float, [date] date, [start] int, [end] int)
insert @fuel
  values ( 35.5     , '2012-02-02' , 65000      , null     )
        ,( 36.7     , '2012-02-15' , null       , 66520    )
        ,( 44.5     , '2012-02-18' , null       , null     )
        ,( 33.8     , '2012-02-22' , 67000      , null     )
        ,( 44.5     , '2013-01-22' , null       , null     )

select j1.[start]
, jn.[end]
, sum(f.[amount]) [amount]
, sum(f.[amount]) / (jn.[end] - j1.[start]) LitresPerKm
, (jn.[end] - j1.[start]) / sum(f.[amount])  kmsPerLitre

from
(
    select top 1 [amount], [date], [start], [end]
    from @fuel
    where [start] is not null
    order by [start]
) j1 --first journey
cross join
(   
    select top 1 [amount], [date], [start], [end]
    from @fuel
    where [end] is not null
    order by [end] desc
) jn --last journey
inner join @fuel f
on f.[date] >= j1.[date]
and (f.[end] >= j1.[start] or f.[end] is null) --in case multiple journeys on the same day: skip any that ended before our first start
and f.[date] <= jn.[date] 
and (f.[start] <= jn.[end] or f.[start] is null) --in case multiple journeys on the same day: skip any that started after our last end
group by j1.[start],jn.[end]
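
For readers without SQL Server to hand, the following is a simplified sketch of the same first-start / last-end idea using Python's sqlite3 module (an assumption for illustration, not the database from the question). It assumes dates are unique, so the same-day odometer checks are left out and rows are selected by date alone; TOP 1 becomes LIMIT 1.

```python
import sqlite3

# SQLite is only a stand-in for the real database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fuel ([amount] REAL, [date] TEXT, [start] INTEGER, [end] INTEGER)"
)
conn.executemany(
    "INSERT INTO fuel VALUES (?, ?, ?, ?)",
    [
        (35.5, "2012-02-02", 65000, None),
        (36.7, "2012-02-15", None, 66520),
        (44.5, "2012-02-18", None, None),
        (33.8, "2012-02-22", 67000, None),
        (44.5, "2013-01-22", None, None),
    ],
)

# j1 = journey with the lowest known start reading,
# jn = journey with the highest known end reading,
# f  = every fill-up dated between those two journeys (inclusive).
row = conn.execute(
    """
    SELECT j1.[start],
           jn.[end],
           SUM(f.[amount]) AS amount,
           SUM(f.[amount]) / (jn.[end] - j1.[start]) AS LitresPerKm
    FROM (SELECT [date], [start] FROM fuel
          WHERE [start] IS NOT NULL ORDER BY [start] LIMIT 1) j1
    CROSS JOIN
         (SELECT [date], [end] FROM fuel
          WHERE [end] IS NOT NULL ORDER BY [end] DESC LIMIT 1) jn
    INNER JOIN fuel f
            ON f.[date] BETWEEN j1.[date] AND jn.[date]
    GROUP BY j1.[start], jn.[end]
    """
).fetchone()

first_start, last_end, total_amount, litres_per_km = row
print(first_start, last_end, total_amount, litres_per_km)
```

On the sample data only the 2012-02-02 and 2012-02-15 rows fall inside the window, giving about 72.2 litres over 1520 km. The later rows are ignored because no end reading exists after them.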

sql

h2