Correlated Subquery? pulling data from different columns, same table

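I'm trying to pull data from different columns using multiple conditions, but I can't figure out how. I believe a correlated subquery is what I need, and I've tried several different approaches without success.

I want to get the Miami Heat's averages in wins for the categories below, plus the New York Knicks' averages in losses for the same categories, and combine them into one average.

Here is my query for the Heat, which retrieves exactly what I want: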

SELECT
    box_score.team_name, 
    ROUND(AVG(eFG),3) eFG,
    ROUND(AVG(OPP_eFG),3) OPP_eFG,
    ROUND(AVG(TOV_PCT),3) TOV_PCT,
    ROUND(AVG(OPP_TOV_PCT),3) OPP_TOV_PCT,
    ROUND(AVG(ORB_PCT),3) ORB_PCT,
    ROUND(AVG(DRB_PCT),3) DRB_PCT,
    ROUND(AVG(FTA_RATE),3) FTA_RATE,
    ROUND(AVG(OPP_FTA_RATE),3) OPP_FTA_RATE
FROM box_score
WHERE team_name = 'Miami Heat' AND WIN_LOSS = 'W' AND game_date < '2019-03-07' 

I have the same query for the Knicks' losses, which also gives me the result I want:

WHERE team_name = 'New York Knicks' AND WIN_LOSS = 'L' AND game_date < '2019-03-07' 

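My problem is combining the two into a single query that gives me the averages for Heat wins and the averages for Knicks losses. All of this information comes from the same table, and I can identify a team either by ID number or by name. In case it makes a difference, I'm using SQLite.

Below are the results of running the queries. Each one returns a single row of averages, but I want the Heat's win averages and the Knicks' loss averages combined into one row.

Heat averages in wins: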

eFG    OPP_eFG  TOV_PCT  OPP_TOV_PCT  ORB_PCT  DRB_PCT  FTA_RATE  OPP_FTA_RATE
0.603  0.505    0.14     0.126        0.28     0.77     0.235     0.141

These are the Knicks' averages in losses:

eFG    OPP_eFG  TOV_PCT  OPP_TOV_PCT  ORB_PCT  DRB_PCT  FTA_RATE  OPP_FTA_RATE
0.568  0.602    0.146    0.136        0.225    0.787    0.222     0.235

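I want to combine the two into 1 average for each category.

But is there a way to average data pulled from separate columns?

In this case I'm interested in the Miami Heat, so I have the averages above, but what I want is to average each Heat stat with the Knicks' corresponding opposite stat (eFG should pair with the other team's OPP_eFG, and so on). So basically I'm looking for the following averages:

Heat eFG and Knicks OPP_eFG

Heat OPP_eFG and Knicks eFG

Heat TOV_PCT and Knicks OPP_TOV_PCT

Heat OPP_TOV_PCT and Knicks TOV_PCT

Heat FTA_RATE and Knicks OPP_FTA_RATE

Heat OPP_FTA_RATE and Knicks FTA_RATE

I'm still looking for 1 row as the result.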
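To make the target concrete, here is a small sketch that pairs the two result rows above by hand. It assumes "combine" means the plain mean of the Heat average and the Knicks' opposite-stat average. The dictionary names and the `opposite` helper are made up for illustration only.

```python
# Averages copied from the two result rows above (only the paired stats).
heat_wins = {"eFG": 0.603, "OPP_eFG": 0.505, "TOV_PCT": 0.14,
             "OPP_TOV_PCT": 0.126, "FTA_RATE": 0.235, "OPP_FTA_RATE": 0.141}
knicks_losses = {"eFG": 0.568, "OPP_eFG": 0.602, "TOV_PCT": 0.146,
                 "OPP_TOV_PCT": 0.136, "FTA_RATE": 0.222, "OPP_FTA_RATE": 0.235}


def opposite(stat):
    """Map a stat to its opponent counterpart, e.g. eFG <-> OPP_eFG."""
    return stat[4:] if stat.startswith("OPP_") else "OPP_" + stat


# Assumption: the combined value is the simple mean of the two averages.
combined = {stat: round((value + knicks_losses[opposite(stat)]) / 2, 4)
            for stat, value in heat_wins.items()}

for stat, value in combined.items():
    print(f"Heat {stat} + Knicks {opposite(stat)}: {value}")
```

Whatever query is used should reproduce these six numbers in one row, if this reading of "combine" is the intended one.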

One solution is to do the whole thing in a single table scan (no joins or subqueries), using conditional aggregation:

SELECT
    ROUND(AVG(CASE WHEN team_name = 'Miami Heat'      AND WIN_LOSS = 'W' THEN eFG          END),3) Heat_eFG,
    ROUND(AVG(CASE WHEN team_name = 'New York Knicks' AND WIN_LOSS = 'L' THEN eFG          END),3) Knicks_eFG,
    ROUND(AVG(CASE WHEN team_name = 'Miami Heat'      AND WIN_LOSS = 'W' THEN OPP_eFG      END),3) Heat_OPP_eFG,
    ROUND(AVG(CASE WHEN team_name = 'New York Knicks' AND WIN_LOSS = 'L' THEN OPP_eFG      END),3) Knicks_OPP_eFG,
    ROUND(AVG(CASE WHEN team_name = 'Miami Heat'      AND WIN_LOSS = 'W' THEN TOV_PCT      END),3) Heat_TOV_PCT,
    ROUND(AVG(CASE WHEN team_name = 'New York Knicks' AND WIN_LOSS = 'L' THEN TOV_PCT      END),3) Knicks_TOV_PCT,
    ROUND(AVG(CASE WHEN team_name = 'Miami Heat'      AND WIN_LOSS = 'W' THEN OPP_TOV_PCT  END),3) Heat_OPP_TOV_PCT,
    ROUND(AVG(CASE WHEN team_name = 'New York Knicks' AND WIN_LOSS = 'L' THEN OPP_TOV_PCT  END),3) Knicks_OPP_TOV_PCT,
    ROUND(AVG(CASE WHEN team_name = 'Miami Heat'      AND WIN_LOSS = 'W' THEN ORB_PCT      END),3) Heat_ORB_PCT,
    ROUND(AVG(CASE WHEN team_name = 'New York Knicks' AND WIN_LOSS = 'L' THEN ORB_PCT      END),3) Knicks_ORB_PCT,
    ROUND(AVG(CASE WHEN team_name = 'Miami Heat'      AND WIN_LOSS = 'W' THEN DRB_PCT      END),3) Heat_DRB_PCT,
    ROUND(AVG(CASE WHEN team_name = 'New York Knicks' AND WIN_LOSS = 'L' THEN DRB_PCT      END),3) Knicks_DRB_PCT,
    ROUND(AVG(CASE WHEN team_name = 'Miami Heat'      AND WIN_LOSS = 'W' THEN FTA_RATE     END),3) Heat_FTA_RATE,
    ROUND(AVG(CASE WHEN team_name = 'New York Knicks' AND WIN_LOSS = 'L' THEN FTA_RATE     END),3) Knicks_FTA_RATE,
    ROUND(AVG(CASE WHEN team_name = 'Miami Heat'      AND WIN_LOSS = 'W' THEN OPP_FTA_RATE END),3) Heat_OPP_FTA_RATE,
    ROUND(AVG(CASE WHEN team_name = 'New York Knicks' AND WIN_LOSS = 'L' THEN OPP_FTA_RATE END),3) Knicks_OPP_FTA_RATE
FROM box_score
WHERE team_name IN ('Miami Heat', 'New York Knicks') AND game_date < '2019-03-07' 
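Conditional aggregation works because a CASE expression without an ELSE branch yields NULL, and AVG skips NULLs. The sketch below runs a cut-down version of the query through Python's `sqlite3` module. The table is reduced to five columns and every row is invented for illustration.

```python
import sqlite3

# Hypothetical miniature box_score table; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE box_score "
    "(team_name TEXT, WIN_LOSS TEXT, game_date TEXT, eFG REAL, OPP_eFG REAL)"
)
conn.executemany(
    "INSERT INTO box_score VALUES (?, ?, ?, ?, ?)",
    [
        ("Miami Heat", "W", "2019-01-10", 0.60, 0.50),
        ("Miami Heat", "W", "2019-02-02", 0.62, 0.52),
        ("Miami Heat", "L", "2019-02-05", 0.45, 0.58),       # skipped: a loss
        ("New York Knicks", "L", "2019-01-11", 0.50, 0.60),
        ("New York Knicks", "W", "2019-01-20", 0.58, 0.49),  # skipped: a win
        ("New York Knicks", "L", "2019-03-10", 0.40, 0.70),  # skipped: after the cutoff
    ],
)

# Each CASE returns NULL for rows that do not match, and AVG ignores NULLs,
# so every output column averages only its own subset of rows.
row = conn.execute("""
    SELECT
        ROUND(AVG(CASE WHEN team_name = 'Miami Heat'      AND WIN_LOSS = 'W' THEN eFG     END), 3) AS Heat_eFG,
        ROUND(AVG(CASE WHEN team_name = 'New York Knicks' AND WIN_LOSS = 'L' THEN OPP_eFG END), 3) AS Knicks_OPP_eFG
    FROM box_score
    WHERE team_name IN ('Miami Heat', 'New York Knicks')
      AND game_date < '2019-03-07'
""").fetchone()

print(row)  # one row: Heat eFG in wins, Knicks OPP_eFG in losses
```

With this toy data the first column averages the two Heat wins and the second column comes from the single Knicks loss before the cutoff date.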

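If you want a combined average, here is another version of the query that puts, for example, Miami's eFG in wins and New York's OPP_eFG in losses into a single column. It still relies on conditional aggregation. I also simplified the logic slightly by moving the win/loss conditions into the WHERE clause.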

SELECT
    ROUND(AVG(CASE
        WHEN team_name = 'Miami Heat'      THEN eFG
        WHEN team_name = 'New York Knicks' THEN OPP_eFG
    END), 3) Heats_eFG_Knicks_OPP_eFG,
    ROUND(AVG(CASE
        WHEN team_name = 'Miami Heat'      THEN OPP_eFG
        WHEN team_name = 'New York Knicks' THEN eFG
    END), 3) Heats_OPP_eFG_Knicks_eFG,
    ROUND(AVG(CASE
        WHEN team_name = 'Miami Heat'      THEN TOV_PCT
        WHEN team_name = 'New York Knicks' THEN OPP_TOV_PCT
    END), 3) Heats_TOV_PCT_Knicks_OPP_TOV_PCT,
    ROUND(AVG(CASE
        WHEN team_name = 'Miami Heat'      THEN OPP_TOV_PCT
        WHEN team_name = 'New York Knicks' THEN TOV_PCT
    END), 3) Heats_OPP_TOV_PCT_Knicks_TOV_PCT,
    ROUND(AVG(CASE
        WHEN team_name = 'Miami Heat'      THEN FTA_RATE
        WHEN team_name = 'New York Knicks' THEN OPP_FTA_RATE
    END), 3) Heats_FTA_RATE_Knicks_OPP_FTA_RATE,
    ROUND(AVG(CASE
        WHEN team_name = 'Miami Heat'      THEN OPP_FTA_RATE
        WHEN team_name = 'New York Knicks' THEN FTA_RATE
    END), 3) Heats_OPP_FTA_RATE_Knicks_FTA_RATE
FROM box_score
WHERE 
    game_date < '2019-03-07' 
    AND (
           ( team_name = 'Miami Heat'      AND win_loss = 'W' )
        OR ( team_name = 'New York Knicks' AND win_loss = 'L') 
    )
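A point worth checking with this form is how the two teams are weighted: each column is one AVG over all matching rows of both teams, so every game counts equally and the team with more matching games has more influence. The sketch below shows that on invented data (two Heat wins, one Knicks loss), using a reduced, hypothetical version of the table.

```python
import sqlite3

# Hypothetical miniature box_score table; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE box_score "
    "(team_name TEXT, WIN_LOSS TEXT, game_date TEXT, eFG REAL, OPP_eFG REAL)"
)
conn.executemany(
    "INSERT INTO box_score VALUES (?, ?, ?, ?, ?)",
    [
        ("Miami Heat", "W", "2019-01-10", 0.60, 0.50),
        ("Miami Heat", "W", "2019-02-02", 0.62, 0.52),
        ("Miami Heat", "L", "2019-02-05", 0.45, 0.58),       # filtered out by WHERE
        ("New York Knicks", "L", "2019-01-11", 0.50, 0.60),
        ("New York Knicks", "W", "2019-01-20", 0.58, 0.49),  # filtered out by WHERE
    ],
)

# The CASE picks a different source column depending on the team, and a
# single AVG then pools the picked values from both teams.
row = conn.execute("""
    SELECT
        ROUND(AVG(CASE
            WHEN team_name = 'Miami Heat'      THEN eFG
            WHEN team_name = 'New York Knicks' THEN OPP_eFG
        END), 3) AS Heats_eFG_Knicks_OPP_eFG,
        ROUND(AVG(CASE
            WHEN team_name = 'Miami Heat'      THEN OPP_eFG
            WHEN team_name = 'New York Knicks' THEN eFG
        END), 3) AS Heats_OPP_eFG_Knicks_eFG
    FROM box_score
    WHERE game_date < '2019-03-07'
      AND (   (team_name = 'Miami Heat'      AND WIN_LOSS = 'W')
           OR (team_name = 'New York Knicks' AND WIN_LOSS = 'L'))
""").fetchone()

print(row)
```

Here the first column is the mean of three values (0.60, 0.62 and 0.60), not the mean of the Heat average and the Knicks average. If both teams should count equally regardless of how many games match, the averages have to be computed per team first.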

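Note: as wildpasser commented, you probably want single quotes rather than double quotes around literal values (this is the SQL standard). I changed all the double quotes in the original query to single quotes.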

This answer assumes you want AVG(heat) - AVG(knicks), as in the original post, rather than AVG(heatsX OR knicksY).

I'd like to promote Common Table Expressions for this:

WITH selector_heat as (
SELECT
    box_score.team_name, 
    ROUND(AVG(eFG),3) eFG,
    ROUND(AVG(OPP_eFG),3) OPP_eFG,
    ROUND(AVG(TOV_PCT),3) TOV_PCT,
    ROUND(AVG(OPP_TOV_PCT),3) OPP_TOV_PCT,
    ROUND(AVG(ORB_PCT),3) ORB_PCT,
    ROUND(AVG(DRB_PCT),3) DRB_PCT,
    ROUND(AVG(FTA_RATE),3) FTA_RATE,
    ROUND(AVG(OPP_FTA_RATE),3) OPP_FTA_RATE
FROM box_score
WHERE team_name = 'Miami Heat' AND WIN_LOSS = 'W' AND game_date < '2019-03-07' 
)
, selector_knicks as (
...
)
select H.eFG - K.OPP_eFG as magic_nbr
from selector_heat H 
join selector_knicks K ON (1=1)
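As a runnable illustration of the CTE pattern, the sketch below uses a reduced, hypothetical table and an assumed body for the second CTE that simply mirrors the Heat one with the Knicks filter. It returns both the difference used above and, as an alternative reading of the question, the mean of the two averages.

```python
import sqlite3

# Hypothetical miniature box_score table; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE box_score "
    "(team_name TEXT, WIN_LOSS TEXT, game_date TEXT, eFG REAL, OPP_eFG REAL)"
)
conn.executemany(
    "INSERT INTO box_score VALUES (?, ?, ?, ?, ?)",
    [
        ("Miami Heat", "W", "2019-01-10", 0.60, 0.50),
        ("Miami Heat", "W", "2019-02-02", 0.62, 0.52),
        ("New York Knicks", "L", "2019-01-11", 0.50, 0.60),
        ("New York Knicks", "L", "2019-02-01", 0.52, 0.64),
        ("New York Knicks", "W", "2019-01-20", 0.58, 0.49),  # not a loss, ignored
    ],
)

# Each CTE aggregates to exactly one row, so the CROSS JOIN yields one row.
# Assumption: selector_knicks mirrors selector_heat with the Knicks filter.
row = conn.execute("""
    WITH selector_heat AS (
        SELECT AVG(eFG) AS eFG, AVG(OPP_eFG) AS OPP_eFG
        FROM box_score
        WHERE team_name = 'Miami Heat' AND WIN_LOSS = 'W'
          AND game_date < '2019-03-07'
    ),
    selector_knicks AS (
        SELECT AVG(eFG) AS eFG, AVG(OPP_eFG) AS OPP_eFG
        FROM box_score
        WHERE team_name = 'New York Knicks' AND WIN_LOSS = 'L'
          AND game_date < '2019-03-07'
    )
    SELECT ROUND(H.eFG - K.OPP_eFG, 3)         AS magic_nbr,
           ROUND((H.eFG + K.OPP_eFG) / 2.0, 3) AS combined_eFG
    FROM selector_heat H
    CROSS JOIN selector_knicks K
""").fetchone()

print(row)
```

The sketch rounds only in the final SELECT, so the subtraction and the mean work on unrounded averages. Rounding inside the CTEs would also work but can shift the last digit.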

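For more details on the syntax, see https://www.sqlite.org/lang_with.html, but ignore the "recursive" bits for now; you don't need them in this case.

Alternatively (a slightly different angle of approach), you can use a window clause to aggregate "per team" and then work with the result. More information here: https://www.sqlite.org/windowfunctions.html#introduction_to_window_functions

Example: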

SELECT  
  team_name, 
  WIN_LOSS,
  ROUND(AVG(eFG) OVER (partition by team_name, win_loss),3) as eFG
  ...
  from box_score
  where game_date < '2019-03-07'

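This result set gives you the averages for every combination of team and win_loss. Wrap it in a CTE and join it to itself on whatever condition fits, for example: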

WITH cte as (SELECT ...)
SELECT H.eFG - K.OPP_eFG as magic_nbr
FROM cte H join cte K 
  ON (H.team_name = 'Miami Heat' 
  AND K.team_name = 'New York Knicks'
  AND H.win_loss = 'W'
  AND K.win_loss = 'L')
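One thing to watch with the window approach is that a window function keeps one output row per game, so the self-join repeats the same answer once per pair of matching games. The sketch below, on invented data, compares the row counts with and without DISTINCT in the CTE. Adding DISTINCT is my own suggestion for collapsing the duplicates, not part of the query above, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical miniature box_score table; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE box_score "
    "(team_name TEXT, WIN_LOSS TEXT, game_date TEXT, eFG REAL, OPP_eFG REAL)"
)
conn.executemany(
    "INSERT INTO box_score VALUES (?, ?, ?, ?, ?)",
    [
        ("Miami Heat", "W", "2019-01-10", 0.60, 0.50),
        ("Miami Heat", "W", "2019-02-02", 0.62, 0.52),
        ("Miami Heat", "L", "2019-02-05", 0.45, 0.58),
        ("New York Knicks", "L", "2019-01-11", 0.50, 0.60),
        ("New York Knicks", "L", "2019-02-01", 0.52, 0.64),
        ("New York Knicks", "W", "2019-01-20", 0.58, 0.49),
    ],
)

# {distinct} is filled with either "" or "DISTINCT" below.
QUERY = """
    WITH cte AS (
        SELECT {distinct}
            team_name,
            WIN_LOSS,
            AVG(eFG)     OVER (PARTITION BY team_name, WIN_LOSS) AS eFG,
            AVG(OPP_eFG) OVER (PARTITION BY team_name, WIN_LOSS) AS OPP_eFG
        FROM box_score
        WHERE game_date < '2019-03-07'
    )
    SELECT ROUND(H.eFG - K.OPP_eFG, 3) AS magic_nbr
    FROM cte H
    JOIN cte K
      ON  H.team_name = 'Miami Heat'      AND H.WIN_LOSS = 'W'
      AND K.team_name = 'New York Knicks' AND K.WIN_LOSS = 'L'
"""

raw_rows = conn.execute(QUERY.format(distinct="")).fetchall()
distinct_rows = conn.execute(QUERY.format(distinct="DISTINCT")).fetchall()

print(len(raw_rows), len(distinct_rows))  # row counts without / with DISTINCT
print(distinct_rows[0][0])                # the single magic_nbr value
```

With two Heat wins and two Knicks losses, the plain version returns the same value four times, while the DISTINCT version returns it once.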

If you want to compute the averages first and then average those, you can use two levels of aggregation:

SELECT ROUND(AVG(eFG), 3) as eFG,
       ROUND(AVG(OPP_eFG), 3) as OPP_eFG,
       ROUND(AVG(TOV_PCT), 3) as TOV_PCT,
       ROUND(AVG(OPP_TOV_PCT), 3) as OPP_TOV_PCT,
       ROUND(AVG(ORB_PCT), 3) as ORB_PCT,
       ROUND(AVG(DRB_PCT), 3) as DRB_PCT,
       ROUND(AVG(FTA_RATE), 3) as FTA_RATE,
       ROUND(AVG(OPP_FTA_RATE), 3) as OPP_FTA_RATE
FROM (SELECT bs.team_name, 
             AVG(eFG) as eFG,
             AVG(OPP_eFG) as OPP_eFG,
             AVG(TOV_PCT) as TOV_PCT,
             AVG(OPP_TOV_PCT) as OPP_TOV_PCT,
             AVG(ORB_PCT) as ORB_PCT,
             AVG(DRB_PCT) as DRB_PCT,
             AVG(FTA_RATE) as FTA_RATE,
             AVG(OPP_FTA_RATE) as OPP_FTA_RATE
      FROM box_score bs
      WHERE game_date < '2019-03-07' AND
            ( (team_name = 'Miami Heat' AND WIN_LOSS = 'W') OR
              (team_name = 'New York Knicks' AND WIN_LOSS = 'L')
            )
      GROUP BY bs.team_name
     ) bs
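The inner query has to produce one row per team for the outer AVG to be an average of averages. The sketch below, on invented data with three Heat wins and one Knicks loss, compares the two-level result with a single pooled AVG over the same rows. The table is a reduced, hypothetical version, and the `per_team` alias is mine.

```python
import sqlite3

# Hypothetical miniature box_score table; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE box_score "
    "(team_name TEXT, WIN_LOSS TEXT, game_date TEXT, eFG REAL)"
)
conn.executemany(
    "INSERT INTO box_score VALUES (?, ?, ?, ?)",
    [
        ("Miami Heat", "W", "2019-01-10", 0.60),
        ("Miami Heat", "W", "2019-02-02", 0.62),
        ("Miami Heat", "W", "2019-02-20", 0.64),
        ("New York Knicks", "L", "2019-01-11", 0.50),
    ],
)

FILTER = """
    game_date < '2019-03-07'
    AND (   (team_name = 'Miami Heat'      AND WIN_LOSS = 'W')
         OR (team_name = 'New York Knicks' AND WIN_LOSS = 'L'))
"""

# Two levels: one average per team first, then the average of those two rows.
avg_of_avgs = conn.execute(f"""
    SELECT ROUND(AVG(eFG), 3)
    FROM (SELECT team_name, AVG(eFG) AS eFG
          FROM box_score
          WHERE {FILTER}
          GROUP BY team_name) per_team
""").fetchone()[0]

# One level: every matching game counts equally.
pooled = conn.execute(f"""
    SELECT ROUND(AVG(eFG), 3) FROM box_score WHERE {FILTER}
""").fetchone()[0]

print(avg_of_avgs, pooled)
```

The two numbers differ whenever the teams have different numbers of matching games, so which one is right depends on whether each team or each game should carry equal weight.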