Need help adding a column to one table using a function that does arithmetic on columns from two separate tables

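I am trying to add the column "wOBA" to the table "starting_pitcher_stats" in MySQL, using Sequel Pro. Below is the code for a function that performs arithmetic on nine variables from the "starting_pitcher_stats" table. Specifically, the function collects the values of several variables, applies a different weight (coefficient) to some of them (the numerator below), and divides that sum by a combination of further variables (the denominator). All of these variables reside in the "starting_pitcher_stats" table. The arithmetic is expressed by the following formula (the coefficients are the values multiplying each variable in the numerator):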

wOBA=(.69*walks_a + .72*HBP + .89*singles_a + 1.27*doubles_a + 1.62*triples_a+ 2.10*HR_a)/(at_bats+walks_a+SF+HBP)

Each weight varies by year. The weights for each year come from the table "GUTS".

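The first hurdle I am facing is getting the code for the function right. The second is the correct syntax for actually calling this function and populating the new column with the correctly weighted wOBA value for each "Starting_Pitcher", for each game of each year (season).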

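The function was created with the code below and is listed as function "wOBA" in my list of functions and procedures. However, the small wheel/knob next to the function name in Sequel Pro is greyed out for some reason. I won't know whether there are any errors until I find the right code to call it.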

Please ask for any further information I can provide to clarify anything.

Thanks in advance.

DELIMITER $$
    CREATE FUNCTION wOBA(wOBA DECIMAL(10,3))
    RETURNS DECIMAL(10,3)
    BEGIN
        DECLARE wOBA decimal(10,3);
        SET wOBA = 0;
        SELECT cast((SELECT SUM(weighted_BB) as wBB_sum 
            FROM (
                SELECT g.wBB*SUM(if(e.event_CD=14,1,0)) as weighted_BB 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as walks_a)  
            + (SELECT SUM(weighted_HBP) as wHBP_sum 
            FROM (
                SELECT g.wHBP*SUM(if(e.event_CD=16,1,0)) as weighted_HBP 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as HBP)     
            + (SELECT SUM(weighted_1B) as w1B_sum 
            FROM (
                SELECT g.w1B*SUM(if(e.event_CD=20,1,0)) as weighted_1B 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as singles_a)       
            + (SELECT SUM(weighted_2B) as w2B_sum 
            FROM ( 
                SELECT g.w2B*SUM(if(e.event_CD=21,1,0)) as weighted_2B 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as doubles_a)       
            + (SELECT SUM(weighted_3B) as w3B_sum 
            FROM (
                SELECT g.w3B*SUM(if(e.event_CD=22,1,0)) as weighted_3B 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as triples_a)       
            + (SELECT SUM(weighted_HR) as wHR_sum 
            FROM (
                SELECT g.wHR*SUM(if(e.event_CD=23,1,0)) as weighted_HR 
                FROM events e 
                INNER JOIN GUTS g 
                    ON substring(e.game_ID,4,4)=g.season 
                WHERE PIT_ID=Starting_Pitcher 
                GROUP BY g.season) 
            as HR_a) as decimal(10,3))
            /
            cast(SUM(if(e.ab_fl="T",1,0)) 
                + SUM(if(e.event_cd=14,1,0)) 
                + SUM(if(e.SF_fl="T",1,0)) 
                + SUM(if(e.event_cd=16,1,0)) as unsigned) INTO wOBA 
            FROM events e
            WHERE e.PIT_ID=Starting_Pitcher AND PIT_START_FL = "T"
            LIMIT 1;
        RETURN wOBA;
    END
    $$
    DELIMITER ;

Darwin, here are two screenshots of the events table. The first shows the structure and the second shows some of the contents (not all of it fits in the shot):


Here are screenshots of the structure and contents of the GUTS table.

Here is a screenshot of the events table structure, showing the fields used in the function (and their definitions):

Update:

UPDATE retrosheet.starting_pitcher_stats 
SET starting_pitcher_stats.wOBA =(SELECT
(
   (g.wBB * SUM(IF(e.event_cd = 14, 1, 0)))
   + (g.wHBP * SUM(IF(e.event_cd = 16, 1, 0)))
   + (g.w1B  * SUM(IF(e.event_cd = 20, 1, 0)))
   + (g.w2B  * SUM(IF(e.event_cd = 21, 1, 0)))
   + (g.w3B  * SUM(IF(e.event_cd = 22, 1, 0)))
   + (g.wHR  * SUM(IF(e.event_cd = 23, 1, 0)))
   )
   /
   (
     SUM(IF(e.ab_fl = 'T',   1, 0))
   + SUM(IF(e.event_cd = 14, 1, 0))
   + SUM(IF(e.sf_fl = 'T',   1, 0))
   + SUM(IF(e.event_cd = 16, 1, 0))
  ) AS wOBA
  FROM events AS e, GUTS AS g
  WHERE e.YEAR_ID = g.SEASON_ID
    AND e.PIT_START_FL= 'T'
    AND e.PIT_ID = Starting_Pitcher)

The query just keeps running. I will keep tweaking it.

Update: screenshot of the starting_pitcher_stats table

Update:

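OK, I am now trying to create the wOBA column as part of a new table that also contains columns for the other components of wOBA.

However, the query runs indefinitely. How can I shorten the run time?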

DROP TABLE IF EXISTS starting_pitcher_wOBA;
CREATE TABLE starting_pitcher_wOBA 
SELECT
  a.YEAR_ID,
  a.GAME_ID,
  a.PIT_ID,
  a.wBB,
  a.wHBP,
  a.w1B,
  a.w2B,
  a.w3B,
  a.wHR,
  a.u_walks_a,
  a.HBP,
  a.singles_a,
  a.doubles_a,
  a.triples_a,
  a.HR_a,
  a.at_bats,
  a.sacrifice_flies_a,
  a.wOBA
FROM
(
  SELECT
    g.YEAR_ID,
    h.GAME_ID,
    e.PIT_ID,
    g.wBB,
    g.wHBP,
    g.w1B,
    g.w2B,
    g.w3B,
    g.wHR,
    SUM(IF(e.event_cd = 14, 1, 0)) AS u_walks_a,
    SUM(IF(e.event_cd = 16, 1, 0)) AS HBP,
    SUM(IF(e.event_cd = 20, 1, 0)) AS singles_a,
    SUM(IF(e.event_cd = 21, 1, 0)) AS doubles_a,
    SUM(IF(e.event_cd = 22, 1, 0)) AS triples_a,
    SUM(IF(e.event_cd = 23, 1, 0)) AS HR_a,
    SUM(IF(e.ab_fl = 'T',   1, 0)) AS at_bats,
    SUM(IF(e.sf_fl = 'T',   1, 0)) AS sacrifice_flies_a,
    (
      (
        (g.wBB  * SUM(IF(e.event_cd = 14, 1, 0)))
      + (g.wHBP * SUM(IF(e.event_cd = 16, 1, 0)))
      + (g.w1B  * SUM(IF(e.event_cd = 20, 1, 0)))
      + (g.w2B  * SUM(IF(e.event_cd = 21, 1, 0)))
      + (g.w3B  * SUM(IF(e.event_cd = 22, 1, 0)))
      + (g.wHR  * SUM(IF(e.event_cd = 23, 1, 0)))
      )
      /
      (
        SUM(IF(e.ab_fl = 'T',   1, 0))
      + SUM(IF(e.event_cd = 14, 1, 0))
      + SUM(IF(e.sf_fl = 'T',   1, 0))
      + SUM(IF(e.event_cd = 16, 1, 0))
      )
    ) AS wOBA
  FROM events AS e, GUTS AS g, game AS h
  WHERE e.PIT_START_FL = 'T'
  GROUP BY g.YEAR_ID, h.GAME_ID, e.PIT_ID
) AS a
INNER JOIN GUTS AS g
  ON a.YEAR_ID = g.YEAR_ID
INNER JOIN game AS h
  ON a.GAME_ID = h.GAME_ID
INNER JOIN events AS e
  ON a.PIT_ID = e.PIT_ID

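We'll start by cleaning up the query. Wherever possible, you should perform the calculations across each row in a single query rather than stacking multiple vertical subqueries, because this saves the DBMS from making multiple passes over the same table.
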
SELECT
  (
   ( (g.wbb  * SUM(IF(e.event_cd = 14, 1, 0)))
   + (g.whbp * SUM(IF(e.event_cd = 16, 1, 0)))
   + (g.w1b  * SUM(IF(e.event_cd = 20, 1, 0)))
   + (g.w2b  * SUM(IF(e.event_cd = 21, 1, 0)))
   + (g.w3b  * SUM(IF(e.event_cd = 22, 1, 0)))
   + (g.whr  * SUM(IF(e.event_cd = 23, 1, 0)))
   )
   /
   (
     SUM(IF(e.ab_fl = 'T',   1, 0))
   + SUM(IF(e.event_cd = 14, 1, 0))
   + SUM(IF(e.sf_fl = 'T',   1, 0))
   + SUM(IF(e.event_cd = 16, 1, 0))
   )
  ) AS woba
  FROM events e, guts g
  WHERE e.year_id = g.season_id
    AND e.pit_start_fl = 'T'
    AND e.pit_id = starting_pitcher  -- the ID of the pitcher being looked up
  GROUP BY g.season_id;              -- same column as the join key above

Assuming I haven't dropped a comma somewhere, this will return one woba value per season for the specified starting pitcher.
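To fill the new column for every starter in every game, rather than look up one pitcher at a time, the same aggregation could be wrapped in a derived table and joined to the target table in a multi-table UPDATE. This is only a sketch: it assumes starting_pitcher_stats has a GAME_ID column and that its Starting_Pitcher column holds the same IDs as events.pit_id, neither of which I can confirm from the post. It joins events to guts on an explicit key, since a comma join without a join condition produces a Cartesian product, and it uses NULLIF so that an empty denominator yields NULL instead of a division-by-zero problem.

```sql
-- Sketch only. Assumed, not confirmed by the post: starting_pitcher_stats has
-- a GAME_ID column, and Starting_Pitcher holds the same IDs as events.pit_id.

-- Run once, and only if the wOBA column does not exist yet.
ALTER TABLE starting_pitcher_stats ADD COLUMN wOBA DECIMAL(10,3);

UPDATE starting_pitcher_stats AS s
INNER JOIN (
    -- One row per starter per game, computed in a single pass over events.
    SELECT
        e.game_id,
        e.pit_id,
        ( g.wbb  * SUM(IF(e.event_cd = 14, 1, 0))
        + g.whbp * SUM(IF(e.event_cd = 16, 1, 0))
        + g.w1b  * SUM(IF(e.event_cd = 20, 1, 0))
        + g.w2b  * SUM(IF(e.event_cd = 21, 1, 0))
        + g.w3b  * SUM(IF(e.event_cd = 22, 1, 0))
        + g.whr  * SUM(IF(e.event_cd = 23, 1, 0))
        )
        / NULLIF(
            SUM(IF(e.ab_fl = 'T',   1, 0))
          + SUM(IF(e.event_cd = 14, 1, 0))
          + SUM(IF(e.sf_fl = 'T',   1, 0))
          + SUM(IF(e.event_cd = 16, 1, 0)),
          0) AS woba
    FROM events AS e
    INNER JOIN guts AS g
            ON g.season_id = e.year_id
    WHERE e.pit_start_fl = 'T'
    -- The weights are listed here so the query also passes ONLY_FULL_GROUP_BY.
    GROUP BY e.game_id, e.pit_id,
             g.wbb, g.whbp, g.w1b, g.w2b, g.w3b, g.whr
) AS w
        ON  w.game_id = s.GAME_ID
        AND w.pit_id  = s.Starting_Pitcher
SET s.wOBA = w.woba;
```

Running the inner SELECT on its own first, with a LIMIT, is a cheap way to check the numbers before touching the table.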

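Note that I joined the tables on e.year_id rather than on SUBSTRING(e.game_ID,4,4); this avoids the overhead of calling SUBSTRING() on every record, and it leaves MySQL free to use an index on year_id where one exists. This kind of thing may seem trivial, but it adds up quickly on a large table.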
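If the query is still slow after that change, an index covering the filter and join columns may help. The following is a sketch under assumptions: the index name is made up, 'some_pitcher_id' is a placeholder value, and whether the optimizer actually uses the index depends on your data, so check with EXPLAIN.

```sql
-- Hypothetical index name; the columns are the ones used in the query above.
CREATE INDEX idx_events_pit_year ON events (pit_id, year_id);

-- EXPLAIN reports which index, if any, the optimizer picks for the lookup.
-- 'some_pitcher_id' is a placeholder, not a real ID from the data.
EXPLAIN
SELECT COUNT(*)
FROM events AS e
INNER JOIN guts AS g
        ON g.season_id = e.year_id
WHERE e.pit_id = 'some_pitcher_id';
```

A join written as SUBSTRING(e.game_ID,4,4) = g.season could not use a plain index on game_ID in the same way, because the column is wrapped in a function.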

This should be enough to get you started.