Issue with Inner Join
I am trying to join 3 tables together using inner joins, but the result shows more records than it should. My data tables are set up like this:
Table: gameday.atbats
GameName                        Inning  num  b  s  o  Batter  Pitcher  Result
-----------------------------------------------------------------------------------------
gid_2008_09_24_cinmlb_houmlb_1  1       1    2  3  1  457803  150116   Jay Bruce strikes out swinging.
gid_2008_09_24_cinmlb_houmlb_1  1       2    1  0  2  433898  150116   Jeff Keppinger lines out to right fielder Hunter Pence.
gid_2008_09_24_cinmlb_houmlb_1  1       3    3  1  2  458015  150116   Joey Votto singles on a line drive to right fielder Hunter Pence.
gid_2008_09_24_cinmlb_houmlb_1  1       4    2  3  3  429665  150116   Edwin Encarnacion called out on strikes.
gid_2008_09_24_cinmlb_houmlb_1  1       5    1  2  0  430565  459371   Kazuo Matsui singles on a line drive to right fielder Jay Bruce.
-----------------------------------------------------------------------------------------
Table: gameday.pitches
GameName                        GameAtBatID  Result
------------------------------------------------------
gid_2008_09_24_cinmlb_houmlb_1  1            Called Strike
gid_2008_09_24_cinmlb_houmlb_1  1            Ball
gid_2008_09_24_cinmlb_houmlb_1  1            Swinging Strike
gid_2008_09_24_cinmlb_houmlb_1  1            Ball
gid_2008_09_24_cinmlb_houmlb_1  1            Foul
gid_2008_09_24_cinmlb_houmlb_1  1            Foul
gid_2008_09_24_cinmlb_houmlb_1  1            Swinging Strike
gid_2008_09_24_cinmlb_houmlb_1  2            Ball
gid_2008_09_24_cinmlb_houmlb_1  2            In play, out(s)
gid_2008_09_24_cinmlb_houmlb_1  3            Called Strike
gid_2008_09_24_cinmlb_houmlb_1  3            Ball
--------------------------------------------------------
Table: gameday.batters
GameName                        id      name_display_first_last
----------------------------------------------------------------------------------
gid_2008_09_24_cinmlb_houmlb_1  407783  Geoff Geary
gid_2008_09_24_cinmlb_houmlb_1  209315  David Newhan
gid_2008_09_24_cinmlb_houmlb_1  115629  LaTroy Hawkins
gid_2008_09_24_cinmlb_houmlb_1  113889  Darin Erstad
gid_2008_09_24_cinmlb_houmlb_1  457803  Jay Bruce
gid_2008_09_24_cinmlb_houmlb_1  433898  Jeff Keppinger
gid_2008_09_24_cinmlb_houmlb_1  458015  Joey Votto
gid_2008_09_24_cinmlb_houmlb_1  429665  Edwin Encarnacion
---------------------------------------------------------------------------
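I am running what seems to be a fairly standard set of inner joins, joining each of the tables together to get an output that shows the pitches thrown to each batter over the entire game. My code is as follows: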
SELECT
    gameday.atbats.inning,
    gameday.batters.name_display_first_last,
    gameday.pitches.Result
FROM
    gameday.atbats
INNER JOIN
    gameday.pitches ON gameday.atbats.num = gameday.pitches.gameAtBatID
INNER JOIN
    gameday.batters ON gameday.atbats.batter = gameday.batters.ID
WHERE gameday.atbats.gamename = "gid_2008_09_24_cinmlb_houmlb_1"
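My problem is that when I run this query, batters get more results than they should. For example, batter Jay Bruce (num 1 in the atbats table) should have 7 pitches thrown to him in the first inning, but when I run the query it gives him 10 pitches. What am I doing wrong to get these results? Also, I know these field names are poorly chosen, but they were named by someone else and I have not had a chance to change them yet.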
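I'd bet that atbats.num and pitches.GameAtBatID are not meant to identify an at-bat globally; they only identify an at-bat within a given game. If so, at-bat 1 of this game is also being matched with the pitches of at-bat 1 from every other game in the pitches table, which would explain the extra rows. So in addition to restricting atbats.GameName to the desired game, you also need to specify pitches.GameName = atbats.GameName: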
SELECT gameday.atbats.inning,
       gameday.batters.name_display_first_last,
       gameday.pitches.Result
  FROM gameday.atbats
  JOIN gameday.pitches
    ON gameday.atbats.GameName = gameday.pitches.GameName
   AND gameday.atbats.num = gameday.pitches.GameAtBatID
  JOIN gameday.batters
    ON gameday.atbats.GameName = gameday.batters.GameName
   AND gameday.atbats.batter = gameday.batters.ID
 WHERE gameday.atbats.gamename = 'gid_2008_09_24_cinmlb_houmlb_1'
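(Note: I also included a similar AND for batters. The values of batters.ID are large enough that it looks like it might be a unique field, but including the condition makes sense for consistency. Since batters also has a GameName column, the same ID may well appear once per game, in which case the extra condition is needed as well.)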
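Both guesses above can be checked against the data before changing the query. The following is a minimal sketch that assumes only the table and column names shown in the question; the output column aliases are made up for illustration.

```sql
-- Check 1: at-bat numbers that occur in more than one game.
-- Any row returned means GameAtBatID alone does not identify an at-bat,
-- so a join on the number alone pulls in pitches from other games.
SELECT GameAtBatID,
       COUNT(DISTINCT GameName) AS games_using_this_number
  FROM gameday.pitches
 GROUP BY GameAtBatID
HAVING COUNT(DISTINCT GameName) > 1
 ORDER BY GameAtBatID;

-- Check 2: batter IDs that occur in more than one row of batters.
-- Any row returned means ID alone is not unique, so the join to batters
-- also needs the GameName condition.
SELECT ID,
       COUNT(*) AS rows_for_this_id
  FROM gameday.batters
 GROUP BY ID
HAVING COUNT(*) > 1;
```

If both queries come back empty, the extra GameName conditions are harmless but not the cause of the surplus rows, and the duplication would have to come from somewhere else.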
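That's right. Logically, SQL works through the joins from top to bottom, so when you join the first two tables with
    INNER JOIN
    gameday.pitches ON gameday.atbats.num = gameday.pitches.gameAtBatID
you get these results (for the sample rows shown in the question):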
GameName                        GameAtBatID  Result           Batter
--------------------------------------------------------------------------
gid_2008_09_24_cinmlb_houmlb_1  1            Called Strike    457803
gid_2008_09_24_cinmlb_houmlb_1  1            Ball             457803
gid_2008_09_24_cinmlb_houmlb_1  1            Swinging Strike  457803
gid_2008_09_24_cinmlb_houmlb_1  1            Ball             457803
gid_2008_09_24_cinmlb_houmlb_1  1            Foul             457803
gid_2008_09_24_cinmlb_houmlb_1  1            Foul             457803
gid_2008_09_24_cinmlb_houmlb_1  1            Swinging Strike  457803
gid_2008_09_24_cinmlb_houmlb_1  2            Ball             433898
gid_2008_09_24_cinmlb_houmlb_1  2            In play, out(s)  433898
gid_2008_09_24_cinmlb_houmlb_1  3            Called Strike    458015
gid_2008_09_24_cinmlb_houmlb_1  3            Ball             458015
Then, when you add the next join
    INNER JOIN
    gameday.batters ON gameday.atbats.batter = gameday.batters.ID
you get these results from the three tables:
name_display_first_last  GameAtBatID  Result           Batter
--------------------------------------------------------------------------
Jay Bruce                1            Called Strike    457803
Jay Bruce                1            Ball             457803
Jay Bruce                1            Swinging Strike  457803
Jay Bruce                1            Ball             457803
Jay Bruce                1            Foul             457803
Jay Bruce                1            Foul             457803
Jay Bruce                1            Swinging Strike  457803
Jeff Keppinger           2            Ball             433898
Jeff Keppinger           2            In play, out(s)  433898
Joey Votto               3            Called Strike    458015
Joey Votto               3            Ball             458015
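The walkthrough above only holds while the tables contain a single game. As a rough illustration of what changes once a second game is loaded, here is a self-contained sketch. The demo_atbats and demo_pitches tables, the game name gid_hypothetical_game_2, the batter ID 999999 and its three pitches are all invented for the example; only Jay Bruce's seven pitches come from the question.

```sql
-- Scratch tables (hypothetical names) holding just the columns the join needs.
CREATE TABLE demo_atbats  (GameName VARCHAR(40), num INT, Batter INT);
CREATE TABLE demo_pitches (GameName VARCHAR(40), GameAtBatID INT, Result VARCHAR(40));

-- At-bat 1 exists in the real game and in an invented second game.
INSERT INTO demo_atbats VALUES
  ('gid_2008_09_24_cinmlb_houmlb_1', 1, 457803),
  ('gid_hypothetical_game_2',        1, 999999);

-- Seven pitches from the question, plus three invented pitches for game 2.
INSERT INTO demo_pitches VALUES
  ('gid_2008_09_24_cinmlb_houmlb_1', 1, 'Called Strike'),
  ('gid_2008_09_24_cinmlb_houmlb_1', 1, 'Ball'),
  ('gid_2008_09_24_cinmlb_houmlb_1', 1, 'Swinging Strike'),
  ('gid_2008_09_24_cinmlb_houmlb_1', 1, 'Ball'),
  ('gid_2008_09_24_cinmlb_houmlb_1', 1, 'Foul'),
  ('gid_2008_09_24_cinmlb_houmlb_1', 1, 'Foul'),
  ('gid_2008_09_24_cinmlb_houmlb_1', 1, 'Swinging Strike'),
  ('gid_hypothetical_game_2',        1, 'Ball'),
  ('gid_hypothetical_game_2',        1, 'Foul'),
  ('gid_hypothetical_game_2',        1, 'In play, out(s)');

-- Join on the at-bat number only: pitches from both games match.
-- Returns 10.
SELECT COUNT(*) AS pitches_joined_on_number_only
  FROM demo_atbats a
  JOIN demo_pitches p
    ON a.num = p.GameAtBatID
 WHERE a.GameName = 'gid_2008_09_24_cinmlb_houmlb_1';

-- Join on game name and at-bat number: only this game's pitches match.
-- Returns 7.
SELECT COUNT(*) AS pitches_joined_on_game_and_number
  FROM demo_atbats a
  JOIN demo_pitches p
    ON a.GameName = p.GameName
   AND a.num = p.GameAtBatID
 WHERE a.GameName = 'gid_2008_09_24_cinmlb_houmlb_1';
```

The WHERE clause filters only atbats, so it does nothing to keep other games' rows in pitches out of the join. The invented data was sized so the first count lands on 10, the figure reported in the question, but the real surplus depends on what the other games in the table contain.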