Optimize MySQL JOIN with subquery

My current MySQL query is very slow, and I am looking for ways to optimize it.

I have a table of Steam game giveaways (12,711 records).

CREATE TABLE IF NOT EXISTS `giveaway` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `creatorid` bigint(20) unsigned NOT NULL,
  `created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `appid` int(10) DEFAULT NULL,
  `packageid` int(10) DEFAULT NULL,
  `gamename` varchar(256) NOT NULL,
  `gametype` varchar(64) NOT NULL,
  `copies` int(3) unsigned NOT NULL,
  `multiplier` float unsigned NOT NULL DEFAULT '1',
  `closed` timestamp NULL DEFAULT NULL,
  `winnerid` bigint(20) unsigned DEFAULT NULL,
  `cancelled` tinyint(1) unsigned DEFAULT NULL,
  `noentry` tinyint(1) unsigned DEFAULT NULL,
  `url` varchar(256) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `winnerid` (`winnerid`),
  KEY `creatorid` (`creatorid`),
  KEY `packageid` (`packageid`),
  KEY `appid` (`appid`),
  KEY `gametype` (`gametype`)
) ENGINE=InnoDB AUTO_INCREMENT=12003 DEFAULT CHARSET=utf8;

INSERT INTO `giveaway` (`id`, `creatorid`, `created`, `appid`, `packageid`, `gamename`, `gametype`, `copies`, `multiplier`, `closed`, `winnerid`, `cancelled`, `noentry`, `url`) VALUES
    (148, 76561198043198608, '2014-08-05 15:41:12', 72200, NULL, 'Universe Sandbox', 'bundle', 1, 1, '2014-08-05 15:41:29', 76561198051609534, NULL, NULL, 'http://www.steamgifts.com/giveaway/ZvyXL/universe-sandbox'),
    (149, 76561197993840952, '2014-08-05 15:41:49', 287860, NULL, '8-Bit Commando', 'bundle', 1, 1, '2014-08-12 18:54:15', 76561198001912161, NULL, NULL, 'http://www.steamgifts.com/giveaway/qlFrc/8-bit-commando'),
    (150, 76561198043198608, '2014-08-05 15:42:09', 212010, NULL, 'Galaxy on Fire 2 Full HD', 'bundle', 1, 1, '2014-08-05 15:43:17', 76561198031159289, NULL, NULL, 'http://www.steamgifts.com/giveaway/eyNT1/galaxy-on-fire-2-full-hd');

I also have a table of the Steam games that users already own (130,117 records).

CREATE TABLE IF NOT EXISTS `steam_ownedgames` (
  `steamid` bigint(20) unsigned NOT NULL,
  `appid` bigint(20) unsigned NOT NULL,
  `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  KEY `steamid` (`steamid`),
  KEY `appid` (`appid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `steam_ownedgames` (`steamid`, `appid`, `last_update`) VALUES
    (76561198112359563, 4000, '2015-05-20 18:03:10'),
    (76561198112359563, 2990, '2015-05-20 18:03:10'),
    (76561198112359563, 12200, '2015-05-20 18:03:10'),
    (76561198112359563, 12210, '2015-05-20 18:03:10'),
    (76561198112359563, 29800, '2015-05-20 18:03:10'),
    (76561198112359563, 9870, '2015-05-20 18:03:10'),
    (76561198112359563, 34900, '2015-05-20 18:03:10'),
    (76561198112359563, 11390, '2015-05-20 18:03:10'),
    (76561198112359563, 23490, '2015-05-20 18:03:10'),
    (76561198112359563, 18820, '2015-05-20 18:03:10'),
    (76561198112359563, 24960, '2015-05-20 18:03:10'),
    (76561198112359563, 43110, '2015-05-20 18:03:10'),
    (76561198112359563, 46410, '2015-05-20 18:03:10'),
    (76561198112359563, 50620, '2015-05-20 18:03:10'),
    (76561198112359563, 70300, '2015-05-20 18:03:10'),
    (76561198112359563, 63800, '2015-05-20 18:03:10');

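I want to display all giveaways and indicate whether the user already owns each game.

I can get this information by comparing the appid of each giveaway with the appid values in the current user's steam_ownedgames rows.

Example output: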

giveawayid  owned   appid   gamename
12810       12810   264060  Full Bore
12809       \N      263100  9.03m
12808       12808   107200  Space Pirates and Zombies
12807       12807   231910  Leisure Suit Larry in the Land of the Lounge Lizards: Reloaded
12806       \N      278620  TinyKeep
12805       \N      315430  Polarity
12804       12804   341060  The Lady

I get the information I want with the following query.

SELECT
    giveaway.id as giveawayid,
    owned.id as owned,
    appid,
    gamename
FROM giveaway 
LEFT JOIN
    ( /*
        * query - giveaways with an appid that is in the user's owned games list
        */ 
        SELECT giveaway.id as id
        FROM giveaway
        LEFT JOIN steam_ownedgames
            ON giveaway.appid = steam_ownedgames.appid
        WHERE steam_ownedgames.steamid = 76561197962290563
        GROUP BY giveaway.id
    ) as owned
    ON giveaway.id = owned.id 
ORDER BY created DESC, giveaway.id DESC 
LIMIT 0, 500 

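But it is very slow: it takes 27 seconds to complete, and it will only get slower as the database grows.

Does anyone have a better suggestion for optimizing this?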

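You don't need the subselect. It is a little unintuitive at first, but if you only want to join against a subset, putting the filter in the ON clause is the standard way to do it. The query below grabs all giveaway records and looks for any steam_ownedgames record for the same game (appid) that belongs to the specific user (steamid).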

SELECT giveaway.id as giveawayid
   , giveaway.appid, giveaway.gamename
   , sog.last_update AS ownedSince
FROM giveaway 
LEFT JOIN steam_ownedgames AS sog
        ON giveaway.appid = sog.appid
        AND sog.steamid = 76561197962290563
ORDER BY created DESC, giveaway.id DESC 
LIMIT 0, 500 
;
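
Whether the ON clause can be served from an index depends on which indexes exist. The table definition above only has separate single-column keys on steamid and appid. As a hedged sketch (not something tested against this data), a composite index over both join columns may let MySQL find the matching row in one lookup, and EXPLAIN shows which key is actually chosen. The index name idx_steamid_appid is a hypothetical name picked for illustration.

```sql
-- Hypothetical index name; covers both columns used in the ON clause.
ALTER TABLE steam_ownedgames
    ADD INDEX idx_steamid_appid (steamid, appid);

-- Inspect the execution plan: the `key` column of the sog row
-- shows which index (if any) the optimizer picked for the join.
EXPLAIN
SELECT giveaway.id AS giveawayid
   , giveaway.appid, giveaway.gamename
   , sog.last_update AS ownedSince
FROM giveaway
LEFT JOIN steam_ownedgames AS sog
        ON giveaway.appid = sog.appid
        AND sog.steamid = 76561197962290563
ORDER BY giveaway.created DESC, giveaway.id DESC
LIMIT 0, 500;
```

If the plan still shows a full scan of steam_ownedgames for each giveaway row, the index is not being used and the timing is unlikely to improve.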

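It is equivalent to the query below, but allows indexes to be used. The version below could actually be faster in some circumstances, but that depends heavily on the specific data in the tables. For example, if the user owns only 3 games, the unindexed pairing against that small derived table could be faster than the indexed pairing against a table holding thousands of users' owned games.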

SELECT giveaway.id as giveawayid
   , giveaway.appid, giveaway.gamename
   , sog.last_update AS ownedSince
FROM giveaway 
LEFT JOIN (
   SELECT * 
   FROM steam_ownedgames 
   WHERE steamid = 76561197962290563
  ) AS sog
  ON giveaway.appid = sog.appid
ORDER BY created DESC, giveaway.id DESC 
LIMIT 0, 500 
;

In this example, ownedSince will be NULL if the game is not owned; any field from steam_ownedgames would serve the same purpose (as long as it cannot normally be NULL).
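
To get the same owned column as in the question's example output (the giveaway id when the game is owned, NULL otherwise), one option is to derive it from whether the join found a match. This is a minimal sketch that relies on steamid being declared NOT NULL, as it is in the table definition above.

```sql
SELECT giveaway.id AS giveawayid
     -- steamid is NOT NULL in the table, so NULL here means "no matching row".
   , IF(sog.steamid IS NULL, NULL, giveaway.id) AS owned
   , giveaway.appid, giveaway.gamename
FROM giveaway
LEFT JOIN steam_ownedgames AS sog
        ON giveaway.appid = sog.appid
        AND sog.steamid = 76561197962290563
ORDER BY giveaway.created DESC, giveaway.id DESC
LIMIT 0, 500;
```

One caveat: steam_ownedgames has no primary or unique key on (steamid, appid), so if the same user and game ever appear twice, the giveaway would be listed twice. The GROUP BY in the original subquery hid that case.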