Query performance improvements - collecting user_activities in CakePHP 3.0

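I am currently working on a CakePHP 3.0 project to build a private social network, and I am struggling with the "timeline". I have a query that first fetches the list of all my friends in the community and then collects all the user_activities they have produced (see the queries below).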

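So in the end I have a list containing all of my friends' activities, ordered by created. In the FrontEnd I use a foreach to loop through it and render a separate view template for each "activity.type" (just text post, picture post, or video post).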

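With 50-100 friends this is no problem: the timeline needs < 1 sec. to display correctly. But the community has users with up to 5000 friends (I tested with about 400 friends and the query ran for 4-5 seconds).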

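Does anyone have an idea how I can improve the query, or what else I can do to improve performance (on the SQL or PHP side only)? I have already set indexes on some columns and have considered a date restriction, e.g. only collecting activities from the past year.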

I am not very experienced with this topic, so I would appreciate any ideas :)

-------- EDIT

Query to collect the friend list

SELECT  (CASE WHEN `Friends`.`user_id` = '6' THEN `FriendUsers`.`id` ELSE `Friends`.`user_id` END) AS `id`
    FROM  `friends` `Friends`
    LEFT JOIN  `users` `FriendUsers`  ON `FriendUsers`.`id` = (`Friends`.`friend_id`)
    WHERE  (`status` >= 0
              AND  (`Friends`.`user_id` = 6 OR `Friends`.`friend_id` = 6)
           )

Query to collect the user activities

SELECT  `Wallposts`.`text` AS `Wallposts__text`, `Activities`.`name` AS `Activities__name`,
        `Users`.`firstname` AS `Users__firstname`, `Users`.`lastname` AS `Users__lastname`,
        `Users`.`username` AS `Users__username`, `Users`.`id` AS `Users__id`,
        `UsersActivities`.`created` AS `UsersActivities__created`,
        `UsersActivities`.`id` AS `UsersActivities__id`, `UsersActivities`.`video_id` AS `UsersActivities__video_id`,
        `UsersActivities`.`picture_id` AS `UsersActivities__picture_id`,
        `UsersActivities`.`visible_for_group` AS `UsersActivities__visible_for_group`,
        `Pictures`.`id` AS `Pictures__id`, `Pictures`.`user_id` AS `Pictures__user_id`,
        `Pictures`.`description` AS `Pictures__description`, `Pictures`.`fileName` AS `Pictures__fileName`,
        `Pictures`.`album_id` AS `Pictures__album_id`, `Pictures`.`isChosen` AS `Pictures__isChosen`,
        `Albums`.`id` AS `Albums__id`, `Albums`.`name` AS `Albums__name`,
        `Videos`.`id` AS `Videos__id`, `Videos`.`hoster` AS `Videos__hoster`,
        `Videos`.`video_key` AS `Videos__video_key`, `Videos`.`name` AS `Videos__name`,
        `Videos`.`description` AS `Videos__description`
    FROM  `users_activities` `UsersActivities`
    left JOIN  `users` `Users`  ON UsersActivities.user_id = Users.id
    left JOIN  `activities` `Activities`  ON UsersActivities.activity_id = Activities.id
    left JOIN  `wallposts` `Wallposts`  ON UsersActivities.wallpost_id = Wallposts.id
    LEFT JOIN  `videos` `Videos`  ON `Videos`.`id` = (`UsersActivities`.`video_id`)
    LEFT JOIN  `pictures` `Pictures`  ON `Pictures`.`id` = (`UsersActivities`.`picture_id`)
    LEFT JOIN  `albums` `Albums`  ON `Albums`.`id` = (`Pictures`.`album_id`)
    WHERE  (`UsersActivities`.`member_id` = 0
              AND  `Users`.`active` not in ('2')
              AND  `UsersActivities`.`user_activity_id` = 0
              AND  (
                    (`UsersActivities`.`user_id` = 390900002
                              AND  `UsersActivities`.`activity_id` not in (5,6)
                    )
                      OR  (`UsersActivities`.`user_id` in 
                                (391407850,391511765,
                                        391511432,491511714,391512398,391204138,391407984,391000522,
                                        391408687,391511708,391305910,391511812,391511681,491512107,
                                        391408047,391408494, -- and hundreds more 
                                )
                              AND  `UsersActivities`.`visible_for_group` in (0,3)
                          )
                   )
           )
    ORDER BY  `UsersActivities`.`created` desc 

EXPLAIN in JSON format

{
  "query_block": {
    "select_id": 1,
    "ordering_operation": {
      "using_temporary_table": true,
      "using_filesort": true,
      "nested_loop": [
        {
          "table": {
            "table_name": "UsersActivities",
            "access_type": "range",
            "possible_keys": [
              "user_id",
              "activity_id",
              "member_id",
              "user_activity_id",
              "visible_for_group_idx"
            ],
            "key": "user_id",
            "used_key_parts": [
              "user_id"
            ],
            "key_length": "8",
            "rows": 25623,
            "filtered": 100,
            "index_condition": "((`project`.`usersactivities`.`user_id` = 390900002) or (`project`.`usersactivities`.`user_id` in ( *[HUNDREDS OF FRIEND IDS]* )))",
            "attached_condition": "((`project`.`usersactivities`.`user_activity_id` = 0) and (`project`.`usersactivities`.`member_id` = 0) and (((`project`.`usersactivities`.`user_id` = 390900002) and (`project`.`usersactivities`.`activity_id` not in (5,6))) or ((`project`.`usersactivities`.`user_id` in ( *[HUNDRED OF FRIEND IDS]* )) and (`project`.`usersactivities`.`visible_for_group` in (0,3)))))"
          }
        },
        {
          "table": {
            "table_name": "Users",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "id"
            ],
            "key_length": "8",
            "ref": [
              "project.UsersActivities.user_id"
            ],
            "rows": 1,
            "filtered": 100,
            "attached_condition": "(`project`.`users`.`active` <> '2')"
          }
        },
        {
          "table": {
            "table_name": "Activities",
            "access_type": "ALL",
            "possible_keys": [
              "PRIMARY"
            ],
            "rows": 6,
            "filtered": 83.333,
            "using_join_buffer": "Block Nested Loop",
            "attached_condition": "<if>(is_not_null_compl(Activities), (`project`.`activities`.`id` = `project`.`usersactivities`.`activity_id`), true)"
          }
        },
        {
          "table": {
            "table_name": "Wallposts",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "id"
            ],
            "key_length": "4",
            "ref": [
              "project.UsersActivities.wallpost_id"
            ],
            "rows": 1,
            "filtered": 100
          }
        },
        {
          "table": {
            "table_name": "Videos",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "id"
            ],
            "key_length": "4",
            "ref": [
              "project.UsersActivities.video_id"
            ],
            "rows": 1,
            "filtered": 100
          }
        },
        {
          "table": {
            "table_name": "Pictures",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "id"
            ],
            "key_length": "4",
            "ref": [
              "project.UsersActivities.picture_id"
            ],
            "rows": 1,
            "filtered": 100
          }
        },
        {
          "table": {
            "table_name": "Albums",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "id"
            ],
            "key_length": "4",
            "ref": [
              "project.Pictures.album_id"
            ],
            "rows": 1,
            "filtered": 100
          }
        }
      ]
    }
  }
}

In the FrontEnd I use a foreach to loop

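I stopped reading there. You should be able to write a single SQL query that locates all the friends, JOINs to their activities, and produces the complete activity list ordered by date.

Round trips to the database are costly. It is usually faster to gather, digest, sort, etc. all the needed data inside the database server, in a single query.

For further help, we need to see both the "find friends" query and the "gather activities" query so that we can help you put them together.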
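Going by the two queries in the EDIT, putting them together could look roughly like the sketch below. It is only an illustration: SQLite (via Python's sqlite3) stands in for MySQL so the example is self-contained, the tables are cut down to the columns the sketch needs, the sample rows are made up, and only the friends branch of the WHERE clause is covered (the user's own activities are left out). It also assumes each friendship is stored once; if both directions are stored, the derived table needs a DISTINCT.

```python
import sqlite3

# SQLite stands in for MySQL; table and column names follow the question,
# but the reduced schema and the sample rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE friends (user_id INTEGER, friend_id INTEGER, status INTEGER);
CREATE TABLE users_activities (
    id INTEGER PRIMARY KEY,
    user_id INTEGER,
    member_id INTEGER DEFAULT 0,
    user_activity_id INTEGER DEFAULT 0,
    visible_for_group INTEGER DEFAULT 0,
    created TEXT
);
INSERT INTO friends VALUES (6, 7, 1), (8, 6, 0), (6, 9, -1), (10, 11, 1);
INSERT INTO users_activities (id, user_id, visible_for_group, created) VALUES
    (1, 7, 0, '2015-01-01 10:00:00'),
    (2, 8, 3, '2015-01-02 10:00:00'),
    (3, 9, 0, '2015-01-03 10:00:00'),
    (4, 7, 1, '2015-01-04 10:00:00'),
    (5, 11, 0, '2015-01-05 10:00:00');
""")

# The derived table picks the other side of each friendship (same CASE idea as
# the friend-list query), then JOINs straight to the activities.
FRIEND_FEED_SQL = """
SELECT ua.id, ua.user_id, ua.created
FROM (
    SELECT CASE WHEN f.user_id = :me THEN f.friend_id ELSE f.user_id END AS id
    FROM friends f
    WHERE f.status >= 0 AND (f.user_id = :me OR f.friend_id = :me)
) AS fr
JOIN users_activities ua ON ua.user_id = fr.id
WHERE ua.member_id = 0
  AND ua.user_activity_id = 0
  AND ua.visible_for_group IN (0, 3)
ORDER BY ua.created DESC
"""


def friend_feed(conn, me):
    """Return (activity id, user id, created) rows for the friends of `me`, newest first."""
    return conn.execute(FRIEND_FEED_SQL, {"me": me}).fetchall()


for row in friend_feed(conn, 6):
    print(row)
```

The point is that the long list of friend IDs never travels to PHP and back; the server resolves it in the same statement. Whether MySQL picks a good plan for it is something to check with EXPLAIN on the real tables.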

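Some notes on the query you provided:

  • OR tends to prevent the use of an index, at least for that part of the WHERE clause.
  • NOT IN -- ditto.
  • Don't use LEFT unless you expect rows to be missing from the 'right' table. It makes it harder for us to understand what is going on, and to figure out which optimizations are or are not available.
  • Will a LIMIT be applied? It looks like this could produce a huge amount of data. Will there be pagination? Is CakePHP collecting a zillion rows and then slicing off a few? Bad news.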

Suggested index:

UsersActivities:  (user_activity_id, member_id, created)

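If you add that index, please provide the new EXPLAIN so we can compare it with the current one.

(It looks like most of the JOINs are on id, which I assume is the PRIMARY KEY of each respective table.)

Another possible improvement: currently you have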
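The before/after comparison can be rehearsed on a toy table. This is a sketch under assumptions: SQLite and its EXPLAIN QUERY PLAN stand in for MySQL and EXPLAIN FORMAT=JSON, the table is reduced to a few columns, the query is a simplified version of the one in the question, and the index name `idx_ua_feed` is made up. In MySQL the equivalent step would be an ALTER TABLE ... ADD INDEX on the same three columns.

```python
import sqlite3

# SQLite stands in for MySQL; the reduced schema is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE users_activities (
    id INTEGER PRIMARY KEY,
    user_id INTEGER,
    member_id INTEGER,
    user_activity_id INTEGER,
    created TEXT
)
""")

# Simplified stand-in for the timeline query: two equality filters plus a sort.
QUERY = """
SELECT id, user_id, created FROM users_activities
WHERE user_activity_id = 0 AND member_id = 0
ORDER BY created DESC
LIMIT 20
"""


def plan(conn, sql):
    """Return the human-readable detail column of SQLite's EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


before = plan(conn, QUERY)

# Hypothetical index name; the column order matches the suggestion above:
# the two equality columns first, then the ORDER BY column.
conn.execute(
    "CREATE INDEX idx_ua_feed"
    " ON users_activities (user_activity_id, member_id, created)"
)
after = plan(conn, QUERY)

print("before:", before)
print("after: ", after)
```

The reason for the column order is that the equality columns narrow the range first, and the trailing created column lets the server read rows already sorted instead of building a temporary table and filesorting. MySQL's optimizer may still choose differently on real data, which is why the new EXPLAIN is worth posting.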

SELECT ...
    `Activities`.`name` AS `Activities__name`,
    ...
  left JOIN  `activities` `Activities`
         ON UsersActivities.activity_id = Activities.id
    ...

That appears to be the only reference to Activities. It looks like a 1:many mapping, so there may be a gain in both speed and user-friendliness:

SELECT ...
    ( SELECT GROUP_CONCAT(DISTINCT `name`) FROM activities
        WHERE id = UsersActivities.activity_id
    )  AS `Activities__name`,
    ...
  -- And leave out left JOIN  `activities` 
    ...
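A quick way to convince yourself that the subquery form returns the same rows as the LEFT JOIN is to run both on a toy table. The sketch below assumes SQLite as a stand-in for MySQL, a reduced schema, and invented sample rows. Since the EXPLAIN lists PRIMARY as the key for Activities.id, the subquery matches at most one row, so GROUP_CONCAT just returns that single name (or NULL when nothing matches, like the LEFT JOIN does).

```python
import sqlite3

# SQLite stands in for MySQL; reduced schema and sample rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activities (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE users_activities (id INTEGER PRIMARY KEY, activity_id INTEGER);
INSERT INTO activities VALUES (1, 'text-post'), (2, 'picture-post');
INSERT INTO users_activities VALUES (10, 1), (11, 2), (12, 99);
""")

# Current shape: LEFT JOIN to pick up the activity name.
JOIN_SQL = """
SELECT ua.id, a.name AS activity_name
FROM users_activities ua
LEFT JOIN activities a ON ua.activity_id = a.id
ORDER BY ua.id
"""

# Suggested shape: a correlated scalar subquery instead of the JOIN.
SUBQUERY_SQL = """
SELECT ua.id,
       ( SELECT GROUP_CONCAT(DISTINCT name) FROM activities
           WHERE id = ua.activity_id
       ) AS activity_name
FROM users_activities ua
ORDER BY ua.id
"""

join_rows = conn.execute(JOIN_SQL).fetchall()
subquery_rows = conn.execute(SUBQUERY_SQL).fetchall()
print(join_rows == subquery_rows)
print(subquery_rows)
```

Row 12 points at an activity that does not exist, so both forms return NULL for its name. Whether the subquery is actually faster on the real tables is something the EXPLAIN and a timing run have to show.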