How to grab the minimum date for a count while grouping by date?

Question

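(This builds on a question I asked earlier.) I have a table named users that holds user IDs, along with several tables such as cloud_storage_a, cloud_storage_b, and cloud_storage_c. If a user exists in cloud_storage_a, it means they have connected to cloud storage a. A user can also exist in more than one cloud storage. Here is an example: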

users

id      | address      | name      | created_at
--------------------------------------------------
123     | 23 Oak Ave   | Melissa   | 2014-05-12
333     | 18 Robson Rd | Steve     | 2015-01-20
421     | 95 Ottawa St | Helen     | 2015-02-10
555     | 12 Highland  | Amit      | 2015-05-17
192     | 39 Anchor Rd | Oliver    | 2015-08-25

cloud_storage_a

user_id | created_at
---------------------
 421    | 2015-03-05
 333    | 2015-02-01

cloud_storage_b

user_id | created_at
----------------------
 555    | 2015-07-20

cloud_storage_c

user_id | created_at
---------------------
 192    | 2015-08-26
 555    | 2015-08-01

I have a query to determine the number of users that joined any account from the time they signed up:

SELECT
    concat(extract(MONTH FROM u.created_at),
            '-',extract(YEAR FROM u.created_at)) AS "Month-Year",
    count(s1.user_id) AS "# of Users that Signed up on Any Cloud"
FROM (
        SELECT user_id, created_at FROM cloud_storage_a
        UNION
        SELECT user_id, created_at FROM cloud_storage_b
        UNION 
        SELECT user_id, created_at FROM cloud_storage_c
    ) AS s1
    INNER JOIN users u ON u.id = s1.user_id
GROUP BY
    1,
    EXTRACT(MONTH from u.created_at),
    EXTRACT(YEAR from u.created_at)
ORDER BY
    EXTRACT(YEAR from u.created_at),
    EXTRACT(MONTH from u.created_at);

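But as I understand it, this does not grab the minimum. (For example, user 555 joined cloud b in 07-2015 and cloud c in 08-2015.) I don't think my query currently counts only that minimum date. How would I accomplish this?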

Answer 1

I stopped reading after your table definition for cloud_storage_a.

Bad table design. What you defined as a table should be a row in a table. You should not continue with this data model.

Consider something like this:

create table cloudstorages (
  id serial not null primary key, 
     -- more attributes...
  info text);

create table user_storage (
  id      serial not null primary key,
  uid     integer references users(id),
  storage integer references cloudstorages(id)
  );

If that still doesn't work, come back with the new design.

Edit: Just saw your date format. Use to_char(); see the Data Type Formatting section of the PostgreSQL documentation.
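As a rough sketch of what the to_char() suggestion might look like, assuming users.created_at is a date or timestamp column (the "Users" alias is only illustrative and not from the question):

```sql
-- Format the signup month as text in one call instead of concat(extract(...)).
-- 'FMMM-YYYY' drops the leading zero of the month (5-2014);
-- 'MM-YYYY' would keep it (05-2014).
SELECT
    to_char(u.created_at, 'FMMM-YYYY') AS "Month-Year",
    count(*)                           AS "Users"
FROM users u
GROUP BY 1
ORDER BY min(u.created_at);  -- chronological order, not alphabetical
```

Ordering by min(u.created_at) avoids sorting on the formatted text, which would put 1-2015 ahead of 5-2014.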

Edit: You have to use these tables, okay... I would make year and month two separate columns in the result set, which should help with the grouping.
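A minimal sketch of the two-column idea, assuming the same users table; the aliases signup_year, signup_month, and user_count are hypothetical names chosen for illustration:

```sql
-- Keep year and month as separate numeric columns so that grouping
-- and ordering both work on the raw values rather than on a string.
SELECT
    extract(YEAR  FROM u.created_at) AS signup_year,
    extract(MONTH FROM u.created_at) AS signup_month,
    count(*)                         AS user_count
FROM users u
GROUP BY 1, 2
ORDER BY 1, 2;
```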

Answer 2

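OK, as I understand your desired output, you only need to add DISTINCT to your COUNT() function. I also think it is better to keep the logic (grouping by two columns, ordering the output) in a subquery and to format the output in the upper-level query. So finally: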

SELECT "Month" || '-' || "Year", "Count"
FROM (
    SELECT
        extract(MONTH from u.created_at) as "Month",
        extract(YEAR from u.created_at) as "Year",
        count(DISTINCT u.id) as "Count"
    FROM users u
        JOIN (  SELECT user_id, created_at FROM cloud_storage_a
                UNION
                SELECT user_id, created_at FROM cloud_storage_b
                UNION
                SELECT user_id, created_at FROM cloud_storage_c
        ) AS s1 ON  s1.user_id = u.id
                    AND u.created_at <= s1.created_at
    GROUP BY
        EXTRACT(MONTH from u.created_at),
        EXTRACT(YEAR from u.created_at)
    ORDER BY
        EXTRACT(YEAR from u.created_at),
        EXTRACT(MONTH from u.created_at)
) sub

I also added a date check to satisfy your requirement:

...joined any account from the time they signed up...
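To address the "minimum date" part of the question more directly, one possible sketch (not taken from either answer) first collapses the three tables to one row per user holding their earliest connection date, and then groups by signup month. The names all_storage, f, and first_connected_at are hypothetical, and grouping by signup month is an assumption carried over from the original query.

```sql
SELECT
    to_char(date_trunc('month', u.created_at), 'FMMM-YYYY') AS "Month-Year",
    count(*) AS "# of Users that Signed up on Any Cloud"
FROM users u
JOIN (
    -- One row per user: the earliest date they connected to any cloud storage.
    SELECT user_id, min(created_at) AS first_connected_at
    FROM (
        SELECT user_id, created_at FROM cloud_storage_a
        UNION ALL
        SELECT user_id, created_at FROM cloud_storage_b
        UNION ALL
        SELECT user_id, created_at FROM cloud_storage_c
    ) AS all_storage
    GROUP BY user_id
) AS f ON f.user_id = u.id
      -- Same date check as above, applied to the earliest connection only.
      AND f.first_connected_at >= u.created_at
GROUP BY date_trunc('month', u.created_at)
ORDER BY date_trunc('month', u.created_at);
```

Because the inner query returns a single row per user, user 555 is counted once, with 2015-07-20 as the first connection date. If the counts should instead be bucketed by the month of first connection, the same shape works with date_trunc('month', f.first_connected_at) in place of the signup month.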

postgresql

aggregate-functions

postgresql-9.3