Statistical mode across 3 columns
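I have a table of orders with ~70K entries, as follows:
For each customer, I want to determine the most common order and how certain that result is (sample size and probability).
This is what I have so far: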
CREATE VIEW CustomerOrderProbabaility as
SELECT Distinct(customerID)
customerID,
order,
COUNT(*) as sampleSize
FROM (Select customerID, order1 AS order FROM orderTable UNION
Select customerID, order2 AS order FROM orderTable UNION
Select customerID, order3 AS order FROM orderTable
)
GROUP BY customerID, order
ORDER BY customerID, COUNT(*) DESC;
I get a table of customerId and order, but sampleSize is always 1. What am I doing wrong?
I think you want UNION ALL, along with a few other changes:
CREATE VIEW CustomerOrderProbabaility as
SELECT DISTINCT ON (customerID)
       customerID,
       theorder,  -- the subquery column is theorder, not order
       COUNT(*) as sampleSize,
       SUM(COUNT(*)) OVER (PARTITION BY customerId) as totOrders
FROM (Select customerID, order1 AS theorder FROM orderTable UNION ALL
      Select customerID, order2 AS theorder FROM orderTable UNION ALL
      Select customerID, order3 AS theorder FROM orderTable
     ) co
GROUP BY customerID, theorder
ORDER BY customerID, COUNT(*) DESC;
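UNION removes duplicates. Each (customerID, order) pair therefore reaches the GROUP BY only once, which is why sampleSize is always 1.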
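To see the effect on a small case, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the real database. The three sample rows are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orderTable (customerID INTEGER, order1 TEXT, order2 TEXT, order3 TEXT)"
)
# Hypothetical sample data: customer 1 orders 'tea' four times in total.
conn.executemany(
    "INSERT INTO orderTable VALUES (?, ?, ?, ?)",
    [(1, "tea", "tea", "cake"), (1, "tea", "scone", "tea"), (2, "cake", "cake", "tea")],
)

QUERY = """
SELECT customerID, theorder, COUNT(*) AS sampleSize
FROM (SELECT customerID, order1 AS theorder FROM orderTable {op}
      SELECT customerID, order2 AS theorder FROM orderTable {op}
      SELECT customerID, order3 AS theorder FROM orderTable) co
GROUP BY customerID, theorder
ORDER BY customerID, sampleSize DESC, theorder
"""


def count_orders(op):
    """Run the grouped count with the given set operator (UNION or UNION ALL)."""
    return conn.execute(QUERY.format(op=op)).fetchall()


with_union = count_orders("UNION")          # duplicates collapsed before counting
with_union_all = count_orders("UNION ALL")  # every order row is kept

print(with_union)
print(with_union_all)
```

With UNION every count comes back as 1, while UNION ALL keeps each row so the counts reflect how often a customer actually placed that order.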
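Changes:
- Renamed order to theorder. order is a keyword; even where it is accepted as a column name, I don't think it's a good idea.
- UNION ALL instead of UNION, so duplicates are not removed.
- DISTINCT ON instead of DISTINCT, because that is what you intended.
- Added totOrders to count all orders for each customer.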
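The question also asks for a probability, which is sampleSize divided by totOrders. In the view above that could be one more column, for example COUNT(*)::numeric / SUM(COUNT(*)) OVER (PARTITION BY customerID). The sketch below checks the arithmetic on made-up rows using Python's sqlite3 module. Since DISTINCT ON is PostgreSQL syntax that SQLite lacks, it swaps in ROW_NUMBER() to pick the top row per customer. The sample data, the probability column name, and the alphabetical tie-break are assumptions of this sketch, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orderTable (customerID INTEGER, order1 TEXT, order2 TEXT, order3 TEXT)"
)
# Hypothetical sample data, not taken from the question.
conn.executemany(
    "INSERT INTO orderTable VALUES (?, ?, ?, ?)",
    [(1, "tea", "tea", "cake"), (1, "tea", "scone", "tea"), (2, "cake", "cake", "tea")],
)

MODE_QUERY = """
WITH counts AS (
    -- One row per (customer, order) with how often it occurs.
    SELECT customerID, theorder, COUNT(*) AS sampleSize
    FROM (SELECT customerID, order1 AS theorder FROM orderTable UNION ALL
          SELECT customerID, order2 AS theorder FROM orderTable UNION ALL
          SELECT customerID, order3 AS theorder FROM orderTable) co
    GROUP BY customerID, theorder
),
ranked AS (
    -- Total orders per customer, plus a rank that stands in for DISTINCT ON.
    SELECT customerID, theorder, sampleSize,
           SUM(sampleSize) OVER (PARTITION BY customerID) AS totOrders,
           ROW_NUMBER() OVER (
               PARTITION BY customerID ORDER BY sampleSize DESC, theorder
           ) AS rn
    FROM counts
)
SELECT customerID, theorder, sampleSize, totOrders,
       1.0 * sampleSize / totOrders AS probability
FROM ranked
WHERE rn = 1
ORDER BY customerID
"""

modes = conn.execute(MODE_QUERY).fetchall()
for customer_id, the_order, sample_size, tot_orders, probability in modes:
    print(customer_id, the_order, sample_size, tot_orders, round(probability, 3))
```

Window functions need SQLite 3.25 or later. If two orders tie for most common, this sketch keeps the alphabetically first one; the DISTINCT ON query leaves that choice unspecified unless another sort key is added to its ORDER BY.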