Selecting the two most common attribute pairings from an Entity-Attribute Table?
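I have a simple entity-attribute table in my database. It describes whether an entity has a given attribute simply by the existence of a row containing (Entity, Attribute).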
I want to find out which Attribute pairs are the most common among all Entities that have two and only two Attributes.
For example, if my table looks like:
+--------+-----------+
| Entity | Attribute |
+--------+-----------+
| Bob | A |
| Sally | B |
| Terry | C |
| Bob | B |
| Sally | A |
| Terry | D |
| Larry | C |
+--------+-----------+
I would want it to return:
+-------------+-------------+-------+
| Attribute-1 | Attribute-2 | Count |
+-------------+-------------+-------+
| A | B | 2 |
| C | D | 1 |
+-------------+-------------+-------+
I currently have a short query that looks like this:
-- "table" stands in for the real table name
WITH TwoAtts AS (
SELECT entity
FROM table
GROUP BY entity
HAVING COUNT(att) = 2
)
SELECT t1.att, t2.att, COUNT(t1.entity)
FROM table t1
JOIN table t2
ON t1.entity = t2.entity
WHERE t1.entity IN (SELECT * FROM TwoAtts)
AND t1.att != t2.att
GROUP BY t1.att, t2.att
ORDER BY COUNT(t1.entity) DESC
But it only produces "duplicated" results such as:
+-------------+-------------+-------+
| Attribute-1 | Attribute-2 | Count |
+-------------+-------------+-------+
| A | B | 2 |
| B | A | 2 |
| D | C | 1 |
| C | D | 1 |
+-------------+-------------+-------+
In a sense, I would like to be able to run an unordered DISTINCT / set operator over the two attribute columns, but I am not sure how to implement that in SQL.
Hmm, I think you want two levels of aggregation, with some filtering:
select attribute_1, attribute_2, count(*)
from (select min(ea.attribute) as attribute_1, max(ea.attribute) as attribute_2
from entity_attribute ea
group by entity
having count(*) = 2
) aa
group by attribute_1, attribute_2;
Here is a db<>fiddle.
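To check the query against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (entity_attribute, entity, attribute) follow the answer's query; the in-memory SQLite database, the ORDER BY clause, the pair_count alias, and the run helper are assumptions added for illustration. The second query is not from the answer: it is a variant of the question's self-join in which `!=` is replaced by `<`, so each pair is produced in only one order.

```python
import sqlite3

# Sample rows copied from the question's table.
ROWS = [
    ("Bob", "A"), ("Sally", "B"), ("Terry", "C"), ("Bob", "B"),
    ("Sally", "A"), ("Terry", "D"), ("Larry", "C"),
]

# Two-level aggregation from the answer. The inner query keeps entities with
# exactly two rows and orders their attributes as (min, max); the outer query
# counts identical pairs. The ORDER BY is an addition for "most common first".
MIN_MAX_QUERY = """
select attribute_1, attribute_2, count(*) as pair_count
from (select min(ea.attribute) as attribute_1,
             max(ea.attribute) as attribute_2
      from entity_attribute ea
      group by ea.entity
      having count(*) = 2
     ) aa
group by attribute_1, attribute_2
order by pair_count desc, attribute_1, attribute_2
"""

# Assumed alternative: the question's self-join with "<" instead of "!=",
# which keeps (A, B) and drops the mirrored (B, A).
SELF_JOIN_QUERY = """
with two_atts as (
    select entity from entity_attribute group by entity having count(*) = 2
)
select t1.attribute, t2.attribute, count(*) as pair_count
from entity_attribute t1
join entity_attribute t2
  on t1.entity = t2.entity and t1.attribute < t2.attribute
where t1.entity in (select entity from two_atts)
group by t1.attribute, t2.attribute
order by pair_count desc, t1.attribute, t2.attribute
"""


def run(query):
    """Load the sample rows into a fresh in-memory database and run query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table entity_attribute (entity text, attribute text)")
        conn.executemany("insert into entity_attribute values (?, ?)", ROWS)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run(MIN_MAX_QUERY):
        print(row)
```

Both queries rely on each (entity, attribute) row appearing at most once. If the table can contain duplicate rows, `count(*) = 2` would also match an entity that has the same attribute twice, so `count(distinct attribute) = 2` may be the safer filter.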