SQL - querying Entity-Attribute-Value (EAV) with missing attributes
My database is equivalent to the following table:
id | foo | bar
---+------+-----
1 | 5 | 6
2 | 7 | NULL
Unfortunately, it is implemented as Entity-Attribute-Value:
CREATE TABLE obj(id INTEGER NOT NULL PRIMARY KEY);
CREATE TABLE attrdef(id INTEGER NOT NULL PRIMARY KEY, name VARCHAR(4));
CREATE TABLE attr(obj_id INTEGER NOT NULL, attrdef_id INTEGER NOT NULL, value INTEGER NOT NULL);
INSERT INTO obj VALUES(1);
INSERT INTO obj VALUES(2);
INSERT INTO attrdef VALUES(3, 'foo');
INSERT INTO attrdef VALUES(4, 'bar');
INSERT INTO attr VALUES(1,3,5);
INSERT INTO attr VALUES(1,4,6);
INSERT INTO attr VALUES(2,3,7);
I need to query this database to get the data in "proper" form, as shown in the example table. I tried:
SELECT obj.id, foo.value, bar.value
FROM obj
LEFT JOIN attr foo ON (obj.id = foo.obj_id)
LEFT JOIN attrdef foo_def ON (foo.attrdef_id = foo_def.id)
LEFT JOIN attr bar ON (obj.id = bar.obj_id)
LEFT JOIN attrdef bar_def ON (bar.attrdef_id = bar_def.id)
WHERE foo_def.name = 'foo' AND bar_def.name = 'bar';
but the second row is missing:
id | foo | bar
---+------+-----
1 | 5 | 6
and
SELECT obj.id,
MAX(CASE WHEN name='foo' THEN value ELSE NULL END) foo,
MAX(CASE WHEN name='bar' THEN value ELSE NULL END) bar
FROM obj LEFT JOIN attr ON (obj.id = attr.obj_id)
LEFT JOIN attrdef ON (attr.attrdef_id = attrdef.id)
GROUP BY obj.id;
which gives the correct result:
id | foo | bar
---+------+-----
1 | 5 | 6
2 | 7 | NULL
but the performance of this query is unacceptable.
I would prefer a standard SQL query, but a good MySQL-specific solution would also be appreciated.
You just need to move the conditions into the ON clause:
SELECT obj.id, foo.value, bar.value
FROM obj LEFT JOIN
attr foo
ON obj.id = foo.obj_id LEFT JOIN
attrdef foo_def
ON foo.attrdef_id = foo_def.id AND foo_def.name = 'foo' LEFT JOIN
attr bar
ON obj.id = bar.obj_id LEFT JOIN
attrdef bar_def
ON bar.attrdef_id = bar_def.id AND bar_def.name = 'bar';
For the aggregation approach, I would go with:
SELECT obj.id,
MAX(CASE WHEN name = 'foo' THEN value END) foo,
MAX(CASE WHEN name = 'bar' THEN value END) bar
FROM obj LEFT JOIN
attr
ON obj.id = attr.obj_id LEFT JOIN
attrdef
ON attr.attrdef_id = attrdef.id
WHERE name IN ('foo', 'bar')
GROUP BY obj.id;
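The LEFT JOIN is probably not needed in this case (depending on the distribution of missing values). In any case, as you start looking at more and more attributes, the JOIN method will take longer and longer, while the GROUP BY method keeps roughly the same performance.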
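
Whichever form is used, the joins are likely to scan attr unless that table is indexed, and the schema in the question declares no key on it. A minimal sketch, assuming no such indexes exist yet (the question does not say); the index names are made up for illustration:

```sql
-- Hypothetical index names, not part of the original schema.
-- Lets each join look up an object's attribute rows by (obj_id, attrdef_id);
-- including value may let the query be answered from the index alone.
CREATE INDEX attr_obj_attrdef_value ON attr (obj_id, attrdef_id, value);

-- Lets the name filter on attrdef use an index lookup.
CREATE INDEX attrdef_name ON attrdef (name);
```

Running EXPLAIN on either query before and after creating the indexes is a way to check whether MySQL actually uses them.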
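Edit:
The first query above filters only the attrdef rows, so every attr row of an object still joins, whatever its attribute. The correct query is: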
SELECT obj.id, foo.value, bar.value
FROM obj LEFT JOIN
(attr foo JOIN
attrdef foo_def
ON foo.attrdef_id = foo_def.id AND foo_def.name = 'foo'
)
ON obj.id = foo.obj_id LEFT JOIN
(attr bar JOIN
attrdef bar_def
ON bar.attrdef_id = bar_def.id AND bar_def.name = 'bar'
)
ON obj.id = bar.obj_id ;
Here is the SQL Fiddle.
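
If the nested join syntax is awkward to work with, the same filter can be expressed directly on attr. This is only a sketch of an alternative, not the answerer's query, and it assumes that attrdef.name is unique (the schema does not enforce this; a duplicate name would make the scalar subquery fail) and that each object has at most one value per attribute:

```sql
-- Assumes attrdef.name is unique, so each subquery returns a single id.
SELECT obj.id, foo.value AS foo, bar.value AS bar
FROM obj
LEFT JOIN attr foo
       ON foo.obj_id = obj.id
      AND foo.attrdef_id = (SELECT id FROM attrdef WHERE name = 'foo')
LEFT JOIN attr bar
       ON bar.obj_id = obj.id
      AND bar.attrdef_id = (SELECT id FROM attrdef WHERE name = 'bar')
ORDER BY obj.id;
```

On the sample data this returns (1, 5, 6) and (2, 7, NULL), because both filters sit in the ON clauses and so never discard an obj row.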
When you do this in the WHERE clause:
AND bar_def.name = 'bar';
you turn the left join on bar_def into an inner join. The same goes for the condition you put on foo_def.
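
To see this on the sample data, one can drop the WHERE clause from the question's first query and select the attribute names instead of the values. A sketch using only the tables from the question; the column aliases foo_name and bar_name are made up here:

```sql
-- Same joins as the question's first query, without the WHERE filter.
SELECT obj.id,
       foo_def.name AS foo_name,  -- hypothetical alias
       bar_def.name AS bar_name   -- hypothetical alias
FROM obj
LEFT JOIN attr foo ON obj.id = foo.obj_id
LEFT JOIN attrdef foo_def ON foo.attrdef_id = foo_def.id
LEFT JOIN attr bar ON obj.id = bar.obj_id
LEFT JOIN attrdef bar_def ON bar.attrdef_id = bar_def.id
ORDER BY obj.id, foo_name, bar_name;
```

Object 1 yields four rows, one per pairing of its two attributes, and object 2 yields a single row with 'foo' in both columns. No row for object 2 has bar_name = 'bar', so the WHERE filter removes that object entirely. An object with no attributes at all would show NULL in both columns and be removed in the same way.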