SQLAlchemy - 多对多次级关系的急切加载未按预期工作
SQLAlchemy - eager load for many to many secondary relation not working as expected
在我们的系统中,我们有实体 Item 和 Store,它们与 Stock 实体相关。
一件商品可以在多个商店进货,也可以在一个商店进货多个商品,所以一个简单的多对多关系。
然而,当用次要参考描述这种关系时:
stores = relationship(
'Store',
secondary='stock',
backref='items'
)
SQLAlchemy 加载相关商店的所有袜子,而不仅仅是那些与引用项目相关的袜子。
例如。当我们指定生成以下 sql:
的关系时
SELECT item.id AS item_id, store.id AS store_id, stock.id AS stock_id, stock.store_id AS stock_store_id, stock.item_id AS stock_item_id
FROM item
LEFT OUTER JOIN (stock AS stock_1 JOIN store ON store.id = stock_1.store_id) ON item.id = stock_1.item_id
LEFT OUTER JOIN stock ON store.id = stock.store_id AND stock.item_id = item.id
WHERE stock.item_id = item.id
即returns以下数据:
item_id, store_id, stock_id, stock_store_id, stock_item_id,
1, 1, 1, 1, 1
2, 1, 2, 1, 2
1, 2, 3, 2, 1
2, 2, 4, 2, 2
实际加载的数据如下:
items = [{
id: 1,
stores: [
{
id: 1,
stocks: [
{ id: 1, item_id: 1 },
{ id: 2, item_id: 2 } <- should not be loaded items[0].id != 2
]
},
{
id: 2,
stocks: [
{ id: 3, item_id: 1 },
{ id: 4, item_id: 2 } <- should not be loaded items[0].id != 2
]
}
]
},
{
id: 2,
stores: [
{
id: 1,
stocks: [
{ id: 2, item_id: 2 },
{ id: 1, item_id: 1 } <- should not be loaded items[1].id != 1
]
},
{
id: 2,
stocks: [
{ id: 4, item_id: 2 },
{ id: 3, item_id: 1 } <- should not be loaded items[1].id != 1
]
}
]
}]
作为参考,看一下实体及其关系的声明,以及查询对象:
Base = declarative_base()
class Item(Base):
__tablename__ = 'item'
id = Column(Integer, primary_key=True)
stores = relationship(
'Store',
secondary='stock',
backref='items'
)
class Store(Base):
__tablename__ = 'store'
id = Column(Integer, primary_key=True)
class Stock(Base):
__tablename__ = 'stock'
id = Column(Integer, primary_key=True)
store_id = Column(Integer, ForeignKey(Store.id), nullable=False)
item_id = Column(Integer, ForeignKey(Item.id), nullable=False)
item = relationship(Item, backref='stocks')
store = relationship(Store, backref='stocks')
items = session.query(
Item
).outerjoin(
Item.stores,
(Stock, and_(Store.id == Stock.store_id, Stock.item_id == Item.id))
).filter(
Stock.item_id == Item.id,
).options(
contains_eager(
Item.stores
).contains_eager(
Store.stocks
)
).all()
那是因为id
相同的店铺是同一个Store
实例。
当 serializing/displaying 结果时进行显式过滤可能更好。
也就是说,可以覆盖 Item
的 __getattribute__
以拦截 Item.stores
到 return _ItemStore
包装器,只有 return stocks
与父 Item.id
.
相同 item_id
class Item(Base):
# ...
class _ItemStore:
def __init__(self, store, item_id):
self.id = store.id
self._item_id = item_id
self._store = store
@property
def stocks(self):
return [stock for stock in self._store.stocks if stock.item_id == self._item_id]
def __getattribute__(self, item):
value = super().__getattribute__(item)
if item == 'stores':
value = [self._ItemStore(store, self.id) for store in value]
return value
添加一个简单的缓存以便 item.stores == item.stores
:
def __getattribute__(self, item):
value = super().__getattribute__(item)
if item == 'stores':
cache = getattr(self, '_stores', None)
if cache is None:
cache = self._stores = {}
item_id = self.id
item_store_cls = self._ItemStore
value = [cache.setdefault(id(store), item_store_cls(store, item_id)) for store in value]
return value
在我们的系统中,我们有实体 Item 和 Store,它们与 Stock 实体相关。 一件商品可以在多个商店进货,也可以在一个商店进货多个商品,所以一个简单的多对多关系。
然而,当用次要参考描述这种关系时:
stores = relationship(
'Store',
secondary='stock',
backref='items'
)
SQLAlchemy 加载相关商店的所有袜子,而不仅仅是那些与引用项目相关的袜子。
例如。当我们指定生成以下 sql:
的关系时SELECT item.id AS item_id, store.id AS store_id, stock.id AS stock_id, stock.store_id AS stock_store_id, stock.item_id AS stock_item_id
FROM item
LEFT OUTER JOIN (stock AS stock_1 JOIN store ON store.id = stock_1.store_id) ON item.id = stock_1.item_id
LEFT OUTER JOIN stock ON store.id = stock.store_id AND stock.item_id = item.id
WHERE stock.item_id = item.id
即returns以下数据:
item_id, store_id, stock_id, stock_store_id, stock_item_id,
1, 1, 1, 1, 1
2, 1, 2, 1, 2
1, 2, 3, 2, 1
2, 2, 4, 2, 2
实际加载的数据如下:
items = [{
id: 1,
stores: [
{
id: 1,
stocks: [
{ id: 1, item_id: 1 },
{ id: 2, item_id: 2 } <- should not be loaded items[0].id != 2
]
},
{
id: 2,
stocks: [
{ id: 3, item_id: 1 },
{ id: 4, item_id: 2 } <- should not be loaded items[0].id != 2
]
}
]
},
{
id: 2,
stores: [
{
id: 1,
stocks: [
{ id: 2, item_id: 2 },
{ id: 1, item_id: 1 } <- should not be loaded items[1].id != 1
]
},
{
id: 2,
stocks: [
{ id: 4, item_id: 2 },
{ id: 3, item_id: 1 } <- should not be loaded items[1].id != 1
]
}
]
}]
作为参考,看一下实体及其关系的声明,以及查询对象:
Base = declarative_base()
class Item(Base):
__tablename__ = 'item'
id = Column(Integer, primary_key=True)
stores = relationship(
'Store',
secondary='stock',
backref='items'
)
class Store(Base):
__tablename__ = 'store'
id = Column(Integer, primary_key=True)
class Stock(Base):
__tablename__ = 'stock'
id = Column(Integer, primary_key=True)
store_id = Column(Integer, ForeignKey(Store.id), nullable=False)
item_id = Column(Integer, ForeignKey(Item.id), nullable=False)
item = relationship(Item, backref='stocks')
store = relationship(Store, backref='stocks')
items = session.query(
Item
).outerjoin(
Item.stores,
(Stock, and_(Store.id == Stock.store_id, Stock.item_id == Item.id))
).filter(
Stock.item_id == Item.id,
).options(
contains_eager(
Item.stores
).contains_eager(
Store.stocks
)
).all()
那是因为id
相同的店铺是同一个Store
实例。
当 serializing/displaying 结果时进行显式过滤可能更好。
也就是说,可以覆盖 Item
的 __getattribute__
以拦截 Item.stores
到 return _ItemStore
包装器,只有 return stocks
与父 Item.id
.
item_id
class Item(Base):
# ...
class _ItemStore:
def __init__(self, store, item_id):
self.id = store.id
self._item_id = item_id
self._store = store
@property
def stocks(self):
return [stock for stock in self._store.stocks if stock.item_id == self._item_id]
def __getattribute__(self, item):
value = super().__getattribute__(item)
if item == 'stores':
value = [self._ItemStore(store, self.id) for store in value]
return value
添加一个简单的缓存以便 item.stores == item.stores
:
def __getattribute__(self, item):
value = super().__getattribute__(item)
if item == 'stores':
cache = getattr(self, '_stores', None)
if cache is None:
cache = self._stores = {}
item_id = self.id
item_store_cls = self._ItemStore
value = [cache.setdefault(id(store), item_store_cls(store, item_id)) for store in value]
return value