What is the correct SQLAlchemy equivalent to this SQLite statement?

I am trying to get the latest change for each team.

SQLite statement
This works as expected.

SELECT * FROM (
  SELECT * FROM team_history ORDER BY changed_at DESC
) sub GROUP BY team

SQLAlchemy implementation
For whatever reason, I have to sort with asc() instead of desc() to get the same result, which is why I doubt that my implementation is correct.

session.query(TeamHistory)\
    .select_entity_from(
        session.query(TeamHistory).order_by(asc(TeamHistory.changed_at)).subquery()
    ).group_by(TeamHistory.team)\
    .all()

Environment

Python: 3.8.0
SQLAlchemy: 1.3.23

Reproduction

Schema:

CREATE TABLE "team_history" (ID integer PRIMARY KEY, changed_at TEXT, team TEXT);

Records:

[{"ID":1,"changed_at":"2021-03-02 10:00:00","team":"B"},
 {"ID":2,"changed_at":"2021-03-02 10:00:00","team":"A"},
 {"ID":3,"changed_at":"2021-03-02 10:30:00","team":"B"},
 {"ID":4,"changed_at":"2021-03-02 10:00:00","team":"A"},
 {"ID":5,"changed_at":"2021-03-02 11:30:00","team":"B"},
 {"ID":6,"changed_at":"2021-03-02 10:00:00","team":"A"},
 {"ID":7,"changed_at":"2021-03-02 11:00:00","team":"B"},
 {"ID":8,"changed_at":"2021-03-02 10:00:00","team":"A"},
 {"ID":9,"changed_at":"2021-03-02 10:00:00","team":"B"},
 {"ID":10,"changed_at":"2021-03-02 10:00:00","team":"A"},
 {"ID":11,"changed_at":"2021-03-02 10:00:00","team":"B"},
 {"ID":12,"changed_at":"2021-03-02 10:00:00","team":"A"},
 {"ID":13,"changed_at":"2021-03-02 10:00:00","team":"B"},
 {"ID":14,"changed_at":"2021-03-02 12:30:00","team":"A"},
 {"ID":15,"changed_at":"2021-03-02 10:00:00","team":"B"},
 {"ID":16,"changed_at":"2021-03-02 12:00:00","team":"A"},
 {"ID":17,"changed_at":"2021-03-02 10:00:00","team":"B"},
 {"ID":18,"changed_at":"2021-03-02 13:30:00","team":"A"},
 {"ID":19,"changed_at":"2021-03-02 10:00:00","team":"B"},
 {"ID":20,"changed_at":"2021-03-02 10:00:00","team":"A"}]

Solution
Thanks everyone!

session.query(TeamHistory)\
    .group_by(TeamHistory.team)\
    .having(func.max(TeamHistory.changed_at))\
    .all()

When you use:

SELECT *
FROM tablename
GROUP BY somecolumn

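SQLite returns 1 row for each distinct value of somecolumn, but which row?
The documentation states that the row is undefined, meaning it is chosen arbitrarily, although in my experience the first row of each group in the result set seems to be returned.
This is not guaranteed, though, so queries like the one above and yours should be avoided.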

There are several ways to get the row with the latest changed_at for each team.
One of them, which works in SQLite (although it does not work in other databases), is:

SELECT * FROM team_history GROUP BY team HAVING MAX(changed_at)

See the demo.
So this is the query you should convert to SQLAlchemy (which I can't help you with).

There are other approaches as well, using window functions or EXISTS.
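As a rough illustration of the EXISTS approach mentioned above, here is a minimal sketch using Python's standard sqlite3 module. The table layout follows the schema in the question; the rows are a shortened subset of the question's records, and the alias names t and newer are chosen only for this example. A row is kept when no other row of the same team has a later changed_at.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE team_history (ID integer PRIMARY KEY, changed_at TEXT, team TEXT)"
)

# Shortened subset of the records from the question.
records = [
    (1, "2021-03-02 10:00:00", "B"),
    (3, "2021-03-02 10:30:00", "B"),
    (5, "2021-03-02 11:30:00", "B"),
    (7, "2021-03-02 11:00:00", "B"),
    (14, "2021-03-02 12:30:00", "A"),
    (16, "2021-03-02 12:00:00", "A"),
    (18, "2021-03-02 13:30:00", "A"),
    (20, "2021-03-02 10:00:00", "A"),
]
conn.executemany("INSERT INTO team_history VALUES (?, ?, ?)", records)

# Keep a row only if no row of the same team has a later timestamp.
latest = conn.execute(
    """
    SELECT t.ID, t.changed_at, t.team
    FROM team_history AS t
    WHERE NOT EXISTS (
        SELECT 1
        FROM team_history AS newer
        WHERE newer.team = t.team
          AND newer.changed_at > t.changed_at
    )
    ORDER BY t.team
    """
).fetchall()

for row in latest:
    print(row)
```

Unlike the GROUP BY variants, this does not rely on SQLite-specific handling of bare columns. If two rows of a team share the same latest changed_at, both are returned, so a tie-breaker (for example on ID) would be needed to guarantee one row per team.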

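Basically, the query you proposed should work. SQLite has no dedicated date type, so dates are typically stored as ISO-8601 formatted strings, which have the property that lexicographic order and chronological order are the same. A datetime column declared as TEXT has the same property.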
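The ordering property can be checked with a small sketch in plain Python. The timestamps below are made-up sample values in the same format as the question's data. The property only holds when every value uses the same zero-padded, fixed-width format and the same time zone.

```python
from datetime import datetime

# Hypothetical sample timestamps in the question's "YYYY-MM-DD HH:MM:SS" format.
stamps = [
    "2021-03-02 10:30:00",
    "2021-03-02 13:30:00",
    "2021-03-02 09:05:00",
    "2021-03-01 23:59:59",
]

# Sort once as plain text and once by the parsed datetime value.
by_text = sorted(stamps)
by_time = sorted(stamps, key=lambda s: datetime.strptime(s, "%Y-%m-%d %H:%M:%S"))

# Both orderings agree because the format is zero-padded and fixed-width.
print(by_text == by_time)
```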

Sorting the subquery in descending order should therefore produce the following result:

ID  DateTime             Team
18  2021-03-02 13:30:00  A
14  2021-03-02 12:30:00  A
16  2021-03-02 12:00:00  A
5   2021-03-02 11:30:00  B
..  ..                   ..

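As the other answer mentions, the problem with GROUP BY is that the row returned for each group is chosen arbitrarily, even though it always appears to be the first one. Knowing this, there are different ways to control which row is returned: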

# Using an aggregate function in select
session.query(TeamHistory.team, func.max(TeamHistory.changed_at)).group_by(TeamHistory.team).all()

# Using an aggregate function with `having`
session.query(TeamHistory).group_by(TeamHistory.team).having(func.max(TeamHistory.changed_at)).all()

This leads to the following working example:

from sqlalchemy import Column, Integer, create_engine, Text, DateTime, func
from sqlalchemy.orm import sessionmaker, deferred, column_property
from sqlalchemy.ext.declarative import declarative_base
import datetime

Base = declarative_base()

class TeamHistory(Base):
    __tablename__ = 'team_history'
    id = Column(Integer, primary_key=True)
    changed_at = Column(DateTime)
    team = Column(Text)

if __name__ == '__main__':
    engine = create_engine('sqlite://')
    Base.metadata.create_all(engine)
    Session = sessionmaker(engine)

    db = Session()
    
    lst = [{"ID":1,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":2,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":3,"changed_at":"2021-03-02 10:30:00","team":"B"},{"ID":4,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":5,"changed_at":"2021-03-02 11:30:00","team":"B"},{"ID":6,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":7,"changed_at":"2021-03-02 11:00:00","team":"B"},{"ID":8,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":9,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":10,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":11,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":12,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":13,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":14,"changed_at":"2021-03-02 12:30:00","team":"A"},{"ID":15,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":16,"changed_at":"2021-03-02 12:00:00","team":"A"},{"ID":17,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":18,"changed_at":"2021-03-02 13:30:00","team":"A"},{"ID":19,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":20,"changed_at":"2021-03-02 10:00:00","team":"A"}]
    for dct in lst:
        dt = datetime.datetime.strptime(dct.get('changed_at'), '%Y-%m-%d %H:%M:%S')
        nth = TeamHistory(id=dct.get('ID'), changed_at=dt, team=dct.get('team'))
        db.add(nth)

    db.commit()

    res = db.query(TeamHistory)\
        .select_entity_from(
            db.query(TeamHistory).order_by(TeamHistory.changed_at.desc()).subquery()
        ).group_by(TeamHistory.team)\
        .all()

    for r in res:
        print(r.id, r.changed_at, r.team)
    print()


    res = db.query(TeamHistory)\
        .group_by(TeamHistory.team)\
        .having(func.max(TeamHistory.changed_at))\
        .all()

    for r in res:
        print(r.id, r.changed_at, r.team)
    print()

    res = db.query(TeamHistory.id, func.max(TeamHistory.changed_at), TeamHistory.team)\
        .group_by(TeamHistory.team)\
        .all()

    for r in res:
        print(r[0], r[1], r[2])
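The queries above depend on how SQLite resolves bare columns in a GROUP BY. As a hedged alternative that does not rely on that behavior, the sketch below joins the table against a grouped subquery holding each team's newest timestamp. The helper name latest_per_team and the labels team and max_changed_at are made up for this example, and the rows are a shortened subset of the question's records.

```python
import datetime

from sqlalchemy import Column, DateTime, Integer, Text, and_, create_engine, func
from sqlalchemy.orm import sessionmaker

try:
    from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4 and later
except ImportError:
    from sqlalchemy.ext.declarative import declarative_base  # SQLAlchemy 1.3

Base = declarative_base()


class TeamHistory(Base):
    __tablename__ = "team_history"
    id = Column(Integer, primary_key=True)
    changed_at = Column(DateTime)
    team = Column(Text)


def latest_per_team(session):
    """Hypothetical helper: return the newest TeamHistory row(s) of each team."""
    # Subquery: one row per team with that team's newest timestamp.
    newest = (
        session.query(
            TeamHistory.team.label("team"),
            func.max(TeamHistory.changed_at).label("max_changed_at"),
        )
        .group_by(TeamHistory.team)
        .subquery()
    )
    # Join back to fetch the full rows matching (team, newest timestamp).
    return (
        session.query(TeamHistory)
        .join(
            newest,
            and_(
                TeamHistory.team == newest.c.team,
                TeamHistory.changed_at == newest.c.max_changed_at,
            ),
        )
        .order_by(TeamHistory.team)
        .all()
    )


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Shortened subset of the records from the question.
session.add_all(
    [
        TeamHistory(id=1, changed_at=datetime.datetime(2021, 3, 2, 10, 0), team="B"),
        TeamHistory(id=3, changed_at=datetime.datetime(2021, 3, 2, 10, 30), team="B"),
        TeamHistory(id=5, changed_at=datetime.datetime(2021, 3, 2, 11, 30), team="B"),
        TeamHistory(id=14, changed_at=datetime.datetime(2021, 3, 2, 12, 30), team="A"),
        TeamHistory(id=16, changed_at=datetime.datetime(2021, 3, 2, 12, 0), team="A"),
        TeamHistory(id=18, changed_at=datetime.datetime(2021, 3, 2, 13, 30), team="A"),
    ]
)
session.commit()

rows = latest_per_team(session)
for r in rows:
    print(r.id, r.changed_at, r.team)
```

The join returns every row that shares a team's maximum changed_at, so a team with tied timestamps yields more than one row. Whether that is acceptable depends on the data.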

If you are mainly interested in each team's last change, you can also use deferred and column_property to fetch only MAX(changed_at).

from sqlalchemy import Column, Integer, create_engine, Text, DateTime, func, select
from sqlalchemy.orm import sessionmaker, deferred, column_property
from sqlalchemy.ext.declarative import declarative_base
import datetime

Base = declarative_base()

class TeamHistory(Base):
    __tablename__ = 'team_history_def'
    id = Column(Integer, primary_key=True)
    changed_at = deferred(Column(DateTime))
    last_changed = column_property(func.max(changed_at))
    team = Column(Text)

if __name__ == '__main__':
    engine = create_engine('sqlite://')
    Base.metadata.create_all(engine)
    Session = sessionmaker(engine)

    db = Session()
    
    lst = [{"ID":1,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":2,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":3,"changed_at":"2021-03-02 10:30:00","team":"B"},{"ID":4,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":5,"changed_at":"2021-03-02 11:30:00","team":"B"},{"ID":6,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":7,"changed_at":"2021-03-02 11:00:00","team":"B"},{"ID":8,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":9,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":10,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":11,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":12,"changed_at":"2021-03-02 10:00:00","team":"A"},{"ID":13,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":14,"changed_at":"2021-03-02 12:30:00","team":"A"},{"ID":15,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":16,"changed_at":"2021-03-02 12:00:00","team":"A"},{"ID":17,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":18,"changed_at":"2021-03-02 13:30:00","team":"A"},{"ID":19,"changed_at":"2021-03-02 10:00:00","team":"B"},{"ID":20,"changed_at":"2021-03-02 10:00:00","team":"A"}]
    for dct in lst:
        dt = datetime.datetime.strptime(dct.get('changed_at'), '%Y-%m-%d %H:%M:%S')
        nth = TeamHistory(id=dct.get('ID'), changed_at=dt, team=dct.get('team'))
        db.add(nth)

    db.commit()

    res = db.query(TeamHistory)\
        .group_by(TeamHistory.team)\
        .all()

    for r in res:
        print(r.id, r.changed_at, r.team)
    print()