Selecting the first N rows of each group ordered by date

I am trying to list the first N rows (the first 100) of each group, ordered by DateTime, in a master-detail setup.

USE [Test]
Create Table [dbo].[Masters] (
    [MasterId] [nchar](36) NOT NULL PRIMARY KEY,
    [Tags] [nchar](100) NULL,
    [Numbers] [int] NOT NULL
);

Create Table [dbo].[Details] (
    [DetailId] [nchar](36) NOT NULL PRIMARY KEY,
    [MasterId] [nchar](36) FOREIGN KEY REFERENCES Masters(MasterId),
    [Date_Time] [datetime2](7) NOT NULL,
    [Value] [int] NOT NULL
);


INSERT INTO Masters (MasterId, Tags, Numbers) VALUES ('M0', 'Tag0,Tag1', 6);
INSERT INTO Masters (MasterId, Tags, Numbers) VALUES ('M1', 'Tag1,Tag2', 5);
INSERT INTO Masters (MasterId, Tags, Numbers) VALUES ('M2', 'Tag0,Tag2', 6);

INSERT INTO Details (DetailId, MasterId, Date_Time, Value) VALUES ('D0', 'M0', '20190101 00:00:00 AM', 0);
INSERT INTO Details (DetailId, MasterId, Date_Time, Value) VALUES ('D1', 'M0', '20200101 11:00:00 AM', 1);
INSERT INTO Details (DetailId, MasterId, Date_Time, Value) VALUES ('D2', 'M0', '20200701 01:00:00 AM', 2);
INSERT INTO Details (DetailId, MasterId, Date_Time, Value) VALUES ('D3', 'M0', '20210715 10:00:00 AM', 3);
INSERT INTO Details (DetailId, MasterId, Date_Time, Value) VALUES ('D4', 'M0', '20210715 11:00:00 AM', 4);
INSERT INTO Details (DetailId, MasterId, Date_Time, Value) VALUES ('D5', 'M0', '20210715 11:00:00 AM', 5);

INSERT INTO Details (DetailId, MasterId, Date_Time, Value) VALUES ('D10', 'M1', '20190101 00:00:00 AM', 6);
INSERT INTO Details (DetailId, MasterId, Date_Time, Value) VALUES ('D11', 'M1', '20200101 01:00:00 AM', 7);
INSERT INTO Details (DetailId, MasterId, Date_Time, Value) VALUES ('D12', 'M1', '20200701 09:00:00 AM', 8);
INSERT INTO Details (DetailId, MasterId, Date_Time, Value) VALUES ('D13', 'M1', '20210101 10:00:00 AM', 9);
INSERT INTO Details (DetailId, MasterId, Date_Time, Value) VALUES ('D14', 'M1', '20210701 10:00:00 AM', 10);

INSERT INTO Details (DetailId, MasterId, Date_Time, Value) VALUES ('D20', 'M2', '20190101 00:00:00 AM', 11);
INSERT INTO Details (DetailId, MasterId, Date_Time, Value) VALUES ('D21', 'M2', '20190101 01:30:00 AM', 12);
INSERT INTO Details (DetailId, MasterId, Date_Time, Value) VALUES ('D22', 'M2', '20200101 01:30:00 AM', 13);
INSERT INTO Details (DetailId, MasterId, Date_Time, Value) VALUES ('D23', 'M2', '20200701 08:30:00 AM', 14);
INSERT INTO Details (DetailId, MasterId, Date_Time, Value) VALUES ('D24', 'M2', '20210101 01:30:00 AM', 15);
INSERT INTO Details (DetailId, MasterId, Date_Time, Value) VALUES ('D25', 'M2', '20210701 01:30:00 AM', 16);

Select * from Masters;
Select * from Details;

Here is my partial query so far:

SELECT m.MasterId, d.DetailId, m.Numbers, d.Date_Time, d.Value from Details AS d
INNER JOIN Masters AS m ON m.MasterId = d.MasterId
WHERE 
m.Tags LIKE '%Tag2%' AND 
d.Date_Time >= Convert(datetime, '2020-01-01' ) 
ORDER BY m.MasterId DESC, d.Date_Time;

But how do I introduce a Top 3 into my query for this example (in the real case it could be 50 or 100)? I only want to get 3 rows per MasterId.

According to the image, we would get only six rows. Please help me solve my problem.

You can use the row_number() window function here.

SELECT x.masterid,
       x.detailid,
       x.numbers,
       x.date_time,
       x.value
       FROM (SELECT m.masterid,
                    d.detailid,
                    m.numbers,
                    d.date_time,
                    d.value,
                    row_number() OVER (PARTITION BY m.masterid
                                       ORDER BY d.date_time ASC) AS rn
                    FROM details AS d
                         INNER JOIN masters AS m
                                    ON m.masterid = d.masterid
                    WHERE m.tags LIKE '%Tag2%'
                          AND d.date_time >= '2020-01-01') AS x
       WHERE x.rn <= 3 -- change to whatever your n is
       ORDER BY x.masterid DESC,
                x.date_time ASC;
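
As a variation on the query above, here is a minimal sketch that takes N from a variable and adds a tie-breaker to the window ordering. The variable @n and the DetailId tie-breaker are my own assumptions, not part of the original answer; the tie-breaker only matters when two details of one master share the same Date_Time (as D4 and D5 do in the sample data), in which case row_number() would otherwise number them in an arbitrary order.

```sql
-- Sketch only: @n and the DetailId tie-breaker are illustrative assumptions.
DECLARE @n int = 3;

WITH ranked AS (
    SELECT m.MasterId,
           d.DetailId,
           m.Numbers,
           d.Date_Time,
           d.Value,
           -- Number the rows of each master from oldest to newest;
           -- DetailId makes the numbering deterministic on equal timestamps.
           ROW_NUMBER() OVER (PARTITION BY m.MasterId
                              ORDER BY d.Date_Time ASC, d.DetailId ASC) AS rn
    FROM dbo.Details AS d
    INNER JOIN dbo.Masters AS m
            ON m.MasterId = d.MasterId
    WHERE m.Tags LIKE '%Tag2%'
      AND d.Date_Time >= '2020-01-01'
)
SELECT MasterId, DetailId, Numbers, Date_Time, Value
FROM ranked
WHERE rn <= @n          -- keep only the first @n rows per master
ORDER BY MasterId DESC,
         Date_Time ASC;
```

If tied rows should all be returned rather than cut off at N, rank() or dense_rank() could be swapped in for row_number(), at the cost of possibly getting more than N rows per group.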

Besides the row_number solution, another option is CROSS APPLY with SELECT TOP:

SELECT m.masterid,
       d.detailid,
       m.numbers,
       d.date_time,
       d.value
    FROM masters AS m
    CROSS APPLY (
        SELECT TOP (3) *
        FROM details AS d
        WHERE d.date_time >= '2020-01-01'
        AND m.masterid = d.masterid
        ORDER BY d.date_time ASC -- without ORDER BY, TOP returns an arbitrary 3 rows
    ) AS d
    WHERE m.tags LIKE '%Tag2%'
    ORDER BY m.masterid DESC,
             d.date_time;

This may be faster or slower than row_number, depending mainly on cardinality (the number of rows) and on indexing.

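If the indexing is good and only a small number of rows is selected, it will usually be faster. If the inner table needs to be sorted, or you are selecting most of the rows anyway, use row_number.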
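
To make "good indexing" concrete, an index along these lines might let each CROSS APPLY lookup seek straight to one master's rows in date order instead of sorting them. This is a sketch under assumptions: the index name and the INCLUDE list are hypothetical, and whether the optimizer actually uses it depends on your data and should be checked in the execution plan.

```sql
-- Hypothetical index; the name and the INCLUDE list are assumptions for illustration.
-- Key order (MasterId, Date_Time) matches the per-master filter and the date ordering.
-- DetailId does not need to be listed: as the clustered primary key it is carried
-- in every nonclustered index automatically.
CREATE NONCLUSTERED INDEX IX_Details_MasterId_DateTime
    ON dbo.Details (MasterId, Date_Time)
    INCLUDE (Value);
```

The same index can also help the row_number version, since its PARTITION BY and ORDER BY columns match the index key order.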