SQL: count occurrences of values grouped by external table references

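In terms of performance and maintainability, what is the best way to count the occurrences of the same values in a table, grouping the results by the same references that group the table's entries?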

Suppose I have three tables (the concepts have been scaled down to represent a scenario similar to the one I am working on):

|----------|   |----------------|   |-----------------------------------|
|   MEAL   |   |      RECIPE    |   |          INGREDIENT_ENTRY         |
|----------|   |----------------|   |-----------------------------------|
| ID | ... |   | ID | ID_m | ...|   | ID | ID_r | amount and description|
|----------|   |----------------|   |-----------------------------------|
|  1 | ... |   |  1 |    1 | ...|   |  1 |    1 |       '15gr of yeast' |
|  2 | ... |   |  2 |    2 | ...|   |  2 |    4 |              '2 eggs' |
|  3 | ... |   |  3 |    3 | ...|   |  3 |    1 |      '300cl of water' |
|  4 | ... |   |  4 |    4 | ...|   |  4 |    2 |       '300cl of beer' |
|----------|   |  5 |    1 | ...|   |  5 |    3 |       '250cl of milk' |
               |  6 |    4 | ...|   |  6 |    5 |   '100gr of biscuits' |
               |  7 |    5 | ...|   |  7 |    2 |       '15gr of yeast' |
               |  8 |    6 | ...|   |  8 |    1 |      '500gr of flour' |
               |----------------|   |  9 |    2 |      '500gr of flour' |
                                    | 10 |    2 |        '10gr of salt' |
                                    | 11 |    4 |       '15gr of yeast' |
                                    |-----------------------------------|

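The same meal can be cooked with different recipes. Each recipe is made up of different INGREDIENT_ENTRY rows, which are tied to the same recipe by sharing the same ID_r value.

INGREDIENT_ENTRY.[amount and description] is a column of type VARCHAR(MAX), and it is the value that must be compared.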

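In the example, querying with (MEAL 1, RECIPE 1):

It has 3 ingredients (entries 1, 3 and 8), some of which it shares with other recipes.

The result should look something like: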

|------|   |--------|   |-------|
| MEAL |   | RECIPE |   | COUNT |
|------|   |--------|   |-------|
|    2 |   |      2 |   |     2 |
|    4 |   |      4 |   |     1 |
|------|   |--------|   |-------| 

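I am trying to use views to reduce the SQL complexity, but I cannot achieve this with a single SQL statement. I would like to avoid going back and forth between application code (C#) and the database to run multiple queries (for example, querying for each ingredient and reconciling the results with a HashMap or similar).

Note that I cannot modify the database structure.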

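You can use EXISTS to find the common ingredients. Below I have simply used a common table expression so that I don't have to write out the join multiple times to get back to a meal ID: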

DECLARE @SelectedMealID INT = 1;

WITH LinkedData AS
(
    SELECT  MealID = r.ID_m,
            RecipeID = r.ID,
            Ingredient = i.[amount and description]
    FROM    RECIPE AS r
            INNER JOIN INGREDIENT_ENTRY AS i
                ON i.ID_r = r.ID
)
SELECT  a.MealID,
        a.RecipeID,
        CommonIngredients = COUNT(*)
FROM    LinkedData AS a
WHERE   a.MealID != @SelectedMealID
AND     EXISTS
        (   SELECT  1
            FROM    LinkedData AS b
            WHERE   b.Ingredient = a.Ingredient
            AND     b.MealID = @SelectedMealID
        )
GROUP BY a.MealID, a.RecipeID;
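One reason to prefer EXISTS over a plain self-join here is that a join counts a row once per match, so an ingredient that appears in more than one recipe of the selected meal would inflate the count. The following is a minimal sketch of that difference, not part of the original answer; the @Linked table variable and its rows are hypothetical data made up for illustration.

```sql
-- Hypothetical data: meal 1 uses 'flour' in two of its recipes (10 and 11),
-- and meal 2 uses it once in recipe 20.
DECLARE @Linked TABLE (MealID INT, RecipeID INT, Ingredient VARCHAR(MAX));
INSERT @Linked (MealID, RecipeID, Ingredient)
VALUES (1, 10, 'flour'), (1, 11, 'flour'), (2, 20, 'flour');

DECLARE @SelectedMealID INT = 1;

-- EXISTS: each row of the other recipe is counted at most once.
SELECT  a.MealID,
        a.RecipeID,
        CommonIngredients = COUNT(*)
FROM    @Linked AS a
WHERE   a.MealID != @SelectedMealID
AND     EXISTS
        (   SELECT  1
            FROM    @Linked AS b
            WHERE   b.Ingredient = a.Ingredient
            AND     b.MealID = @SelectedMealID
        )
GROUP BY a.MealID, a.RecipeID;      -- returns (2, 20, 1)

-- INNER JOIN: the same row is counted once per matching row in the selected meal.
SELECT  a.MealID,
        a.RecipeID,
        CommonIngredients = COUNT(*)
FROM    @Linked AS a
        INNER JOIN @Linked AS b
            ON  b.Ingredient = a.Ingredient
            AND b.MealID = @SelectedMealID
WHERE   a.MealID != @SelectedMealID
GROUP BY a.MealID, a.RecipeID;      -- returns (2, 20, 2)
```

With the question's sample data both forms give the same counts, because no ingredient is repeated across the recipes of meal 1; the difference only shows up when such repeats exist.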

I have tested the EXISTS query against the sample data below:

-- GENERATE TABLES AND DATA
DECLARE @Meal TABLE (ID INT);
INSERT @Meal (ID) VALUES (1), (2), (3), (4);

DECLARE @Recipe TABLE (ID INT, ID_m INT);
INSERT @Recipe (ID, ID_m) 
VALUES (1, 1), (2, 2), (3, 3), (4, 4), (5, 1), (6, 4), (7, 5), (8, 6);

DECLARE @Ingredient TABLE (ID INT, ID_r INT, AmountAndDescription VARCHAR(MAX));
INSERT @Ingredient (ID, ID_r, AmountAndDescription)
VALUES
    (1, 1, '15gr of yeast'), (2, 4, '2 eggs'),
    (3, 1, '300cl of water'), (4, 2, '300cl of beer'),
    (5, 3, '250cl of milk'), (6, 5, '100gr of biscuits'),
    (7, 2, '15gr of yeast'), (8, 1, '500gr of flour'),
    (9, 2, '500gr of flour'), (10, 2, '10gr of salt'),
    (11, 4, '15gr of yeast');


-- TEST QUERY
DECLARE @SelectedMealID INT = 1;

WITH LinkedData AS
(
    SELECT  MealID = r.ID_m,
            RecipeID = r.ID,
            Ingredient = i.AmountAndDescription
    FROM    @Recipe AS r
            INNER JOIN @Ingredient AS i
                ON i.ID_r = r.ID
)
SELECT  a.MealID,
        a.RecipeID,
        CommonIngredients = COUNT(*)
FROM    LinkedData AS a
WHERE   a.MealID != @SelectedMealID 
AND     EXISTS
        (   SELECT  1
            FROM    LinkedData AS b
            WHERE   b.Ingredient = a.Ingredient
            AND     b.MealID = @SelectedMealID 
        )
GROUP BY a.MealID, a.RecipeID;

Output:

MealID  RecipeID    CommonIngredients
------------------------------------------
2       2           2
4       4           1

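N.B. The expected output in the question is slightly different, but I think the question may contain a typo (it states that recipe 4 is related to meal 3, but that does not appear to be the case in the sample data).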
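The question queries by (MEAL 1, RECIPE 1), while the query above filters on the meal only. If the comparison should be limited to a single recipe of the selected meal, one possible variation is to add a recipe filter inside the EXISTS. This is a sketch under that assumption, run against the same sample data; @SelectedRecipeID is a hypothetical parameter that does not appear in the original query.

```sql
-- Sample data from the question, held in table variables so the batch is self-contained.
DECLARE @Recipe TABLE (ID INT, ID_m INT);
INSERT @Recipe (ID, ID_m)
VALUES (1, 1), (2, 2), (3, 3), (4, 4), (5, 1), (6, 4), (7, 5), (8, 6);

DECLARE @Ingredient TABLE (ID INT, ID_r INT, AmountAndDescription VARCHAR(MAX));
INSERT @Ingredient (ID, ID_r, AmountAndDescription)
VALUES
    (1, 1, '15gr of yeast'), (2, 4, '2 eggs'),
    (3, 1, '300cl of water'), (4, 2, '300cl of beer'),
    (5, 3, '250cl of milk'), (6, 5, '100gr of biscuits'),
    (7, 2, '15gr of yeast'), (8, 1, '500gr of flour'),
    (9, 2, '500gr of flour'), (10, 2, '10gr of salt'),
    (11, 4, '15gr of yeast');

DECLARE @SelectedMealID   INT = 1,
        @SelectedRecipeID INT = 1;   -- hypothetical second parameter (assumption)

WITH LinkedData AS
(
    SELECT  MealID = r.ID_m,
            RecipeID = r.ID,
            Ingredient = i.AmountAndDescription
    FROM    @Recipe AS r
            INNER JOIN @Ingredient AS i
                ON i.ID_r = r.ID
)
SELECT  a.MealID,
        a.RecipeID,
        CommonIngredients = COUNT(*)
FROM    LinkedData AS a
WHERE   a.MealID != @SelectedMealID
AND     EXISTS
        (   SELECT  1
            FROM    LinkedData AS b
            WHERE   b.Ingredient = a.Ingredient
            AND     b.MealID = @SelectedMealID
            AND     b.RecipeID = @SelectedRecipeID   -- compare against one recipe only
        )
GROUP BY a.MealID, a.RecipeID
ORDER BY a.MealID, a.RecipeID;
-- returns (2, 2, 2) and (4, 4, 1)
```

With the sample data this returns the same two rows as the meal-level query, because recipe 5 (the other recipe of meal 1) has no ingredient in common with any other meal.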