Simple algebra with recursive SQL

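The following schema is used to create simple algebraic formulas. variables is used to create formulas such as x=3+4y. variables_has_sub_variables is used to combine those formulas, and its sign column (which will only ever be +1 or -1) determines whether a formula is added to or subtracted from the combination.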

For example, the variables table might hold the following data. The Implied Formula column is not actually in the table; it is shown for illustration only.

variables table

+-----------+-----------+-------+------------------+
| variables | intercept | slope | Implied Formula  |
+-----------+-----------+-------+------------------+
|         1 |      2.86 | -0.82 | Y1=+2.86-0.82*X1 |
|         2 |      2.96 | -3.49 | Y2=+2.96-3.49*X2 |
|         3 |      2.56 |  2.81 | Y3=+2.56+2.81*X3 |
|         4 |      3.04 | -3.43 | Y4=+3.04-3.43*X4 |
|         5 |     -1.94 |  4.11 | Y5=-1.94+4.11*X5 |
|         6 |     -1.21 | -0.62 | Y6=-1.21-0.62*X6 |
|         7 |      0.88 | -0.61 | Y7=+0.88-0.61*X7 |
|         8 |     -2.77 | -0.34 | Y8=-2.77-0.34*X8 |
|         9 |      1.81 |  1.65 | Y9=+1.81+1.65*X9 |
+-----------+-----------+-------+------------------+

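Then, given the following variables_has_sub_variables data, the variables combine to give X7=+Y1-Y2+Y3, X8=+Y4+Y5-Y7, and X9=+Y6-Y7+Y8. Next, Y7, Y8, and Y9 can be derived from the variables table, giving Y7=+0.88-0.61*X7 and so on. Note that the application will prevent infinite loops, such as inserting a record where variables equals 7 and sub_variables equals 9, because variable 9 is based on variable 7.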

variables_has_sub_variables table

+-----------+---------------+------+
| variables | sub_variables | sign |
+-----------+---------------+------+
|         7 |             1 |    1 |
|         7 |             2 |   -1 |
|         7 |             3 |    1 |
|         8 |             4 |    1 |
|         8 |             5 |    1 |
|         8 |             7 |   -1 |
|         9 |             6 |    1 |
|         9 |             7 |   -1 |
|         9 |             8 |    1 |
+-----------+---------------+------+
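The loop prevention mentioned above is left to the application and is not shown in the post. As a rough sketch only, such a check could walk the existing rows before an insert; the function name `would_create_cycle` and the in-memory `EDGES` list are hypothetical stand-ins for whatever the application actually does.

```python
# Hypothetical application-side check. EDGES mirrors the sample
# variables_has_sub_variables rows as (variables, sub_variables) pairs.
EDGES = [(7, 1), (7, 2), (7, 3), (8, 4), (8, 5), (8, 7), (9, 6), (9, 7), (9, 8)]


def would_create_cycle(edges, parent, child):
    """Return True if adding parent -> child would make parent depend on itself."""
    children = {}
    for p, c in edges:
        children.setdefault(p, []).append(c)
    # Walk everything reachable from `child`; reaching `parent` closes a loop.
    stack, seen = [child], set()
    while stack:
        node = stack.pop()
        if node == parent:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, []))
    return False


print(would_create_cycle(EDGES, 7, 9))  # the insert described above
print(would_create_cycle(EDGES, 9, 1))  # a harmless insert
```

The first call reports a loop because variable 9 already depends on variable 7; the second does not because variable 1 has no sub-variables.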

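My objective is, given any variable (i.e. 1 to 9), to determine the constants and the root variables, where a root variable is defined as one that does not appear in variables_has_sub_variables.variables (I could also easily add a root column to variables if needed). With my sample data above, the root variables are 1 through 6.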

Doing so for a root variable is easy, because there are no sub_variables and the result is just Y1=+2.86-0.82*X1.

Doing so for variable 7 is a bit trickier:

Y7 = +0.88-0.61*X7
   = +0.88-0.61*(+Y1-Y2+Y3)
   = +0.88-0.61*(+(+2.86-0.82*X1)-(+2.96-3.49*X2)+(+2.56+2.81*X3))
   = -0.62 + 0.50*X1 - 2.13*X2 - 1.71*X3
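The substitution above can be checked outside the database. The following is a minimal sketch, not part of the original post: it hard-codes the sample rows into two dictionaries (`FORMULAS` and `SUBS`, both names chosen here for illustration) and expands a variable recursively into an intercept plus one coefficient per root variable.

```python
from collections import defaultdict

# variable -> (intercept, slope), copied from the sample variables table.
FORMULAS = {
    1: (2.86, -0.82), 2: (2.96, -3.49), 3: (2.56, 2.81),
    4: (3.04, -3.43), 5: (-1.94, 4.11), 6: (-1.21, -0.62),
    7: (0.88, -0.61), 8: (-2.77, -0.34), 9: (1.81, 1.65),
}
# variable -> [(sub_variable, sign)], copied from variables_has_sub_variables.
SUBS = {
    7: [(1, 1), (2, -1), (3, 1)],
    8: [(4, 1), (5, 1), (7, -1)],
    9: [(6, 1), (7, -1), (8, 1)],
}


def expand(variable):
    """Return (intercept, {root_variable: coefficient}) for Y<variable>."""
    intercept, slope = FORMULAS[variable]
    if variable not in SUBS:  # root variable: Y = intercept + slope * X
        return intercept, {variable: slope}
    coefficients = defaultdict(float)
    for sub, sign in SUBS[variable]:  # X = sum of sign * Y<sub>
        sub_intercept, sub_coefficients = expand(sub)
        intercept += slope * sign * sub_intercept
        for root, coefficient in sub_coefficients.items():
            coefficients[root] += slope * sign * coefficient
    return intercept, dict(coefficients)


intercept, coefficients = expand(7)
print(round(intercept, 4), {k: round(v, 2) for k, v in coefficients.items()})
```

Because coefficients are accumulated per root, a root reached along several routes (variable 1 under variable 9, via both 7 and 8) is summed in the same way that GROUP BY does in the query.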

Now for the SQL. Here is how I created the tables:

CREATE DATABASE algebra;
USE algebra;

CREATE TABLE `variables` (
  `variables` INT NOT NULL,
  `slope` DECIMAL(6,2) NOT NULL DEFAULT 1,
  `intercept` DECIMAL(6,2) NOT NULL DEFAULT 0,
  PRIMARY KEY (`variables`))
ENGINE = InnoDB;

CREATE TABLE `variables_has_sub_variables` (
  `variables` INT NOT NULL,
  `sub_variables` INT NOT NULL,
  `sign` TINYINT NOT NULL,
  PRIMARY KEY (`variables`, `sub_variables`),
  INDEX `fk_variables_has_variables_variables1_idx` (`sub_variables` ASC),
  INDEX `fk_variables_has_variables_variables_idx` (`variables` ASC),
  CONSTRAINT `fk_variables_has_variables_variables`
    FOREIGN KEY (`variables`)
    REFERENCES `variables` (`variables`)
    ON DELETE NO ACTION
    ON UPDATE NO ACTION,
  CONSTRAINT `fk_variables_has_variables_variables1`
    FOREIGN KEY (`sub_variables`)
    REFERENCES `variables` (`variables`)
    ON DELETE NO ACTION
    ON UPDATE NO ACTION)
ENGINE = InnoDB;

INSERT INTO variables(variables,intercept,slope) VALUES (1,2.86,-0.82),(2,2.96,-3.49),(3,2.56,2.81),(4,3.04,-3.43),(5,-1.94,4.11),(6,-1.21,-0.62),(7,0.88,-0.61),(8,-2.77,-0.34),(9,1.81,1.65);

INSERT INTO variables_has_sub_variables(variables,sub_variables,sign) VALUES (7,1,1),(7,2,-1),(7,3,1),(8,4,1),(8,5,1),(8,7,-1),(9,6,1),(9,7,-1),(9,8,1);

Now for the query. XXXX is 7, 8, and 9 for the results that follow. Before each actual result, I show my desired result.

WITH RECURSIVE t AS (
SELECT v.variables, v.slope, v.intercept
FROM variables v
WHERE v.variables=XXXX
UNION ALL
SELECT v.variables, vhsv.sign*t.slope*v.slope slope, vhsv.sign*t.slope*v.intercept intercept
FROM t
INNER JOIN variables_has_sub_variables vhsv ON vhsv.variables=t.variables
INNER JOIN variables v ON v.variables=vhsv.sub_variables
)
SELECT variables, SUM(slope) constant FROM t GROUP BY variables
UNION SELECT 'intercept' variables, SUM(intercept) intercept FROM t;

Desired for variable 7

+-----------+----------+
| variables | constant |
+-----------+----------+
|         1 |     0.50 |
|         2 |    -2.13 |
|         3 |    -1.71 |
| intercept |  -0.6206 |
+-----------+----------+

Actual for variable 7

+-----------+----------+
| variables | constant |
+-----------+----------+
| 1         | 0.50     |
| 2         | -2.13    |
| 3         | -1.71    |
| 7         | -0.61    |
| intercept | -0.61    |
+-----------+----------+
5 rows in set (0.00 sec)

Desired for variable 8

+-----------+-----------+
| variables | constant  |
+-----------+-----------+
|         1 |      0.17 |
|         2 |     -0.72 |
|         3 |     -0.58 |
|         4 |      1.17 |
|         5 |     -1.40 |
| intercept | -3.355004 |
+-----------+-----------+

Actual for variable 8

+-----------+----------+
| variables | constant |
+-----------+----------+
| 1         | 0.17     |
| 2         | -0.73    |
| 3         | -0.59    |
| 4         | 1.17     |
| 5         | -1.40    |
| 7         | -0.21    |
| 8         | -0.34    |
| intercept | -3.36    |
+-----------+----------+
8 rows in set (0.00 sec)

Desired for variable 9

+-----------+------------+
| variables |  constant  |
+-----------+------------+
|         1 |      -0.54 |
|         2 |       2.32 |
|         3 |       1.87 |
|         4 |       1.92 |
|         5 |      -2.31 |
|         6 |      -1.02 |
| intercept | -4.6982666 |
+-----------+------------+

Actual for variable 9

+-----------+----------+
| variables | constant |
+-----------+----------+
| 1         | -0.55    |
| 2         | 2.33     |
| 3         | 1.88     |
| 4         | 1.92     |
| 5         | -2.30    |
| 6         | -1.02    |
| 7         | 0.67     |
| 8         | -0.56    |
| 9         | 1.65     |
| intercept | -4.67    |
+-----------+----------+
10 rows in set (0.00 sec)
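Apart from the extra non-root rows, the actual values differ slightly from the desired ones (for example -0.61 versus -0.6206 for the intercept of variable 7). One reading that is consistent with the numbers, offered as an assumption rather than a confirmed diagnosis, is that the slope and intercept columns of t keep the DECIMAL(6,2) type of the seed SELECT, so every recursive row is rounded to two decimals before it is reused. The sketch below mimics the rows of t in plain Python, with and without that rounding; all function names are illustrative.

```python
from decimal import Decimal as D, ROUND_HALF_UP

# variable -> (intercept, slope), copied from the sample variables table.
RAW = {
    1: ("2.86", "-0.82"), 2: ("2.96", "-3.49"), 3: ("2.56", "2.81"),
    4: ("3.04", "-3.43"), 5: ("-1.94", "4.11"), 6: ("-1.21", "-0.62"),
    7: ("0.88", "-0.61"), 8: ("-2.77", "-0.34"), 9: ("1.81", "1.65"),
}
FORMULAS = {k: (D(i), D(s)) for k, (i, s) in RAW.items()}
# variable -> [(sub_variable, sign)], copied from variables_has_sub_variables.
SUBS = {
    7: [(1, 1), (2, -1), (3, 1)],
    8: [(4, 1), (5, 1), (7, -1)],
    9: [(6, 1), (7, -1), (8, 1)],
}


def two_places(value):
    """Round to two decimals, as assumed for a DECIMAL(6,2) column."""
    return value.quantize(D("0.01"), rounding=ROUND_HALF_UP)


def cte_rows(start, rounded):
    """Mimic t: one (variable, slope, intercept) row per path from start."""
    fix = two_places if rounded else (lambda value: value)
    intercept, slope = FORMULAS[start]
    rows = [(start, slope, intercept)]
    i = 0
    while i < len(rows):  # each row spawns one row per sub-variable
        variable, path_slope, _ = rows[i]
        for sub, sign in SUBS.get(variable, []):
            sub_intercept, sub_slope = FORMULAS[sub]
            rows.append((sub,
                         fix(sign * path_slope * sub_slope),
                         fix(sign * path_slope * sub_intercept)))
        i += 1
    return rows


def intercept_total(start, rounded):
    """SUM(intercept) over all rows of t."""
    return sum(row[2] for row in cte_rows(start, rounded))


def slope_totals(start, rounded):
    """SUM(slope) ... GROUP BY variables."""
    totals = {}
    for variable, slope, _ in cte_rows(start, rounded):
        totals[variable] = totals.get(variable, D(0)) + slope
    return totals


for v in (7, 8, 9):
    print(v, intercept_total(v, rounded=False), intercept_total(v, rounded=True))
```

If that assumption holds, casting the seed columns to a wider type (for instance CAST(v.slope AS DECIMAL(20,10))) might bring the actual values in line with the desired ones; this has not been verified here.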

All I need to do is detect which variables are not root variables and filter them out. How should this be accomplished?

In response to JNevill's answer: for v.variables of 9

+-----------+-------+-------+----------+
| variables | depth | path  | constant |
+-----------+-------+-------+----------+
| 1         |     3 | 9>7>1 | -0.55    |
| 2         |     3 | 9>7>2 | 2.33     |
| 3         |     3 | 9>7>3 | 1.88     |
| 4         |     3 | 9>8>4 | 1.92     |
| 5         |     3 | 9>8>5 | -2.30    |
| 6         |     2 | 9>6   | -1.02    |
| 7         |     2 | 9>7   | 0.67     |
| 8         |     2 | 9>8   | -0.56    |
| 9         |     1 | 9     | 1.65     |
| intercept |     1 | 9     | -4.67    |
+-----------+-------+-------+----------+
10 rows in set (0.00 sec)

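I'm not going to try to fully understand what you are doing, and I agree with @RickJames in the comments that this feels like it might not be the best use case for a database. I'm a little obsessive too, though. I get it.

There are a couple of things that I nearly always track in a recursive CTE.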

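  1. The "Path". If I'm going to let a query head off down a rabbit hole, I want to know how it got to its endpoint. So I track a path, which tells me which primary key was selected at each iteration. In the recursive seed (top portion) I use something like SELECT CAST(id as varchar(500)) as path..., and in the recursive member (bottom portion) I use something like recursiveCTE.path + '>' + id as path...

  2. The "Depth". I want to know how deep the iteration went to get the resulting record. This is tracked by adding SELECT 1 as depth to the recursive seed and recursiveCTE.depth + 1 as depth to the recursive member. Now I know how deep each record is.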

I believe number 2 will solve your problem:

WITH RECURSIVE t
AS (
    SELECT v.variables,
        v.slope,
        v.intercept,
        1 as depth
    FROM variables v
    WHERE v.variables = XXXX

    UNION ALL

    SELECT v.variables,
        vhsv.sign * t.slope * v.slope slope,
        vhsv.sign * t.slope * v.intercept intercept, 
        t.depth + 1
    FROM t
    INNER JOIN variables_has_sub_variables vhsv ON vhsv.variables = t.variables
    INNER JOIN variables v ON v.variables = vhsv.sub_variables
    )
SELECT variables,
    SUM(slope) constant
FROM t
WHERE depth > 1
GROUP BY variables

UNION

SELECT 'intercept' variables,
    SUM(intercept) intercept
FROM t;

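The WHERE clause here filters out the records in the recursive result set that have a depth of 1, meaning they were brought in by the recursive seed portion of the recursive CTE (i.e. they are a root).

It's not clear whether you also need to remove the root from the second SELECT of the UNION over your t CTE. If so, the same logic applies; just add a WHERE clause there to exclude the depth 1 records.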
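Note that a depth filter removes only the seed row. In the question's follow-up output for variable 9, the non-root variables 7 and 8 sit at depth 2 and would still come through. As an alternative sketch, the question's own definition of a root variable (one that never appears in variables_has_sub_variables.variables) can be applied directly with NOT EXISTS. The example below runs the query through Python's sqlite3 module purely so that it is self-contained; it uses REAL columns instead of DECIMAL, and the same WHERE clause is assumed, not verified here, to carry over to MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE variables (variables INTEGER PRIMARY KEY, slope REAL, intercept REAL);
CREATE TABLE variables_has_sub_variables (
    variables INTEGER, sub_variables INTEGER, sign INTEGER,
    PRIMARY KEY (variables, sub_variables));
INSERT INTO variables(variables,intercept,slope) VALUES
    (1,2.86,-0.82),(2,2.96,-3.49),(3,2.56,2.81),(4,3.04,-3.43),(5,-1.94,4.11),
    (6,-1.21,-0.62),(7,0.88,-0.61),(8,-2.77,-0.34),(9,1.81,1.65);
INSERT INTO variables_has_sub_variables(variables,sub_variables,sign) VALUES
    (7,1,1),(7,2,-1),(7,3,1),(8,4,1),(8,5,1),(8,7,-1),(9,6,1),(9,7,-1),(9,8,1);
""")

# Same recursive CTE as in the question; only the outer WHERE is new.
ROOTS_ONLY = """
WITH RECURSIVE t(variables, slope, intercept) AS (
    SELECT v.variables, v.slope, v.intercept
    FROM variables v
    WHERE v.variables = ?
    UNION ALL
    SELECT v.variables,
        vhsv.sign * t.slope * v.slope,
        vhsv.sign * t.slope * v.intercept
    FROM t
    INNER JOIN variables_has_sub_variables vhsv ON vhsv.variables = t.variables
    INNER JOIN variables v ON v.variables = vhsv.sub_variables
)
SELECT t.variables, SUM(t.slope) AS constant
FROM t
WHERE NOT EXISTS (  -- keep a row only if its variable has no sub-variables
    SELECT 1 FROM variables_has_sub_variables p WHERE p.variables = t.variables)
GROUP BY t.variables
ORDER BY t.variables
"""


def root_constants(variable):
    """Return {root_variable: constant rounded to 2 places} for one start variable."""
    rows = conn.execute(ROOTS_ONLY, (variable,))
    return {row[0]: round(row[1], 2) for row in rows}


print(root_constants(9))
```

The intercept SELECT is left out of this sketch; it should keep summing over every row of t, since the intercepts of the non-root variables are part of the total. Unlike a depth filter, this version also keeps the seed row when the starting variable is itself a root.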

While it probably won't help here, an example of the recursive CTE using PATH is:

WITH RECURSIVE t
AS (
    SELECT v.variables,
        v.slope,
        v.intercept,
        1 as depth,
        CAST(v.variables as CHAR(30)) as path
    FROM variables v
    WHERE v.variables = XXXX

    UNION ALL

    SELECT v.variables,
        vhsv.sign * t.slope * v.slope slope,
        vhsv.sign * t.slope * v.intercept intercept, 
        t.depth + 1,
        CONCAT(t.path,'>', v.variables)
    FROM t
    INNER JOIN variables_has_sub_variables vhsv ON vhsv.variables = t.variables
    INNER JOIN variables v ON v.variables = vhsv.sub_variables
    )
SELECT variables,
    SUM(slope) constant
FROM t
WHERE depth > 1
GROUP BY variables

UNION

SELECT 'intercept' variables,
    SUM(intercept) intercept
FROM t;