带递归的简单代数 SQL
Simple algebra with recursive SQL
以下架构用于创建简单的代数公式。 variables
用于创建公式,例如 x=3+4y
。 variables_has_sub_variables
用于组合前面提到的公式,并使用 sign
列(将仅为 +1 或 -1)来确定是否应将公式添加或减去组合。
例如,variables
table 可能有以下数据,其中 Implied Formulas
列实际上并不在 table 中,但仅用于说明目的。
变量 table
+-----------+-----------+-------+------------------+
| variables | intercept | slope | Implied Formula |
+-----------+-----------+-------+------------------+
| 1 | 2.86 | -0.82 | Y1=+2.86-0.82*X1 |
| 2 | 2.96 | -3.49 | Y2=+2.96-3.49*X2 |
| 3 | 2.56 | 2.81 | Y3=+2.56+2.81*X3 |
| 4 | 3.04 | -3.43 | Y4=+3.04-3.43*X4 |
| 5 | -1.94 | 4.11 | Y5=-1.94+4.11*X5 |
| 6 | -1.21 | -0.62 | Y6=-1.21-0.62*X6 |
| 7 | 0.88 | -0.61 | Y7=+0.88-0.61*X7 |
| 8 | -2.77 | -0.34 | Y8=-2.77-0.34*X8 |
| 9 | 1.81 | 1.65 | Y9=+1.81+1.65*X9 |
+-----------+-----------+-------+------------------+
然后,给定以下 variables_has_sub_variables
数据,这些变量合并后得到 X7=+Y1-Y2+Y3
、X8=+Y4+Y5-Y7
和 X9=+Y6-Y7+Y8
。接下来 Y7
、Y8
和 Y9
可以使用 variables
table 得到 Y7=+0.88-0.61*X7
等。请注意,应用程序将阻止无限循环,例如插入一条记录,其中 variables
等于 7,sub_variables
等于 9,因为变量 9 基于变量 7.
variables_has_sub_variables table
+-----------+---------------+------+
| variables | sub_variables | sign |
+-----------+---------------+------+
| 7 | 1 | 1 |
| 7 | 2 | -1 |
| 7 | 3 | 1 |
| 8 | 4 | 1 |
| 8 | 5 | 1 |
| 8 | 7 | -1 |
| 9 | 6 | 1 |
| 9 | 7 | -1 |
| 9 | 8 | 1 |
+-----------+---------------+------+
我的 objective 被赋予任何变量(即 1 到 9),确定常量和根变量,其中根变量被定义为不在 variables_has_sub_variables.variables
中(我也可以轻松地 root
列到 variables
(如果需要),这些根变量包括使用我上面的示例数据的 1 到 6。
对根变量这样做更容易,因为没有 sub_variables 并且只是 Y1=+2.86-0.82*X1
.
对变量 7 这样做有点棘手:
Y7=+0.88-0.61*X7
=+0.88-0.61*(+Y1-Y2+Y3)
=+0.88-0.61*(+(+2.86-0.82*X1)-(+2.96-3.49*X2)+( +2.56+2.81*X3))
= -0.62 + 0.50*X1 - 2.13*X2 - 1.71*X3
现在 SQL。下面是我如何创建 tables:
CREATE DATABASE algebra;
USE algebra;
CREATE TABLE `variables` (
`variables` INT NOT NULL,
`slope` DECIMAL(6,2) NOT NULL DEFAULT 1,
`intercept` DECIMAL(6,2) NOT NULL DEFAULT 0,
PRIMARY KEY (`variables`))
ENGINE = InnoDB;
CREATE TABLE `variables_has_sub_variables` (
`variables` INT NOT NULL,
`sub_variables` INT NOT NULL,
`sign` TINYINT NOT NULL,
PRIMARY KEY (`variables`, `sub_variables`),
INDEX `fk_variables_has_variables_variables1_idx` (`sub_variables` ASC),
INDEX `fk_variables_has_variables_variables_idx` (`variables` ASC),
CONSTRAINT `fk_variables_has_variables_variables`
FOREIGN KEY (`variables`)
REFERENCES `variables` (`variables`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `fk_variables_has_variables_variables1`
FOREIGN KEY (`sub_variables`)
REFERENCES `variables` (`variables`)
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;
INSERT INTO variables(variables,intercept,slope) VALUES (1,2.86,-0.82),(2,2.96,-3.49),(3,2.56,2.81),(4,3.04,-3.43),(5,-1.94,4.11),(6,-1.21,-0.62),(7,0.88,-0.61),(8,-2.77,-0.34),(9,1.81,1.65);
INSERT INTO variables_has_sub_variables(variables,sub_variables,sign) VALUES (7,1,1),(7,2,-1),(7,3,1),(8,4,1),(8,5,1),(8,7,-1),(9,6,1),(9,7,-1),(9,8,1);
现在是查询。 XXXX
为以下结果的 7、8 和 9。在每次查询之前,我都会显示我的预期结果。
WITH RECURSIVE t AS (
SELECT v.variables, v.slope, v.intercept
FROM variables v
WHERE v.variables=XXXX
UNION ALL
SELECT v.variables, vhsv.sign*t.slope*v.slope slope, vhsv.sign*t.slope*v.intercept intercept
FROM t
INNER JOIN variables_has_sub_variables vhsv ON vhsv.variables=t.variables
INNER JOIN variables v ON v.variables=vhsv.sub_variables
)
SELECT variables, SUM(slope) constant FROM t GROUP BY variables
UNION SELECT 'intercept' variables, SUM(intercept) intercept FROM t;
需要变量 7
+-----------+----------+
| variables | constant |
+-----------+----------+
| 1 | 0.50 |
| 2 | -2.13 |
| 3 | -1.71 |
| intercept | -0.6206 |
+-----------+----------+
变量 7 实际值
+-----------+----------+
| variables | constant |
+-----------+----------+
| 1 | 0.50 |
| 2 | -2.13 |
| 3 | -1.71 |
| 7 | -0.61 |
| intercept | -0.61 |
+-----------+----------+
5 rows in set (0.00 sec)
需要变量 8
+-----------+-----------+
| variables | constant |
+-----------+-----------+
| 1 | 0.17 |
| 2 | -0.72 |
| 3 | -0.58 |
| 4 | 1.17 |
| 5 | -1.40 |
| intercept | -3.355004 |
+-----------+-----------+
变量 8 实际值
+-----------+----------+
| variables | constant |
+-----------+----------+
| 1 | 0.17 |
| 2 | -0.73 |
| 3 | -0.59 |
| 4 | 1.17 |
| 5 | -1.40 |
| 7 | -0.21 |
| 8 | -0.34 |
| intercept | -3.36 |
+-----------+----------+
8 rows in set (0.00 sec)
需要变量 9
+-----------+------------+
| variables | constant |
+-----------+------------+
| 1 | -0.54 |
| 2 | 2.32 |
| 3 | 1.87 |
| 4 | 1.92 |
| 5 | -2.31 |
| 6 | -1.02 |
| intercept | -4.6982666 |
+-----------+------------+
变量 9 实际值
+-----------+----------+
| variables | constant |
+-----------+----------+
| 1 | -0.55 |
| 2 | 2.33 |
| 3 | 1.88 |
| 4 | 1.92 |
| 5 | -2.30 |
| 6 | -1.02 |
| 7 | 0.67 |
| 8 | -0.56 |
| 9 | 1.65 |
| intercept | -4.67 |
+-----------+----------+
10 rows in set (0.00 sec)
我需要做的就是检测哪些变量不是根变量并过滤掉。这应该如何实现?
回应JNevill的回答:
对于 v.variables 的 9
+-----------+-------+-------+----------+
| variables | depth | path | constant |
+-----------+-------+-------+----------+
| 1 | 3 | 9>7>1 | -0.55 |
| 2 | 3 | 9>7>2 | 2.33 |
| 3 | 3 | 9>7>3 | 1.88 |
| 4 | 3 | 9>8>4 | 1.92 |
| 5 | 3 | 9>8>5 | -2.30 |
| 6 | 2 | 9>6 | -1.02 |
| 7 | 2 | 9>7 | 0.67 |
| 8 | 2 | 9>8 | -0.56 |
| 9 | 1 | 9 | 1.65 |
| intercept | 1 | 9 | -4.67 |
+-----------+-------+-------+----------+
10 rows in set (0.00 sec)
我不会试图完全理解你正在做的事情,我同意@RickJames 在评论中的观点,这感觉可能不是数据库的最佳用例。不过我也有点强迫症。我明白了。
我几乎总是在递归 CTE 中跟踪一些事情。
"Path"。如果我要让查询陷入困境,我想知道它是如何到达终点的。所以我跟踪一条路径,这样我就知道在每次迭代中选择了哪个主键。在递归种子(顶部)中,我使用类似 SELECT CAST(id as varchar(500)) as path...
的东西,在递归成员(底部)中,我使用类似 recursiveCTE.path + '>' + id as path...
的东西
"Depth"。我想知道迭代多深才能得到结果记录。这是通过向递归种子添加 SELECT 1 as depth
和向递归成员添加 recursiveCTE + 1 as depth
来跟踪的。现在我知道每条记录有多深了。
我相信数字 2 会解决您的问题:
WITH RECURSIVE t
AS (
SELECT v.variables,
v.slope,
v.intercept,
1 as depth
FROM variables v
WHERE v.variables = XXXX
UNION ALL
SELECT v.variables,
vhsv.sign * t.slope * v.slope slope,
vhsv.sign * t.slope * v.intercept intercept,
t.depth + 1
FROM t
INNER JOIN variables_has_sub_variables vhsv ON vhsv.variables = t.variables
INNER JOIN variables v ON v.variables = vhsv.sub_variables
)
SELECT variables,
SUM(slope) constant
FROM t
WHERE depth > 1
GROUP BY variables
UNION
SELECT 'intercept' variables,
SUM(intercept) intercept
FROM t;
此处的 WHERE 子句将限制递归结果集中深度为 1 的记录,这意味着它们是从递归 CTE 的递归种子部分引入的(即它们是根)。
不清楚您是否需要从 t
CTE 的第二个 UNION 中删除根。如果是这样,同样的逻辑适用;只需使用 WHERE 子句来限制 1
的 depth
记录
虽然它在这里可能没有帮助,但使用 PATH
的递归 cte 示例是:
WITH RECURSIVE t
AS (
SELECT v.variables,
v.slope,
v.intercept,
1 as depth,
CAST(v.variables as CHAR(30)) as path
FROM variables v
WHERE v.variables = XXXX
UNION ALL
SELECT v.variables,
vhsv.sign * t.slope * v.slope slope,
vhsv.sign * t.slope * v.intercept intercept,
t.depth + 1,
CONCAT(t.path,'>', v.variables)
FROM t
INNER JOIN variables_has_sub_variables vhsv ON vhsv.variables = t.variables
INNER JOIN variables v ON v.variables = vhsv.sub_variables
)
SELECT variables,
SUM(slope) constant
FROM t
WHERE depth > 1
GROUP BY variables
UNION
SELECT 'intercept' variables,
SUM(intercept) intercept
FROM t;
以下架构用于创建简单的代数公式。 variables
用于创建公式,例如 x=3+4y
。 variables_has_sub_variables
用于组合前面提到的公式,并使用 sign
列(将仅为 +1 或 -1)来确定是否应将公式添加或减去组合。
例如,variables
table 可能有以下数据,其中 Implied Formulas
列实际上并不在 table 中,但仅用于说明目的。
变量 table
+-----------+-----------+-------+------------------+
| variables | intercept | slope | Implied Formula |
+-----------+-----------+-------+------------------+
| 1 | 2.86 | -0.82 | Y1=+2.86-0.82*X1 |
| 2 | 2.96 | -3.49 | Y2=+2.96-3.49*X2 |
| 3 | 2.56 | 2.81 | Y3=+2.56+2.81*X3 |
| 4 | 3.04 | -3.43 | Y4=+3.04-3.43*X4 |
| 5 | -1.94 | 4.11 | Y5=-1.94+4.11*X5 |
| 6 | -1.21 | -0.62 | Y6=-1.21-0.62*X6 |
| 7 | 0.88 | -0.61 | Y7=+0.88-0.61*X7 |
| 8 | -2.77 | -0.34 | Y8=-2.77-0.34*X8 |
| 9 | 1.81 | 1.65 | Y9=+1.81+1.65*X9 |
+-----------+-----------+-------+------------------+
然后,给定以下 variables_has_sub_variables
数据,这些变量合并后得到 X7=+Y1-Y2+Y3
、X8=+Y4+Y5-Y7
和 X9=+Y6-Y7+Y8
。接下来 Y7
、Y8
和 Y9
可以使用 variables
table 得到 Y7=+0.88-0.61*X7
等。请注意,应用程序将阻止无限循环,例如插入一条记录,其中 variables
等于 7,sub_variables
等于 9,因为变量 9 基于变量 7.
variables_has_sub_variables table
+-----------+---------------+------+
| variables | sub_variables | sign |
+-----------+---------------+------+
| 7 | 1 | 1 |
| 7 | 2 | -1 |
| 7 | 3 | 1 |
| 8 | 4 | 1 |
| 8 | 5 | 1 |
| 8 | 7 | -1 |
| 9 | 6 | 1 |
| 9 | 7 | -1 |
| 9 | 8 | 1 |
+-----------+---------------+------+
我的 objective 被赋予任何变量(即 1 到 9),确定常量和根变量,其中根变量被定义为不在 variables_has_sub_variables.variables
中(我也可以轻松地 root
列到 variables
(如果需要),这些根变量包括使用我上面的示例数据的 1 到 6。
对根变量这样做更容易,因为没有 sub_variables 并且只是 Y1=+2.86-0.82*X1
.
对变量 7 这样做有点棘手:
Y7=+0.88-0.61*X7
=+0.88-0.61*(+Y1-Y2+Y3)
=+0.88-0.61*(+(+2.86-0.82*X1)-(+2.96-3.49*X2)+( +2.56+2.81*X3))
= -0.62 + 0.50*X1 - 2.13*X2 - 1.71*X3
现在 SQL。下面是我如何创建 tables:
CREATE DATABASE algebra;
USE algebra;
CREATE TABLE `variables` (
`variables` INT NOT NULL,
`slope` DECIMAL(6,2) NOT NULL DEFAULT 1,
`intercept` DECIMAL(6,2) NOT NULL DEFAULT 0,
PRIMARY KEY (`variables`))
ENGINE = InnoDB;
CREATE TABLE `variables_has_sub_variables` (
`variables` INT NOT NULL,
`sub_variables` INT NOT NULL,
`sign` TINYINT NOT NULL,
PRIMARY KEY (`variables`, `sub_variables`),
INDEX `fk_variables_has_variables_variables1_idx` (`sub_variables` ASC),
INDEX `fk_variables_has_variables_variables_idx` (`variables` ASC),
CONSTRAINT `fk_variables_has_variables_variables`
FOREIGN KEY (`variables`)
REFERENCES `variables` (`variables`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `fk_variables_has_variables_variables1`
FOREIGN KEY (`sub_variables`)
REFERENCES `variables` (`variables`)
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;
INSERT INTO variables(variables,intercept,slope) VALUES (1,2.86,-0.82),(2,2.96,-3.49),(3,2.56,2.81),(4,3.04,-3.43),(5,-1.94,4.11),(6,-1.21,-0.62),(7,0.88,-0.61),(8,-2.77,-0.34),(9,1.81,1.65);
INSERT INTO variables_has_sub_variables(variables,sub_variables,sign) VALUES (7,1,1),(7,2,-1),(7,3,1),(8,4,1),(8,5,1),(8,7,-1),(9,6,1),(9,7,-1),(9,8,1);
现在是查询。 XXXX
为以下结果的 7、8 和 9。在每次查询之前,我都会显示我的预期结果。
WITH RECURSIVE t AS (
SELECT v.variables, v.slope, v.intercept
FROM variables v
WHERE v.variables=XXXX
UNION ALL
SELECT v.variables, vhsv.sign*t.slope*v.slope slope, vhsv.sign*t.slope*v.intercept intercept
FROM t
INNER JOIN variables_has_sub_variables vhsv ON vhsv.variables=t.variables
INNER JOIN variables v ON v.variables=vhsv.sub_variables
)
SELECT variables, SUM(slope) constant FROM t GROUP BY variables
UNION SELECT 'intercept' variables, SUM(intercept) intercept FROM t;
需要变量 7
+-----------+----------+
| variables | constant |
+-----------+----------+
| 1 | 0.50 |
| 2 | -2.13 |
| 3 | -1.71 |
| intercept | -0.6206 |
+-----------+----------+
变量 7 实际值
+-----------+----------+
| variables | constant |
+-----------+----------+
| 1 | 0.50 |
| 2 | -2.13 |
| 3 | -1.71 |
| 7 | -0.61 |
| intercept | -0.61 |
+-----------+----------+
5 rows in set (0.00 sec)
需要变量 8
+-----------+-----------+
| variables | constant |
+-----------+-----------+
| 1 | 0.17 |
| 2 | -0.72 |
| 3 | -0.58 |
| 4 | 1.17 |
| 5 | -1.40 |
| intercept | -3.355004 |
+-----------+-----------+
变量 8 实际值
+-----------+----------+
| variables | constant |
+-----------+----------+
| 1 | 0.17 |
| 2 | -0.73 |
| 3 | -0.59 |
| 4 | 1.17 |
| 5 | -1.40 |
| 7 | -0.21 |
| 8 | -0.34 |
| intercept | -3.36 |
+-----------+----------+
8 rows in set (0.00 sec)
需要变量 9
+-----------+------------+
| variables | constant |
+-----------+------------+
| 1 | -0.54 |
| 2 | 2.32 |
| 3 | 1.87 |
| 4 | 1.92 |
| 5 | -2.31 |
| 6 | -1.02 |
| intercept | -4.6982666 |
+-----------+------------+
变量 9 实际值
+-----------+----------+
| variables | constant |
+-----------+----------+
| 1 | -0.55 |
| 2 | 2.33 |
| 3 | 1.88 |
| 4 | 1.92 |
| 5 | -2.30 |
| 6 | -1.02 |
| 7 | 0.67 |
| 8 | -0.56 |
| 9 | 1.65 |
| intercept | -4.67 |
+-----------+----------+
10 rows in set (0.00 sec)
我需要做的就是检测哪些变量不是根变量并过滤掉。这应该如何实现?
回应JNevill的回答: 对于 v.variables 的 9
+-----------+-------+-------+----------+
| variables | depth | path | constant |
+-----------+-------+-------+----------+
| 1 | 3 | 9>7>1 | -0.55 |
| 2 | 3 | 9>7>2 | 2.33 |
| 3 | 3 | 9>7>3 | 1.88 |
| 4 | 3 | 9>8>4 | 1.92 |
| 5 | 3 | 9>8>5 | -2.30 |
| 6 | 2 | 9>6 | -1.02 |
| 7 | 2 | 9>7 | 0.67 |
| 8 | 2 | 9>8 | -0.56 |
| 9 | 1 | 9 | 1.65 |
| intercept | 1 | 9 | -4.67 |
+-----------+-------+-------+----------+
10 rows in set (0.00 sec)
我不会试图完全理解你正在做的事情,我同意@RickJames 在评论中的观点,这感觉可能不是数据库的最佳用例。不过我也有点强迫症。我明白了。
我几乎总是在递归 CTE 中跟踪一些事情。
"Path"。如果我要让查询陷入困境,我想知道它是如何到达终点的。所以我跟踪一条路径,这样我就知道在每次迭代中选择了哪个主键。在递归种子(顶部)中,我使用类似
SELECT CAST(id as varchar(500)) as path...
的东西,在递归成员(底部)中,我使用类似recursiveCTE.path + '>' + id as path...
的东西
"Depth"。我想知道迭代多深才能得到结果记录。这是通过向递归种子添加
SELECT 1 as depth
和向递归成员添加recursiveCTE + 1 as depth
来跟踪的。现在我知道每条记录有多深了。
我相信数字 2 会解决您的问题:
WITH RECURSIVE t
AS (
SELECT v.variables,
v.slope,
v.intercept,
1 as depth
FROM variables v
WHERE v.variables = XXXX
UNION ALL
SELECT v.variables,
vhsv.sign * t.slope * v.slope slope,
vhsv.sign * t.slope * v.intercept intercept,
t.depth + 1
FROM t
INNER JOIN variables_has_sub_variables vhsv ON vhsv.variables = t.variables
INNER JOIN variables v ON v.variables = vhsv.sub_variables
)
SELECT variables,
SUM(slope) constant
FROM t
WHERE depth > 1
GROUP BY variables
UNION
SELECT 'intercept' variables,
SUM(intercept) intercept
FROM t;
此处的 WHERE 子句将限制递归结果集中深度为 1 的记录,这意味着它们是从递归 CTE 的递归种子部分引入的(即它们是根)。
不清楚您是否需要从 t
CTE 的第二个 UNION 中删除根。如果是这样,同样的逻辑适用;只需使用 WHERE 子句来限制 1
depth
记录
虽然它在这里可能没有帮助,但使用 PATH
的递归 cte 示例是:
WITH RECURSIVE t
AS (
SELECT v.variables,
v.slope,
v.intercept,
1 as depth,
CAST(v.variables as CHAR(30)) as path
FROM variables v
WHERE v.variables = XXXX
UNION ALL
SELECT v.variables,
vhsv.sign * t.slope * v.slope slope,
vhsv.sign * t.slope * v.intercept intercept,
t.depth + 1,
CONCAT(t.path,'>', v.variables)
FROM t
INNER JOIN variables_has_sub_variables vhsv ON vhsv.variables = t.variables
INNER JOIN variables v ON v.variables = vhsv.sub_variables
)
SELECT variables,
SUM(slope) constant
FROM t
WHERE depth > 1
GROUP BY variables
UNION
SELECT 'intercept' variables,
SUM(intercept) intercept
FROM t;