在 MySQL 中检索分层数据中的树

Retrieving Tree in Hierarchical data in MySQL

我存储了一些类别的分层数据,其中每个类别都与其他类别相关,诀窍是一个类别可以有多个 parents(最多 3 个,最小 0 个)。

table 结构是:

类别 table id - 主键

名称 - 类别名称

ref_id - 用于关系的参考 ID

id name ref_id
1 everything -1
2 computing 0
3 artificial intelligence 1
4 data science 2
5 machine learning (ML) 3
6 programming 4
7 web technologies 5
8 programming languages 7
9 content technologies 8
10 operating systems 9
11 algorithms 10
12 software development systems 102

category_relation table

id child_ref_id parent_ref_id
1 0 -1
2 1 0
3 2 0
4 3 1
5 3 2
6 4 102
7 5 0
8 7 4
9 8 0
10 9 0
11 10 0
12 10 4
13 102 0

如图所示,关系相当复杂,algorithms 有两个 parents computing编程,同样机器学习(ML)也有两个parents人工智能数据科学

如何检索特定类别的所有 children,例如计算,我需要检索所有 children 直到第三级,即编程语言和算法。

MySQL 数据库转储:https://github.com/codersrb/multi-parent-hierarchy/blob/main/taxonomy.sql

假设数据结构固定好PK,在MySQL8.x你可以做:

with recursive
n (id, name, ref_id, lvl) as (
  select id, name, ref_id, 1 from category where id = 2 -- starting node
 union all
  select c.id, c.name, c.ref_id, n.lvl + 1
  from n
  join category_relation r on r.parent_ref_id = n.ref_id
  join category c on c.ref_id = r.child_ref_id
)
select * from n where lvl <= 3

结果:

 id   name                                    ref_id  lvl 
 ---- --------------------------------------- ------- --- 
 2    computing                               0       1   
 3    artificial intelligence                 1       2   
 4    data science                            2       2   
 7    web technologies                        5       2   
 9    content technologies                    8       2   
 10   operating systems                       9       2   
 11   algorithms                              10      2   
 62   information science                     61      2   
 103  software / systems development          102     2   
 165  scientific computing                    165     2   
 296  image processing                        316     2   
 297  text processing                         317     2   
 301  Google                                  321     2   
 322  computer vision                         343     2   
 5    machine learning (ML)                   3       3   
 5    machine learning (ML)                   3       3   
 6    programming                             4       3   
 18   models                                  17      3   
 21   classification                          20      3   
 27   data preparation                        26      3   
 28   data analysis                           27      3   
 29   imbalanced datasets                     28      3   
 50   visualization                           49      3   
 61   information retrieval                   60      3   
 68   k-means                                 67      3   
 71   Random Forest algorithm                 70      3   
 104  project management                      103     3   
 105  software development methodologies      104     3   
 107  web development                         106     3   
 113  kNN model                               112     3   
 132  CRISP-DM methodology                    131     3   
 143  data                                    142     3   
 153  SMOTE                                   153     3   
 154  MSMOTE                                  154     3   
 157  backward feature elimination            157     3   
 158  forward feature selection               158     3   
 176  deep feature synthesis (DFS)            177     3   
 196  unsupervised learning                   197     3   
 210  mean-shift                              211     3   
 212  DBSCAN                                  213     3   
 246  naïve Bayes algorithm                   247     3   
 248  decision tree algorithm                 249     3   
 249  support vector machine (SVM) algorithm  250     3   
 251  neural networks                         252     3   
 252  artificial neural networks (ANN)        253     3   
 281  deep learning                           300     3   
 281  deep learning                           300     3   
 285  image classification                    304     3   
 285  image classification                    304     3   
 286  natural language processing (NLP)       305     3   
 286  natural language processing (NLP)       305     3   
 288  text representation                     307     3   
 294  visual recognition                      314     3   
 295  optical character recognition (OCR)     315     3   
 295  optical character recognition (OCR)     315     3   
 296  image processing                        316     3   
 298  machine translation (MT)                318     3   
 299  speech recognition                      319     3   
 300  TensorFlow                              320     3   
 302  R                                       322     3   
 304  Android                                 324     3   
 322  computer vision                         343     3   
 323  object detection                        344     3   
 324  instance segmentation                   345     3   
 325  edge detection                          346     3   
 326  image filters                           347     3   
 327  feature maps                            348     3   
 328  stride                                  349     3   
 329  padding                                 350     3   
 335  text preprocessing                      356     3   
 336  tokenization                            357     3   
 337  case normalization                      358     3   
 338  removing punctuation                    359     3   
 339  stop words                              360     3   
 340  stemming                                361     3   
 341  lemmatization                           362     3   
 342  Porter algorithm                        363     3   
 350  word2vec                                371     3   
 351  Skip-gram                               372     3   
 364  convnets                                385     3   
 404  multiplicative update algorithm         716     3   

如果要删除重复项,可以使用 DISTINCT。例如:

with recursive
n (id, name, ref_id, lvl) as (
  select id, name, ref_id, 1 from category where id = 2 -- starting node
 union all
  select c.id, c.name, c.ref_id, n.lvl + 1
  from n
  join category_relation r on r.parent_ref_id = n.ref_id
  join category c on c.ref_id = r.child_ref_id
)
select distinct * from n where lvl <= 3

参见 DB Fiddle 中的 运行 示例。