Retrieving a tree from hierarchical data in MySQL
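I store hierarchical data for a set of categories, where each category is related to other categories. The catch is that a category can have multiple parents (at most 3, at least 0).
The table structure is:
category table
id - primary key
name - category name
ref_id - reference ID used for relationships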
id | name | ref_id |
---|---|---|
1 | everything | -1 |
2 | computing | 0 |
3 | artificial intelligence | 1 |
4 | data science | 2 |
5 | machine learning (ML) | 3 |
6 | programming | 4 |
7 | web technologies | 5 |
8 | programming languages | 7 |
9 | content technologies | 8 |
10 | operating systems | 9 |
11 | algorithms | 10 |
12 | software development systems | 102 |
category_relation table
id | child_ref_id | parent_ref_id |
---|---|---|
1 | 0 | -1 |
2 | 1 | 0 |
3 | 2 | 0 |
4 | 3 | 1 |
5 | 3 | 2 |
6 | 4 | 102 |
7 | 5 | 0 |
8 | 7 | 4 |
9 | 8 | 0 |
10 | 9 | 0 |
11 | 10 | 0 |
12 | 10 | 4 |
13 | 102 | 0 |
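As you can see, the relationships are fairly complex: algorithms has two parents, computing and programming, and machine learning (ML) likewise has two parents, artificial intelligence and data science.
How can I retrieve all the children of a particular category? For example, for computing I need to retrieve all children down to the third level, i.e. as far as programming languages and algorithms.
MySQL database dump: https://github.com/codersrb/multi-parent-hierarchy/blob/main/taxonomy.sql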
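To make the multi-parent relationships concrete, here is a minimal sketch that loads the sample rows above into an in-memory SQLite database and lists the parents of a category. SQLite is only a stand-in so that the example is self-contained; the join is plain SQL and should behave the same way in MySQL. Note that category_relation refers to category.ref_id, not to category.id. The helper names build_sample_db and parents_of are made up for illustration.

```python
import sqlite3

# Sample rows copied from the question: (id, name, ref_id).
CATEGORIES = [
    (1, "everything", -1), (2, "computing", 0),
    (3, "artificial intelligence", 1), (4, "data science", 2),
    (5, "machine learning (ML)", 3), (6, "programming", 4),
    (7, "web technologies", 5), (8, "programming languages", 7),
    (9, "content technologies", 8), (10, "operating systems", 9),
    (11, "algorithms", 10), (12, "software development systems", 102),
]
# Sample rows copied from the question: (id, child_ref_id, parent_ref_id).
RELATIONS = [
    (1, 0, -1), (2, 1, 0), (3, 2, 0), (4, 3, 1), (5, 3, 2),
    (6, 4, 102), (7, 5, 0), (8, 7, 4), (9, 8, 0), (10, 9, 0),
    (11, 10, 0), (12, 10, 4), (13, 102, 0),
]


def build_sample_db():
    """Create an in-memory database holding the sample rows (illustrative helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT, ref_id INTEGER)"
    )
    conn.execute(
        "CREATE TABLE category_relation "
        "(id INTEGER PRIMARY KEY, child_ref_id INTEGER, parent_ref_id INTEGER)"
    )
    conn.executemany("INSERT INTO category VALUES (?, ?, ?)", CATEGORIES)
    conn.executemany("INSERT INTO category_relation VALUES (?, ?, ?)", RELATIONS)
    return conn


def parents_of(conn, name):
    """Return the names of the direct parents of the named category, sorted."""
    sql = """
        SELECT p.name
        FROM category c
        JOIN category_relation r ON r.child_ref_id = c.ref_id
        JOIN category p ON p.ref_id = r.parent_ref_id
        WHERE c.name = ?
        ORDER BY p.name
    """
    return [row[0] for row in conn.execute(sql, (name,))]


if __name__ == "__main__":
    conn = build_sample_db()
    for name in ("algorithms", "machine learning (ML)"):
        print(name, "->", parents_of(conn, name))
```

On the sample rows this lists computing and programming for algorithms, and artificial intelligence and data science for machine learning (ML), matching the description above.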
Assuming the data structure is fixed with proper PKs, in MySQL 8.x you can use a recursive CTE:
with recursive
n (id, name, ref_id, lvl) as (
  -- anchor member: the starting node, counted as level 1
  select id, name, ref_id, 1 from category where id = 2
  union all
  -- recursive member: children of the rows found so far, one level deeper
  select c.id, c.name, c.ref_id, n.lvl + 1
  from n
  join category_relation r on r.parent_ref_id = n.ref_id
  join category c on c.ref_id = r.child_ref_id
)
select * from n where lvl <= 3;
Result:
id name ref_id lvl
---- --------------------------------------- ------- ---
2 computing 0 1
3 artificial intelligence 1 2
4 data science 2 2
7 web technologies 5 2
9 content technologies 8 2
10 operating systems 9 2
11 algorithms 10 2
62 information science 61 2
103 software / systems development 102 2
165 scientific computing 165 2
296 image processing 316 2
297 text processing 317 2
301 Google 321 2
322 computer vision 343 2
5 machine learning (ML) 3 3
5 machine learning (ML) 3 3
6 programming 4 3
18 models 17 3
21 classification 20 3
27 data preparation 26 3
28 data analysis 27 3
29 imbalanced datasets 28 3
50 visualization 49 3
61 information retrieval 60 3
68 k-means 67 3
71 Random Forest algorithm 70 3
104 project management 103 3
105 software development methodologies 104 3
107 web development 106 3
113 kNN model 112 3
132 CRISP-DM methodology 131 3
143 data 142 3
153 SMOTE 153 3
154 MSMOTE 154 3
157 backward feature elimination 157 3
158 forward feature selection 158 3
176 deep feature synthesis (DFS) 177 3
196 unsupervised learning 197 3
210 mean-shift 211 3
212 DBSCAN 213 3
246 naïve Bayes algorithm 247 3
248 decision tree algorithm 249 3
249 support vector machine (SVM) algorithm 250 3
251 neural networks 252 3
252 artificial neural networks (ANN) 253 3
281 deep learning 300 3
281 deep learning 300 3
285 image classification 304 3
285 image classification 304 3
286 natural language processing (NLP) 305 3
286 natural language processing (NLP) 305 3
288 text representation 307 3
294 visual recognition 314 3
295 optical character recognition (OCR) 315 3
295 optical character recognition (OCR) 315 3
296 image processing 316 3
298 machine translation (MT) 318 3
299 speech recognition 319 3
300 TensorFlow 320 3
302 R 322 3
304 Android 324 3
322 computer vision 343 3
323 object detection 344 3
324 instance segmentation 345 3
325 edge detection 346 3
326 image filters 347 3
327 feature maps 348 3
328 stride 349 3
329 padding 350 3
335 text preprocessing 356 3
336 tokenization 357 3
337 case normalization 358 3
338 removing punctuation 359 3
339 stop words 360 3
340 stemming 361 3
341 lemmatization 362 3
342 Porter algorithm 363 3
350 word2vec 371 3
351 Skip-gram 372 3
364 convnets 385 3
404 multiplicative update algorithm 716 3
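Two things are worth noting about the query above. The lvl <= 3 filter is applied to the finished CTE, so the recursion itself is not limited by it. And since the starting node counts as level 1, lvl <= 3 reaches only two steps below it. A possible variant, sketched here, moves the limit into the recursive member so that the walk stops at the requested depth. To stay self-contained, the sketch runs on the sample rows from the question in an in-memory SQLite database instead of MySQL (an assumption made for runnability; the CTE is written in syntax that MySQL 8 should also accept). The names build_sample_db, descendants and max_lvl are illustrative and not part of the original answer.

```python
import sqlite3

# Sample rows copied from the question: (id, name, ref_id).
CATEGORIES = [
    (1, "everything", -1), (2, "computing", 0),
    (3, "artificial intelligence", 1), (4, "data science", 2),
    (5, "machine learning (ML)", 3), (6, "programming", 4),
    (7, "web technologies", 5), (8, "programming languages", 7),
    (9, "content technologies", 8), (10, "operating systems", 9),
    (11, "algorithms", 10), (12, "software development systems", 102),
]
# Sample rows copied from the question: (id, child_ref_id, parent_ref_id).
RELATIONS = [
    (1, 0, -1), (2, 1, 0), (3, 2, 0), (4, 3, 1), (5, 3, 2),
    (6, 4, 102), (7, 5, 0), (8, 7, 4), (9, 8, 0), (10, 9, 0),
    (11, 10, 0), (12, 10, 4), (13, 102, 0),
]


def build_sample_db():
    """Create an in-memory database holding the sample rows (illustrative helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT, ref_id INTEGER)"
    )
    conn.execute(
        "CREATE TABLE category_relation "
        "(id INTEGER PRIMARY KEY, child_ref_id INTEGER, parent_ref_id INTEGER)"
    )
    conn.executemany("INSERT INTO category VALUES (?, ?, ?)", CATEGORIES)
    conn.executemany("INSERT INTO category_relation VALUES (?, ?, ?)", RELATIONS)
    return conn


def descendants(conn, start_id, max_lvl):
    """Return (id, name, ref_id, lvl) rows; the start node itself is level 1."""
    sql = """
        WITH RECURSIVE n (id, name, ref_id, lvl) AS (
            SELECT id, name, ref_id, 1 FROM category WHERE id = ?
            UNION ALL
            SELECT c.id, c.name, c.ref_id, n.lvl + 1
            FROM n
            JOIN category_relation r ON r.parent_ref_id = n.ref_id
            JOIN category c ON c.ref_id = r.child_ref_id
            WHERE n.lvl < ?  -- stop expanding once max_lvl is reached
        )
        SELECT id, name, ref_id, lvl FROM n ORDER BY lvl, id
    """
    return conn.execute(sql, (start_id, max_lvl)).fetchall()


if __name__ == "__main__":
    conn = build_sample_db()
    for max_lvl in (3, 4):
        deepest = [name for _, name, _, lvl in descendants(conn, 2, max_lvl)
                   if lvl == max_lvl]
        print(max_lvl, deepest)
```

On the sample rows, level 3 ends at machine learning (ML) and programming, while programming languages only appears with max_lvl = 4. So if "third level" in the question means three steps below computing, the limit would need to be 4 rather than 3.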
If you want to remove duplicates, you can use DISTINCT. For example:
with recursive
n (id, name, ref_id, lvl) as (
  -- anchor member: the starting node, counted as level 1
  select id, name, ref_id, 1 from category where id = 2
  union all
  -- recursive member: children of the rows found so far, one level deeper
  select c.id, c.name, c.ref_id, n.lvl + 1
  from n
  join category_relation r on r.parent_ref_id = n.ref_id
  join category c on c.ref_id = r.child_ref_id
)
select distinct * from n where lvl <= 3;
See the running example at DB Fiddle.
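One caveat: DISTINCT compares whole rows, lvl included, so a category reached at two different depths is still returned once per depth. In the result listing above, image processing (id 296) appears at both lvl 2 and lvl 3. If one row per category is wanted, one option is to group by category and keep the smallest level. The sketch below tries this on the sample rows from the question, using an in-memory SQLite database as a stand-in for MySQL (an assumption made so the example runs on its own). The names build_sample_db, shallowest_levels, max_lvl and min_lvl are hypothetical.

```python
import sqlite3

# Sample rows copied from the question: (id, name, ref_id).
CATEGORIES = [
    (1, "everything", -1), (2, "computing", 0),
    (3, "artificial intelligence", 1), (4, "data science", 2),
    (5, "machine learning (ML)", 3), (6, "programming", 4),
    (7, "web technologies", 5), (8, "programming languages", 7),
    (9, "content technologies", 8), (10, "operating systems", 9),
    (11, "algorithms", 10), (12, "software development systems", 102),
]
# Sample rows copied from the question: (id, child_ref_id, parent_ref_id).
RELATIONS = [
    (1, 0, -1), (2, 1, 0), (3, 2, 0), (4, 3, 1), (5, 3, 2),
    (6, 4, 102), (7, 5, 0), (8, 7, 4), (9, 8, 0), (10, 9, 0),
    (11, 10, 0), (12, 10, 4), (13, 102, 0),
]


def build_sample_db():
    """Create an in-memory database holding the sample rows (illustrative helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT, ref_id INTEGER)"
    )
    conn.execute(
        "CREATE TABLE category_relation "
        "(id INTEGER PRIMARY KEY, child_ref_id INTEGER, parent_ref_id INTEGER)"
    )
    conn.executemany("INSERT INTO category VALUES (?, ?, ?)", CATEGORIES)
    conn.executemany("INSERT INTO category_relation VALUES (?, ?, ?)", RELATIONS)
    return conn


def shallowest_levels(conn, start_id, max_lvl):
    """Return one (id, name, ref_id, min_lvl) row per reachable category."""
    sql = """
        WITH RECURSIVE n (id, name, ref_id, lvl) AS (
            SELECT id, name, ref_id, 1 FROM category WHERE id = ?
            UNION ALL
            SELECT c.id, c.name, c.ref_id, n.lvl + 1
            FROM n
            JOIN category_relation r ON r.parent_ref_id = n.ref_id
            JOIN category c ON c.ref_id = r.child_ref_id
            WHERE n.lvl < ?  -- depth limit applied inside the recursion
        )
        SELECT id, name, ref_id, MIN(lvl) AS min_lvl
        FROM n
        GROUP BY id, name, ref_id
        ORDER BY min_lvl, id
    """
    return conn.execute(sql, (start_id, max_lvl)).fetchall()


if __name__ == "__main__":
    conn = build_sample_db()
    for row in shallowest_levels(conn, 2, 4):
        print(row)
```

In the sample rows, algorithms is reachable both directly from computing and through programming; with this grouping it is reported once, at the shallower level.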