Parsing SQL query joins with python
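I am trying to parse a SQL query. I am using [moz-sql-parser][1]
to identify the parts of the query, and then writing functions to extract the table names and columns used in the joins.
Here is a sample query: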
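import json
from moz_sql_parser import parse

# Round-trip through JSON so the parse result is made of plain dicts and lists
join_query2 = json.dumps(parse('''select * from tbl d
    inner join jointbl1 c
        on d.visit_id = c.session_id
    inner join jointbl2 b
        on b.sv_id = c.sv_id'''))
join_query2 = json.loads(join_query2)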
Running it through moz-sql-parser produces:
{'select': '*',
'from': [{'value': 'tbl', 'name': 'd'},
{'inner join': {'name': 'c',
'value': 'jointbl1'},
'on': {'eq': ['d.visit_id', 'c.session_id']}},
{'inner join': {'name': 'b',
'value': 'jointbl2'},
'on': {'eq': ['b.sv_id', 'c.sv_id']}}]}
I have written functions that extract the table names and the column names:
def parse_table_names_v2(result):
    the_list = []
    for x in result['from']:
        try:
            if 'value' in x:  # the main table: alias first, then table name
                if 'name' in x:
                    the_list.append(x.get('name', None))
                the_list.append(x.get('value'))
            elif 'join' in x:
                join = x['join']
                if 'value' in join:
                    if 'name' in join:
                        the_list.append(join.get('name'))
                    the_list.append(join.get('value'))
            elif 'inner join' in x:
                inner_join = x['inner join']
                if 'value' in inner_join:
                    if 'name' in inner_join:
                        the_list.append(inner_join.get('name'))
                    the_list.append(inner_join.get('value'))
        except Exception as e:
            print(e)
    return the_list
def parse_column_names(result):
    columns = []
    for x in result['from']:
        try:
            if 'on' in x:
                on = x['on']
                if 'and' in on:
                    # several conditions joined by AND
                    for cond in on['and']:
                        if 'eq' in cond:
                            columns.append(cond['eq'])
                elif 'eq' in on:
                    # a single equality condition
                    columns.append(on['eq'])
        except Exception as e:
            print(e)
    return columns
These produce two lists like the following:
['d',
'tbl1',
'c',
'jointbl1',
'b',
'jointbl2']
and
[['d.visit_id', 'c.session_id'], ['b.sv_id', 'c.sv_id']]
The tricky part is that the desired output looks like this:
Row1 -> tbl1 visit_id jointbl1 session_id
Row2 -> jointbl1 sv_id jointbl2 sv_id
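My goal is to parse queries like this one and build the output as a dataframe or list, but I am struggling to get the parsed result into this specific shape. Any leads would be appreciated.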
Does this work for what you are trying to do?
tables = ['d',
          'tbl1',
          'c',
          'jointbl1',
          'b',
          'jointbl2']
columns = [['d.visit_id', 'c.session_id'], ['b.sv_id', 'c.sv_id']]

# Convert table list to a lookup table
lookup_table = {}
alias = ""
tablename = ""
for idx, item in enumerate(tables):
    if idx % 2 != 1:
        alias = item
    else:
        tablename = item
        lookup_table[alias] = tablename

# Use the lookup table to build the new row format
new_rows = []
for row in columns:
    new_row = []
    for elem in row:
        item = elem.split('.')
        col_table = item[0]
        column = item[1]
        new_row.append(lookup_table[col_table])
        new_row.append(column)
    new_rows.append(new_row)

for row in new_rows:
    print(" ".join(row))
Output:
tbl1 visit_id jointbl1 session_id
jointbl2 sv_id jointbl1 sv_id
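Note that the second row above comes out as jointbl2 first, while the question asked for jointbl1 first. If that ordering matters, one option is to skip the intermediate flat lists and work from the parsed dict directly. The following is only a sketch under some assumptions: `join_rows` is a hypothetical helper name, it assumes every join key in the parse result ends with "join" (as 'inner join' does in the output shown in the question), it assumes an entry without an alias can be looked up by its table name, and it orders the two sides of each equality by the position at which their tables appear in the query.

```python
def join_rows(parsed):
    """Return one [table, column, table, column] row per equality in the ON clauses."""
    entries = parsed["from"]
    if not isinstance(entries, list):
        entries = [entries]

    aliases = {}      # alias -> table name, in order of appearance
    conditions = []   # every condition found in an ON clause
    for entry in entries:
        if isinstance(entry, str):          # bare table name, no alias
            aliases[entry] = entry
            continue
        # The table spec is the entry itself, or nested under a "... join" key
        # (assumption: all join keys end with "join").
        spec = entry
        for key, value in entry.items():
            if key.endswith("join"):
                spec = value
        if isinstance(spec, str):
            aliases[spec] = spec
        elif "value" in spec:
            aliases[spec.get("name", spec["value"])] = spec["value"]
        on = entry.get("on", {})
        conditions.extend(on["and"] if "and" in on else [on])

    order = list(aliases)
    rows = []
    for cond in conditions:
        sides = cond.get("eq") if isinstance(cond, dict) else None
        # Only handle alias.column = alias.column comparisons.
        if not sides or not all(isinstance(s, str) and "." in s for s in sides):
            continue
        pairs = [s.split(".", 1) for s in sides]
        if any(alias not in aliases for alias, _ in pairs):
            continue
        # Put the table that appears earlier in the query first.
        pairs.sort(key=lambda p: order.index(p[0]))
        rows.append([part for alias, col in pairs for part in (aliases[alias], col)])
    return rows


# The parse result shown in the question, written out as a literal.
parsed = {'select': '*',
          'from': [{'value': 'tbl', 'name': 'd'},
                   {'inner join': {'name': 'c', 'value': 'jointbl1'},
                    'on': {'eq': ['d.visit_id', 'c.session_id']}},
                   {'inner join': {'name': 'b', 'value': 'jointbl2'},
                    'on': {'eq': ['b.sv_id', 'c.sv_id']}}]}

for row in join_rows(parsed):
    print(" ".join(row))
# tbl visit_id jointbl1 session_id
# jointbl1 sv_id jointbl2 sv_id
```

Because each row is a plain list of four strings, the result can be passed to `pandas.DataFrame(rows, columns=[...])` with column names of your choosing if a dataframe is preferred.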