Extract all records from a JSON column, using JSON type

I have a couple of tables (see the reproducible code at the bottom):

tbl1_have

id  json_col                             
1   {"a_i":"a","a_j":1}
1   {"a_i":"b","a_j":2}
2   {"a_i":"c","a_j":3}
2   {"a_i":"d","a_j":4}

tbl2_have

id  json_col                          
1   [{"a_i":"a","a_j":1},{"a_i":"b","a_j":2}]
2   [{"a_i":"c","a_j":3},{"a_i":"d","a_j":4}]

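I would like to extract all JSON columns without providing an explicit data type conversion for each column, because in my use case the names and the number of the nested attributes vary.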

The expected output is the same for both cases:

tbl_want

id  a_i a_j                             
1   a   1
1   b   2
2   c   3
2   d   4

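Here a_i and a_j are correctly stored as a character column and a numeric column, which means I would like to map JSON types to SQL types (say, INT and VARCHAR()) automatically.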

The following gets me halfway there for both tables:

SELECT id, a_i, a_j FROM tbl2_have CROSS APPLY OPENJSON(json_col) 
WITH(a_i VARCHAR(100), a_j INT)

  id a_i a_j
1  1   a   1
2  1   b   2
3  2   c   3
4  2   d   4

How can I avoid having to state the types explicitly in WITH()?


Reproducible code:

CREATE TABLE tbl1_have (id INT, json_col VARCHAR(100))
INSERT INTO tbl1_have VALUES 
(1,   '{"a_i":"a","a_j":1}'),
(1,   '{"a_i":"b","a_j":2}'),
(2,   '{"a_i":"c","a_j":3}'),
(2,   '{"a_i":"d","a_j":4}')

CREATE TABLE tbl2_have (id INT, json_col VARCHAR(100))
INSERT INTO tbl2_have VALUES 
(1,   '[{"a_i":"a","a_j":1},{"a_i":"b","a_j":2}]'),
(2,   '[{"a_i":"c","a_j":3},{"a_i":"d","a_j":4}]')

SELECT id, a_i, a_j FROM tbl1_have CROSS APPLY OPENJSON(json_col) 
WITH(a_i VARCHAR(100), a_j INT)

SELECT id, a_i, a_j FROM tbl2_have CROSS APPLY OPENJSON(json_col) 
WITH(a_i VARCHAR(100), a_j INT)

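Would using the values returned from OPENJSON work for you? Everything would likely map to a string data type, but you would not need to know the types in advance. The official documentation for the OPENJSON rowset function says it returns key/value pairs together with a type for each parsed item. The type value may be useful, although it only reflects the JSON data type determined at parse time. I would bet that value is always a string type, because it has to be.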

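;WITH X AS
(
    -- First pass: one row per array element ([Key] = array index, [Value] = the JSON object)
    SELECT id, a_i=J.[Key], a_j=J.[Value]  FROM tbl2_have CROSS APPLY OPENJSON(json_col) J
)

-- Second pass: open each object and pivot its key/value pairs back into columns
SELECT 
    id, 
    a_i=MAX(CASE WHEN J.[Key]='a_i' THEN J.[Value] ELSE NULL END), 
    a_j=MAX(CASE WHEN J.[Key]='a_j' THEN J.[Value] ELSE NULL END) 
FROM X CROSS APPLY OPENJSON(X.a_j) J
GROUP BY
    id,a_i,a_j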
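
To make the default schema more concrete, here is a small sketch on a made-up JSON literal. The keys flag, nothing and nested are assumptions added purely for illustration; they are not part of the question's data. Without a WITH clause, OPENJSON returns the columns [key], [value] and [type], where [type] is a numeric code for the JSON type of each value.

```sql
-- Default OPENJSON schema: one row per top-level key.
-- [type] codes: 0 = null, 1 = string, 2 = number, 3 = true/false, 4 = array, 5 = object
SELECT [key], [value], [type]
FROM OPENJSON(N'{"a_i":"a","a_j":1,"flag":true,"nothing":null,"nested":{"x":1}}');
```

The [value] column comes back as text regardless of the JSON type, so any conversion to INT, BIT and so on still has to happen afterwards, for example driven by [type].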

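I found a solution that may work for your use case. I am by no means an SQL expert, and I did not manage to detect the data types of the dynamic columns automatically, but I did find a solution for your two examples.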

First, I tried to get all column names from json_col dynamically. I found an answer on Stack Overflow and ended up with this code:

STUFF(
    (
        SELECT DISTINCT ', '+QUOTENAME(columnname) FROM #tmpTbl FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'), 1, 1, '');

This outputs all column names as a comma-separated string, in your example: ' [a_i], [a_j]'. That string can then be used to SELECT the columns dynamically.

As mentioned above, I was not able to write a data type detection algorithm. I simply hardcoded nvarchar(100) as the data type of every column.

To get the column names dynamically together with their data type (hardcoded as nvarchar(100)), I used a slightly modified version of the query above:

STUFF(
    (
        SELECT DISTINCT ', '+QUOTENAME(columnname)+' nvarchar(100)' FROM #tmpTbl FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'), 1, 1, '');

I then simply used both strings in the WITH clause.
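
As a side note, on SQL Server 2017 and later the same comma-separated list can probably be built with STRING_AGG instead of the STUFF / FOR XML PATH idiom. The sketch below is not part of the original answer; it uses an inline VALUES list (an assumption standing in for #tmpTbl) so that it runs on its own.

```sql
DECLARE @cols NVARCHAR(MAX);

-- STRING_AGG has no DISTINCT option, so de-duplicate in a derived table first
SELECT @cols = STRING_AGG(QUOTENAME(columnname), ', ') WITHIN GROUP (ORDER BY columnname)
FROM (
    SELECT DISTINCT columnname
    FROM (VALUES (N'a_i'), (N'a_j'), (N'a_i')) AS t(columnname)  -- stand-in for #tmpTbl
) AS d;

SELECT @cols AS cols;  -- [a_i], [a_j]
```

Unlike the STUFF version, this produces no leading separator, so nothing needs to be trimmed.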


Full version for table tbl1_have:

DECLARE @cols NVARCHAR(MAX), @colsWithType NVARCHAR(MAX), @query NVARCHAR(MAX);
DROP TABLE IF EXISTS  #tmpTbl

SELECT outerTable.[id] AS columnid, innerTable.[key] AS columnname, innerTable.[value] AS columnvalue
    INTO #tmpTbl
    FROM tbl1_have outerTable CROSS APPLY OPENJSON(json_col) AS innerTable


SELECT * FROM #tmpTbl
SET @cols = STUFF(
    (
        SELECT DISTINCT ', '+QUOTENAME(columnname) FROM #tmpTbl FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'), 1, 1, '');
SET @colsWithType = STUFF(
    (
        SELECT DISTINCT ', '+QUOTENAME(columnname)+' nvarchar(100)' FROM #tmpTbl FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'), 1, 1, '');

SET @query = N'SELECT id, '+@cols+' FROM tbl1_have CROSS APPLY OPENJSON(json_col) 
WITH('+@colsWithType+')';

exec sp_executesql @query

Full version for table tbl2_have:

DECLARE @cols NVARCHAR(MAX), @colsWithType NVARCHAR(MAX), @query NVARCHAR(MAX);
DROP TABLE IF EXISTS  #tmpTbl
DROP TABLE IF EXISTS  #tmpTbl2
SELECT *
    INTO #tmpTbl
    FROM tbl2_have CROSS APPLY OPENJSON(json_col)

SELECT outerTable.[id] AS columnid, innerTable.[key] AS columnname, innerTable.[value] AS columnvalue
    INTO #tmpTbl2
    FROM #tmpTbl outerTable CROSS APPLY OPENJSON([value]) AS innerTable

SELECT * FROM #tmpTbl
SELECT * FROM #tmpTbl2
SET @cols = STUFF(
    (
        SELECT DISTINCT ', '+QUOTENAME(columnname) FROM #tmpTbl2 FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'), 1, 1, '');
SET @colsWithType = STUFF(
    (
        SELECT DISTINCT ', '+QUOTENAME(columnname)+' nvarchar(100)' FROM #tmpTbl2 FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'), 1, 1, '');

SET @query = N'SELECT id, '+@cols+' FROM tbl2_have CROSS APPLY OPENJSON(json_col) 
WITH('+@colsWithType+')';

exec sp_executesql @query

I assume you do not know the names and types of the keys in advance. You need to use dynamic SQL.

You first need to use OPENJSON without the WITH clause on the {objects}, like so:

select string_agg(quotename(k) + case t
    when 0 then ' nchar(1)'       -- javascript null
    when 1 then ' nvarchar(max)'  -- javascript string
    when 2 then ' float'          -- javascript number
    when 3 then ' bit'            -- javascript boolean
    else ' nvarchar(max) as json' -- javascript array or object
end, ', ') within group (order by k)
from (
    select j2.[key], max(j2.[type])
    from test
    cross apply openjson(case when json_col like '{%}' then '[' + json_col + ']' else json_col end) as j1
    cross apply openjson(j1.value) as j2
    group by j2.[key]
) as kt(k, t)

The inner query gives you the names and types of all keys across all JSON values in the table. The outer query builds the WITH clause for the dynamic SQL.
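
The CASE expression in the inner query is what lets one query handle both table shapes: a bare {object} is wrapped in [ ] so that it looks like a one-element array, and the two nested OPENJSON calls then work the same way in both cases. A minimal sketch of just that step, using a table variable @t as a hypothetical stand-in for the real table:

```sql
DECLARE @t TABLE (id INT, json_col NVARCHAR(MAX));
INSERT INTO @t VALUES
(1, N'{"a_i":"a","a_j":1}'),                          -- single object, as in tbl1_have
(2, N'[{"a_i":"c","a_j":3},{"a_i":"d","a_j":4}]');    -- array of objects, as in tbl2_have

-- j1 iterates over array elements, j2 over the keys of each element
SELECT t.id, j2.[key], j2.[type]
FROM @t AS t
CROSS APPLY OPENJSON(CASE WHEN t.json_col LIKE '{%}'
                          THEN '[' + t.json_col + ']'
                          ELSE t.json_col END) AS j1
CROSS APPLY OPENJSON(j1.[value]) AS j2;
```

For this data, every a_i row comes back with type 1 (string) and every a_j row with type 2 (number), which is what the outer query turns into nvarchar(max) and float.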

The rest is relatively straightforward: use the generated clause in your dynamic SQL. Here is the complete example:

declare @table_name nvarchar(100) = 'test';
-- nvarchar(max) so the generated clause is not truncated when there are many keys
declare @with_clause nvarchar(max);

-- Step 1: build the WITH clause from the keys and types found in the data
declare @query1 nvarchar(max) = N'select @with_clause_temp = string_agg(quotename(k) + case t
    when 0 then '' nchar(1)''
    when 1 then '' nvarchar(max)''
    when 2 then '' float''
    when 3 then '' bit''
    else '' nvarchar(max) as json''
end, '', '') within group (order by k)
from (
    select j2.[key], max(j2.[type])
    from ' + quotename(@table_name) + '
    cross apply openjson(case when json_col like ''{%}'' then ''['' + json_col + '']'' else json_col end) as j1
    cross apply openjson(j1.value) as j2
    group by j2.[key]
) as kt(k, t)';
exec sp_executesql @query1, N'@with_clause_temp nvarchar(max) out', @with_clause out;

-- Step 2: run the actual extraction using the generated WITH clause
declare @query2 nvarchar(max) = N'select id, j.*
from ' + quotename(@table_name) + '
cross apply openjson(json_col)
with (' + @with_clause + ') as j';
exec sp_executesql @query2;

Demo on db<>fiddle
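
For the sample data in the question, the statement generated in @query2 should end up roughly equivalent to the hand-written query below. This is only a sketch to show the shape of the result: the table variable @t is a hypothetical stand-in for the real table, and the WITH clause is written out by hand rather than generated.

```sql
DECLARE @t TABLE (id INT, json_col NVARCHAR(MAX));
INSERT INTO @t VALUES
(1, N'[{"a_i":"a","a_j":1},{"a_i":"b","a_j":2}]'),  -- array: one output row per element
(2, N'{"a_i":"c","a_j":3}');                         -- object: a single output row

SELECT t.id, j.*
FROM @t AS t
CROSS APPLY OPENJSON(t.json_col)
WITH ([a_i] NVARCHAR(MAX), [a_j] FLOAT) AS j;
```

Note that JSON numbers map to float in this approach, so a_j is numeric but not INT; an explicit cast would still be needed if integer columns are required.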