Query data from a text file and get a JSON column

I'm using SQL Server 2019 Express Edition.

I have a text file like this:

/type/author    /authors/OL1002354A 2   2008-08-20T18:07:53.62084   {"name": "Don L. Brigham", "personal_name": "Don L. Brigham", "last_modified": {"type": "/type/datetime", "value": "2008-08-20T18:07:53.62084"}, "key": "/authors/OL1002354A", "type": {"key": "/type/author"}, "revision": 2}
/type/author    /authors/OL100246A  1   2008-04-01T03:28:50.625462  {"name": "Talib Samat.", "personal_name": "Talib Samat.", "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"}, "key": "/authors/OL100246A", "type": {"key": "/type/author"}, "revision": 1}
/type/author    /authors/OL1002700A 1   2008-04-01T03:28:50.625462  {"name": "Bengt E. Gustafsson Symposium (5th 1988 Stockholm, Sweden)", "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"}, "key": "/authors/OL1002700A", "type": {"key": "/type/author"}, "revision": 1}
/type/author    /authors/OL1002807A 2   2008-08-20T18:12:02.683498  {"name": "Ary J. Lamme", "personal_name": "Ary J. Lamme", "last_modified": {"type": "/type/datetime", "value": "2008-08-20T18:12:02.683498"}, "key": "/authors/OL1002807A", "birth_date": "1940", "type": {"key": "/type/author"}, "revision": 2}
/type/author    /authors/OL1002994A 5   2012-03-03T06:50:39.836886  {"name": "R. Baxter Miller", "personal_name": "R. Baxter Miller", "created": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"}, "photos": [7075806, 6974916], "last_modified": {"type": "/type/datetime", "value": "2012-03-03T06:50:39.836886"}, "latest_revision": 5, "key": "/authors/OL1002994A", "type": {"key": "/type/author"}, "revision": 5}
/type/author    /authors/OL100301A  1   2008-04-01T03:28:50.625462  {"name": "Ghazali Basri.", "personal_name": "Ghazali Basri.", "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"}, "key": "/authors/OL100301A", "type": {"key": "/type/author"}, "revision": 1}
/type/author    /authors/OL1003201A 2   2008-08-20T18:14:55.775993  {"name": "Robert Smaus", "personal_name": "Robert Smaus", "last_modified": {"type": "/type/datetime", "value": "2008-08-20T18:14:55.775993"}, "key": "/authors/OL1003201A", "type": {"key": "/type/author"}, "revision": 2}
/type/author    /authors/OL1003202A 2   2008-08-20T18:14:56.005766  {"name": "Richard Mark Friedhoff", "personal_name": "Richard Mark Friedhoff", "last_modified": {"type": "/type/datetime", "value": "2008-08-20T18:14:56.005766"}, "key": "/authors/OL1003202A", "type": {"key": "/type/author"}, "revision": 2}
/type/author    /authors/OL1003235A 1   2008-04-01T03:28:50.625462  {"name": "Hunbatz Men", "personal_name": "Hunbatz Men", "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"}, "key": "/authors/OL1003235A", "birth_date": "1941", "type": {"key": "/type/author"}, "revision": 1}
/type/author    /authors/OL1003719A 1   2008-04-01T03:28:50.625462  {"name": "NATO Advanced Research Workshop on Ras Oncogenes (1988 Athens, Greece)", "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"}, "key": "/authors/OL1003719A", "type": {"key": "/type/author"}, "revision": 1}
/type/author    /authors/OL1003744A 2   2008-08-20T18:20:16.351762  {"name": "Jeanne Thieme", "personal_name": "Jeanne Thieme", "last_modified": {"type": "/type/datetime", "value": "2008-08-20T18:20:16.351762"}, "key": "/authors/OL1003744A", "type": {"key": "/type/author"}, "revision": 2}
/type/author    /authors/OL1003901A 2   2008-08-20T18:21:31.331678  {"name": "Kiiti Morita", "personal_name": "Kiiti Morita", "last_modified": {"type": "/type/datetime", "value": "2008-08-20T18:21:31.331678"}, "key": "/authors/OL1003901A", "birth_date": "1915", "type": {"key": "/type/author"}, "revision": 2}
/type/author    /authors/OL1004047A 1   2008-04-01T03:28:50.625462  {"name": "Murphy, William M.", "personal_name": "Murphy, William M.", "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"}, "key": "/authors/OL1004047A", "birth_date": "1942", "type": {"key": "/type/author"}, "revision": 1}

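Columns are separated by tabs and rows by newlines.

I need to get data from the last (fifth) column, which holds a JSON structure. For example, I need the values of all "name" properties.

I have already imported the data into a table using SSIS, after which I can use CROSS APPLY OPENJSON(json_column) to get the keys and values. But I'm wondering whether this can be done with SQL/T-SQL alone, using OPENROWSET directly and parsing only the JSON-formatted column. I tried combining OPENROWSET with CROSS APPLY OPENJSON(BulkColumn), but that fails because the remaining columns are not JSON.

Any ideas on how to avoid this error, or on another approach?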

You can use BULK INSERT to load the file into a temp table, parsing it as a tab-delimited file, and then use OPENJSON to extract the JSON data. The following worked for me:

DROP TABLE IF EXISTS #Temp;
CREATE TABLE #Temp (
    /* Just some random column names*/
    Author NVARCHAR(100),
    AuthorPath NVARCHAR(100),
    IntValue INT,
    Created DATETIME2(3),
    JsonData NVARCHAR(MAX)
);

BULK INSERT #Temp
FROM 'C:\Users\andre\Documents\temp\test.txt'
WITH (
    FIELDTERMINATOR = '\t', -- Tab-delimited columns
    ROWTERMINATOR = '\n'    -- Newline terminates each row
);

-- Return every imported column plus the "name" property parsed from the JSON column
SELECT
    Temp.*,
    JsonData.[name]
FROM #Temp AS Temp
CROSS APPLY OPENJSON(Temp.JsonData, '$')
WITH (
    [name] NVARCHAR(200) '$.name'
) AS JsonData;
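
If only a few scalar properties are needed, JSON_VALUE may be a simpler alternative to OPENJSON ... WITH. The sketch below is not part of the tested solution above. It assumes one JSON value copied from the sample file into a hypothetical variable @JsonData so that it can be run on its own. Against the temp table you would pass Temp.JsonData instead.

```sql
-- Hypothetical stand-alone sample: one JSON value copied from the question's file.
DECLARE @JsonData NVARCHAR(MAX) = N'{"name": "Ary J. Lamme", "personal_name": "Ary J. Lamme", "last_modified": {"type": "/type/datetime", "value": "2008-08-20T18:12:02.683498"}, "key": "/authors/OL1002807A", "birth_date": "1940", "type": {"key": "/type/author"}, "revision": 2}';

SELECT
    JSON_VALUE(@JsonData, '$.name')                AS [name],          -- top-level property
    JSON_VALUE(@JsonData, '$.birth_date')          AS birth_date,      -- NULL when the property is absent
    JSON_VALUE(@JsonData, '$.last_modified.value') AS last_modified;   -- nested property via a dotted path
```

In the default lax mode, JSON_VALUE returns NULL for a missing property, so rows without birth_date are still returned.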