SQL Server Bulk Copy dynamic script
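I have a lot of CSV files to import into a SQL Server database. Each of these files contains more than 50 million rows, which I need to import into a table named ControlExperimentTable.
Over the next few weeks, a total of 500 such files will be sent to me for import, so I wrote a script to automate the process, which is obviously long and tedious.
In short, the process works as follows: the CSV files are copied to a repository folder for processing. All files share the same prefix and have a unique suffix, for example ExportData0001.txt, ExportData0002.txt, ExportData0003.txt, ExportData0004.txt, and so on.
The script processes each file in sequence, importing its contents into the SQL Server database. Once all rows have been imported, the processed file is moved to an archive folder, a full database backup is taken, and the process continues with the next file.
Below is the code I am using to accomplish the task: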
--Int variable declaration section.
DECLARE @totalFilesToProcess INT = 500 /*The number of files in the repository.*/
DECLARE @count INT = 0
--VarChar variable declaration section.
DECLARE @sqlBulkInsertCommand VARCHAR(255)
DECLARE @sqlMoveCommand VARCHAR(255)
DECLARE @sourceFilename VARCHAR(255)
DECLARE @errorFilename VARCHAR(255)
DECLARE @suffix VARCHAR(4)
--Ensure that the xp_cmdshell server configuration option is enabled.
EXEC master.dbo.sp_configure 'show advanced options', 1
RECONFIGURE
EXEC master.dbo.sp_configure 'xp_cmdshell', 1
RECONFIGURE
--Change to the target database instance.
USE [DataCollectionTable]
--Loop once for each file to import.
WHILE (@count <= @totalFilesToProcess)
BEGIN
--Set variables.
SET @count = @count + 1
SET @suffix = RIGHT('0000' + CAST(@count AS VARCHAR(4)), 4)
SET @errorFilename = FORMATMESSAGE('E:\SharedDocs\ControlExportData%s.csv', @suffix)
SET @sourceFilename = FORMATMESSAGE('E:\SharedDocs\ControlExportData%s.txt', @suffix)
--COMMAND CONSTRUCT: Insert data from flat file into the ControlExperimentTable table.
SET @sqlBulkInsertCommand = FORMATMESSAGE('BULK INSERT ControlExperimentTable FROM ''%s'' WITH (FIRSTROW = 2, FIELDTERMINATOR = '','', ROWTERMINATOR = ''\n'', ERRORFILE = ''%s'', TABLOCK)', @sourceFilename, @errorFilename)
--COMMAND CONSTRUCT: Move the source file to the archive folder.
SET @sqlMoveCommand = FORMATMESSAGE('MOVE E:\SharedDocs\ControlExportData%s.txt E:\SharedDocs\Archive\ControlExportData%s.txt', @suffix, @suffix)
--Display record count before every update.
SELECT sys.sysindexes.rows FROM sys.sysindexes INNER JOIN sys.sysobjects
ON sys.sysobjects.id=sys.sysindexes.id
WHERE sys.sysindexes.first IS NOT NULL AND sys.sysobjects.name = 'ControlExperimentTable'
--Execute the BULK INSERT command.
EXEC @sqlBulkInsertCommand
--Execute the source file MOVE command.
EXEC master.dbo.xp_cmdshell @sqlMoveCommand
--Backup database after each commitment, maintaining the current and previous copies.
IF (@count % 2) = 0
BEGIN
BACKUP DATABASE [ExperimentDataCollection] TO DISK = N'E:\Backups\ExperimentDataCollectionBackup-01.bak' WITH NOFORMAT, INIT, NAME = N'ExperimentDataCollection-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10
END ELSE
BEGIN
BACKUP DATABASE [ExperimentDataCollection] TO DISK = N'E:\Backups\ExperimentDataCollectionBackup-02.bak' WITH NOFORMAT, INIT, NAME = N'ExperimentDataCollection-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10
END
END
After running this code, I receive the following error:
Configuration option 'show advanced options' changed from 1 to 1. Run the RECONFIGURE statement to install.
Configuration option 'xp_cmdshell' changed from 1 to 1. Run the RECONFIGURE statement to install.
Msg 911, Level 16, State 4, Line 48
Database 'BULK INSERT ControlExperimentTable FROM 'E:\SharedDocs\ExportData0001' does not exist. Make sure that the name is entered correctly.
Completion time: 2021-02-18T18:06:15.0003210+02:00
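Note that everything is in the location specified in the code. So much so that manually running the code below, which is exactly the statement I am trying to build dynamically, works without any trouble.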
USE [ExperimentDataCollection]
BULK INSERT [ExperimentDataCollection].[dbo].[ControlExperimentTable]
FROM 'E:\SharedDocs\ControlExportData0001.txt'
WITH (FIRSTROW = 2,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
ERRORFILE = 'E:\SharedDocs\ControlExportData0001.csv',
TABLOCK)
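Where am I going wrong?
When executing dynamic SQL, you need to enclose the variable in parentheses. Without them, EXEC treats the contents of the variable as the name of a stored procedure, and the text before the first period is parsed as a database name, which is why error 911 reports that the database does not exist. So change this: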
EXEC @sqlBulkInsertCommand
to this:
EXEC(@sqlBulkInsertCommand)
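To see the difference between the two forms in isolation, here is a minimal sketch that uses a harmless SELECT in place of the BULK INSERT. The variable name @sql and the statement text are placeholders chosen for illustration and are not part of the original script.

```sql
-- Hypothetical stand-in for the BULK INSERT command string.
DECLARE @sql VARCHAR(255) = 'SELECT 1 AS Result';

-- With parentheses: the string is executed as a T-SQL batch.
EXEC (@sql);

-- Without parentheses: the string is treated as the name of a stored
-- procedure, so the line below would fail because no such procedure exists.
-- EXEC @sql;
```

Uncommenting the last line should reproduce the same kind of failure as in the question, since SQL Server then looks for a module named after the string instead of running it.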