SQL Server Bulk Copy dynamic script

I have many CSV files to import into a SQL Server database. Each file contains more than 50 million rows, and I need to import those rows into a table named ControlExperimentTable.

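Over the next few weeks, a total of 500 of these files will be sent to me for import into the database, so I wrote a script to automate the process, which would otherwise be long and tedious.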

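In short, the process works as follows: the CSV files are copied into a repository folder for processing. All files share the same prefix and have a unique suffix, for example ExportData0001.txt, ExportData0002.txt, ExportData0003.txt, ExportData0004.txt.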

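The script processes each file in order, importing its contents into the SQL Server database. Once all rows have been imported, the processed file is moved to an archive folder, a full database backup is taken, and the process continues with the next file.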

Here is the code I am using to accomplish this:

--Int variable declaration section.
DECLARE @totalFilesToProcess INT = 500  /*The number of files in the repository.*/
DECLARE @count INT = 0

--VarChar variable declaration section.
DECLARE @sqlBulkInsertCommand VARCHAR(255)
DECLARE @sqlMoveCommand VARCHAR(255)
DECLARE @sourceFilename VARCHAR(255)
DECLARE @errorFilename VARCHAR(255)
DECLARE @suffix VARCHAR(4)

--Ensure that the xp_cmdshell server configuration option is enabled.
EXEC master.dbo.sp_configure 'show advanced options', 1
RECONFIGURE

EXEC master.dbo.sp_configure 'xp_cmdshell', 1
RECONFIGURE

--Change to the target database instance.
USE [DataCollectionTable]

--Loop once for each file to import (@count is incremented at the top of the loop body).
WHILE (@count < @totalFilesToProcess)
BEGIN
    --Set variables.
    SET @count = @count + 1
    SET @suffix = RIGHT('0000' + CAST(@count AS VARCHAR(4)), 4)
    SET @errorFilename = FORMATMESSAGE('E:\SharedDocs\ControlExportData%s.csv', @suffix)
    SET @sourceFilename = FORMATMESSAGE('E:\SharedDocs\ControlExportData%s.txt', @suffix)
    
    --COMMAND CONSTRUCT: Insert data from flat file into the ControlExperimentTable table.
    SET @sqlBulkInsertCommand = FORMATMESSAGE('BULK INSERT ControlExperimentTable FROM ''%s'' WITH (FIRSTROW = 2,   FIELDTERMINATOR = '','', ROWTERMINATOR = ''\n'', ERRORFILE = ''%s'', TABLOCK)', @sourceFilename, @errorFilename)
    
    --COMMAND CONSTRUCT: Move the source file to the archive folder.
    SET @sqlMoveCommand = FORMATMESSAGE('MOVE E:\SharedDocs\ControlExportData%s.txt E:\SharedDocs\Archive\ControlExportData%s.txt', @suffix, @suffix)
    
    --Display record count before every update.
    SELECT sys.sysindexes.rows FROM sys.sysindexes INNER JOIN sys.sysobjects 
    ON sys.sysobjects.id=sys.sysindexes.id 
    WHERE sys.sysindexes.first IS NOT NULL AND sys.sysobjects.name = 'ControlExperimentTable'
    
    --Execute the BULK INSERT command.
    EXEC @sqlBulkInsertCommand

    --Execute the source file MOVE command.
    EXEC master.dbo.xp_cmdshell @sqlMoveCommand 
    
    --Backup database after each commitment, maintaining the current and previous copies.
    IF (@count % 2)  = 0
    BEGIN
        BACKUP DATABASE [ExperimentDataCollection] TO DISK = N'E:\Backups\ExperimentDataCollectionBackup-01.bak' WITH NOFORMAT, INIT, NAME = N'ExperimentDataCollection-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10
    END ELSE
    BEGIN 
        BACKUP DATABASE [ExperimentDataCollection] TO DISK = N'E:\Backups\ExperimentDataCollectionBackup-02.bak' WITH NOFORMAT, INIT, NAME = N'ExperimentDataCollection-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10
    END
END

After running this code, I get the following error:

Configuration option 'show advanced options' changed from 1 to 1. Run the RECONFIGURE statement to install.
Configuration option 'xp_cmdshell' changed from 1 to 1. Run the RECONFIGURE statement to install.

Msg 911, Level 16, State 4, Line 48
Database 'BULK INSERT ControlExperimentTable FROM 'E:\SharedDocs\ExportData0001' does not exist. Make sure that the name is entered correctly.

Completion time: 2021-02-18T18:06:15.0003210+02:00

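Note that everything is in the location specified in the code. So much so that manually running the code below, which is exactly the statement I am trying to generate, works without any trouble.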

USE [ExperimentDataCollection]

BULK INSERT [ExperimentDataCollection].[dbo].[ControlExperimentTable] 
FROM 'E:\SharedDocs\ControlExportData0001.txt' 
WITH (FIRSTROW = 2, 
      FIELDTERMINATOR = ',', 
      ROWTERMINATOR = '\n', 
      ERRORFILE = 'E:\SharedDocs\ControlExportData0001.csv', 
      TABLOCK)

Where am I going wrong?

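When executing dynamic SQL, you need to enclose the variable in parentheses. Without them, EXEC treats the variable's value as the name of a stored procedure to call rather than as a statement to run. So change this: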

EXEC @sqlBulkInsertCommand

to this:

EXEC(@sqlBulkInsertCommand)
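
To see the difference in isolation, here is a minimal sketch in which a harmless SELECT stands in for the BULK INSERT command. The variable name @sqlCommand and the SELECT statement are placeholders for illustration only and do not appear in the question.

```sql
-- A harmless statement stands in for the BULK INSERT command (illustrative only).
DECLARE @sqlCommand NVARCHAR(4000) = N'SELECT DB_NAME() AS CurrentDatabase'

-- Without parentheses, EXEC reads the variable's value as the name of a
-- stored procedure to call, so this line would fail if uncommented:
-- EXEC @sqlCommand

-- With parentheses, the string is executed as a T-SQL batch.
EXEC (@sqlCommand)

-- sp_executesql is an alternative; it expects the statement as NVARCHAR.
EXEC sys.sp_executesql @sqlCommand
```

This is probably also why the message complains about a database: the string is parsed as a multi-part procedure name, so the text before the first dot is taken to be a database name. The xp_cmdshell line in the question needs no change, because there the variable is passed as a parameter to a procedure rather than executed as a statement.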