SQL Server bulk insert doesn't recognize double quotes as fieldquote

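I'm trying to bulk insert a file in SQL Server 2017 14.0.1000.169. I want to take the file exactly as it arrives, save it to the required location, and then run the bulk insert query without modifying the file at all. I'm struggling to get the script to recognize and ignore the double quotes in the text file unless I manually change the line endings from Unix to Windows. I've read many threads, here and outside SO, that discuss topics close to this one, but alas, none of them answered my question: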

How can I bulk insert a file with Unix line endings and have the double quotes stripped from the fields?

My file looks like this:

"Report Name","Daily Extract (ID: 111111)"
"Date/Time Generated(UTC)","01-Mar-2020 15:08:51"
"Workspace Name","Company (ID: 22222)"
"Account Name","Client Account"
"Date Range","01-Jan-2019 - 29-Feb-2020"

"Dimension 1","Dimension 2","Dimension 3","Dimension 4","Dimension 5","Dimension 6","Dimension 7","Dimension 8","Dimension 9","Dimension 10","Dimension 11","Dimension 12","Dimension 13","Dimension 14","Dimension 15","Dimension 16","Dimension 17","Metric 1","Metric 2","Metric 3","Metric 4","Metric 5","Metric 6","Metric 7","Metric 8","Metric 9","Metric 10","Metric 11","Metric 12"
"string","string","date as string","string","string","string","string","string","string","string","string","string","string","string","string","string","string","bigint","bigint","decimal","decimal","decimal","bigint","decimal","decimal","bigint","decimal","bigint","bigint"

The query I'm using is as follows:

DROP TABLE IF EXISTS [dbo].[Table]
GO

CREATE TABLE [dbo].[Table](
    [Dimension 1] [varchar] (255) NULL,
    [Dimension 2] [varchar] (255) NULL,
    [Dimension 3] [varchar] (255) NULL,
    [Dimension 4] [varchar] (255) NULL,
    [Dimension 5] [varchar] (255) NULL,
    [Dimension 6] [varchar] (255) NULL,
    [Dimension 7] [varchar] (255) NULL,
    [Dimension 8] [varchar] (255) NULL,
    [Dimension 9] [varchar] (1000) NULL,
    [Dimension 10] [varchar] (255) NULL,
    [Dimension 11] [varchar] (255) NULL,
    [Dimension 12] [varchar] (255) NULL,
    [Dimension 13] [varchar] (1000) NULL,
    [Dimension 14] [varchar] (1000) NULL,
    [Dimension 15] [varchar] (1000) NULL,
    [Dimension 16] [varchar] (1000) NULL,
    [Dimension 17] [varchar] (1000) NULL,
    [Metric 1] [varchar] (50) NULL,
    [Metric 2] [varchar] (50) NULL,
    [Metric 3] [varchar] (50) NULL,
    [Metric 4] [varchar] (50) NULL,
    [Metric 5] [varchar] (50) NULL,
    [Metric 6] [varchar] (50) NULL,
    [Metric 7] [varchar] (50) NULL,
    [Metric 8] [varchar] (50) NULL,
    [Metric 9] [varchar] (50) NULL,
    [Metric 10] [varchar] (255) NULL,
    [Metric 11] [varchar] (50) NULL,
    [Metric 12] [varchar] (50) NULL
) ON [PRIMARY]
GO

BULK INSERT [dbo].[Table]
FROM 'C:\Users\username\Folder\File.csv'
WITH
(
--FORMAT = 'CSV',
DATAFILETYPE = 'char',
FIELDTERMINATOR = ',',
--ROWTERMINATOR = '\n',
ROWTERMINATOR = '0x0a',
FIRSTROW = 7,
--FIELDQUOTE = '"'
FIELDQUOTE = '0x22'
)
;

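As you can see above, I'm importing everything as varchar. Originally I did that for only one metric (because of data-quality issues on the supplier's side), since I fully intended to correct every defect once the file was loaded. After running into difficulties, though, I set all the metrics to varchar so that the file would at least load and I could see what it looks like and dig further.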

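So far I have tried the following:

Everything else I've tried so far produced various errors that all came down to the same two things: either I can't use FORMAT = 'CSV' (if I leave the Unix line endings in), or, when I try to load the metrics as float, the load fails because of the double quotes.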

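I have a workaround for now (I can strip the double quotes and convert the fields after loading), but I'd like to know whether I can integrate that step into the bulk insert (as I do when I load a file with Windows line endings).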
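For reference, the post-load workaround described above could look roughly like the sketch below. It assumes the varchar table from the question has already been loaded with the double quotes still embedded in the values. Only three columns are shown, and the decimal(18, 4) precision is a guess for illustration, not something taken from the file.

```sql
-- Strip the embedded double quotes and convert the metrics after loading.
-- Assumes [dbo].[Table] was filled by the BULK INSERT above.
SELECT
    REPLACE([Dimension 1], '"', '') AS [Dimension 1],
    -- TRY_CONVERT returns NULL instead of raising an error when a value
    -- cannot be converted, so problem rows can be inspected afterwards.
    TRY_CONVERT(bigint,         REPLACE([Metric 1], '"', '')) AS [Metric 1],
    -- decimal(18, 4) is an assumed precision and scale.
    TRY_CONVERT(decimal(18, 4), REPLACE([Metric 3], '"', '')) AS [Metric 3]
FROM [dbo].[Table];
```

The same pattern extends to the remaining columns. Filtering for rows where the converted value is NULL but the source value is not is one way to find the data-quality problems mentioned earlier.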

N.B. I know FIELDQUOTE hasn't been around for very long, but according to Microsoft it should work with my build:

"FIELDQUOTE = 'field_quote' Applies to: SQL Server 2017 (14.x) CTP 1.1. Specifies a character that will be used as the quote character in the CSV file. If not specified, the quote character (") will be used as the quote character as defined in the RFC 4180 standard."

Did I forget to mention anything? If not, any ideas about what I may have overlooked?

Thanks in advance!

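OK. The biggest problem here is your file. First, because of the lines at the top, the file isn't RFC 4180 compliant. That is what causes the headaches.

Next comes an important caveat about FIRSTROW: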

When skipping rows, the SQL Server Database Engine looks only at the field terminators, and does not validate the data in the fields of skipped rows.

Note that it says field terminators, not row terminators. That is the second problem. The start of your data looks like this:

"Report Name","Daily Extract (ID: 111111)"
"Date/Time Generated(UTC)","01-Mar-2020 15:08:51"
"Workspace Name","Company (ID: 22222)"
"Account Name","Client Account"
"Date Range","01-Jan-2019 - 29-Feb-2020"
<-- Blank Line -->

That is 5 field terminators (one comma on each of the five populated lines) and 6 row terminators.
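To check the terminator counts yourself, the sketch below inlines the six preamble lines from the question as literals and counts the commas on each one. The aliases v, line_no and line are made up for illustration, and the query only counts characters; it does not reproduce how the Database Engine actually parses the file. For comparison, a complete row for the 29-column table carries 28 field terminators.

```sql
-- One row per line of the preamble; the blank sixth line is an empty string.
SELECT v.line_no,
       v.line,
       -- Commas on the line = length with commas minus length without them.
       DATALENGTH(v.line) - DATALENGTH(REPLACE(v.line, ',', '')) AS field_terminators
FROM (VALUES
        (1, '"Report Name","Daily Extract (ID: 111111)"'),
        (2, '"Date/Time Generated(UTC)","01-Mar-2020 15:08:51"'),
        (3, '"Workspace Name","Company (ID: 22222)"'),
        (4, '"Account Name","Client Account"'),
        (5, '"Date Range","01-Jan-2019 - 29-Feb-2020"'),
        (6, '')
     ) AS v(line_no, line)
ORDER BY v.line_no;
```

None of these lines comes close to the 28 commas of a full row, which is why counting skipped rows by field terminators does not line up with counting them by line.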

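Next, the CSV file has more columns than your table Table: Table has no column Dimension 17.

After adding the missing column, I managed to get this working with what I believe is the result you want: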
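One way to compare the two column counts is sketched below. It assumes the table is [dbo].[Table] as in the question and that no column name contains a comma. The @header variable is hypothetical and holds a shortened header here; paste the full header line from the file into it.

```sql
-- Number of columns recorded for the target table.
SELECT COUNT(*) AS table_columns
FROM sys.columns
WHERE object_id = OBJECT_ID(N'[dbo].[Table]');

-- Number of columns in the header row: commas + 1.
-- Shortened sample header; replace it with the real header line.
DECLARE @header varchar(max) = '"Dimension 1","Dimension 2","Dimension 3"';
SELECT DATALENGTH(@header) - DATALENGTH(REPLACE(@header, ',', '')) + 1 AS header_columns;
```

If the two numbers differ, the load cannot map the fields to the columns one to one without a format file or a change to the table.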

BULK INSERT [Table]
FROM '/tmp/YourFile2.txt'
WITH (FIELDTERMINATOR = ',',
      ROWTERMINATOR = '\n',
      FIRSTROW = 2,
      FORMAT = 'CSV',
      FIELDQUOTE = '"');

This inserted 1 row into the table.