Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 4 in Azure Synapse
I have a Spotify CSV file in my Azure Data Lake. I am trying to create an external table over it in an Azure Synapse serverless SQL pool.
I get the following error message:
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 4 (Track_popularity) in data file https://test.dfs.core.windows.net/data/folder/updated.csv.
I am using the script below:
IF NOT EXISTS (SELECT * FROM sys.external_file_formats WHERE name = 'SynapseDelimitedTextFormat')
    CREATE EXTERNAL FILE FORMAT [SynapseDelimitedTextFormat]
    WITH (
        FORMAT_TYPE = DELIMITEDTEXT,
        FORMAT_OPTIONS (
            FIELD_TERMINATOR = ',',
            USE_TYPE_DEFAULT = FALSE
        )
    )
GO
IF NOT EXISTS (SELECT * FROM sys.external_data_sources WHERE name = 'test.dfs.core.windows.net')
    CREATE EXTERNAL DATA SOURCE [test.dfs.core.windows.net]
    WITH (
        LOCATION = 'abfss://data@test.dfs.core.windows.net'
    )
GO
CREATE EXTERNAL TABLE updated (
    [Artist] nvarchar(4000),
    [Track] nvarchar(4000),
    [Track_id] nvarchar(4000),
    [Track_popularity] bigint,
    [Artist_id] nvarchar(4000),
    [Artist_Popularity] bigint,
    [Genres] nvarchar(4000),
    [Followers] bigint,
    [danceability] float,
    [energy] float,
    [key] bigint,
    [loudness] float,
    [mode] bigint,
    [speechiness] float,
    [acousticness] float,
    [instrumentalness] float,
    [liveness] float,
    [valence] float,
    [tempo] float,
    [duration_ms] bigint,
    [time_signature] bigint
)
WITH (
    LOCATION = 'data/updated.csv',
    DATA_SOURCE = [data_test_dfs_core_windows_net],
    FILE_FORMAT = [SynapseDelimitedTextFormat]
)
GO
SELECT TOP 100 * FROM dbo.updated
GO
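Below is a sample of the data.
My CSV is UTF-8 encoded. I'm not sure what the problem is. The error points to the column (Track_popularity). Please advise.
My guess is that the file has a header row that needs to be skipped: row 1, column 4 would then contain the text Track_popularity, which cannot be converted to bigint. Drop the external table, then drop and recreate the external file format as follows: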
CREATE EXTERNAL FILE FORMAT [SynapseDelimitedTextFormat]
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = ',',
        USE_TYPE_DEFAULT = FALSE,
        FIRST_ROW = 2
    )
)
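The answer mentions dropping the table and the file format but only shows the CREATE statement. A minimal sketch of the surrounding steps might look like the following. It first reads column 4 as plain text to check whether row 1 really is a header, then drops the two objects in dependency order. The object names and the file path are copied from the question's script; the data source name 'test.dfs.core.windows.net' is an assumption (the question's script creates it under that name but references a different one in the table definition), so adjust both to match your workspace.

```sql
-- Optional check: read column 4 as text so no type conversion can fail.
-- Assumption: path and data source name as in the question; adjust as needed.
SELECT TOP 5 r.[Track_popularity]
FROM OPENROWSET(
    BULK 'data/updated.csv',
    DATA_SOURCE = 'test.dfs.core.windows.net',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0'
)
WITH (
    [Track_popularity] varchar(100) 4   -- 4 = ordinal position of the column in the file
) AS r;
GO

-- Drop the external table first, because it depends on the file format.
IF EXISTS (SELECT * FROM sys.external_tables WHERE name = 'updated')
    DROP EXTERNAL TABLE dbo.updated;
GO

-- Then drop the old file format so it can be recreated with FIRST_ROW = 2.
IF EXISTS (SELECT * FROM sys.external_file_formats WHERE name = 'SynapseDelimitedTextFormat')
    DROP EXTERNAL FILE FORMAT [SynapseDelimitedTextFormat];
GO
```

If the first value returned by the check is the literal text Track_popularity, the file has a header row. After running the drops, recreate the file format with FIRST_ROW = 2 and rerun the original CREATE EXTERNAL TABLE statement. Dropping an external table removes only its metadata, not the underlying CSV file.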