Polybase - Error converting data type VARCHAR to DATETIME
I am trying to create an external table for a CSV file stored in Azure Storage.
The CSV data looks like this:
Date | Rail Period | Calendar Year | Calendar Month | Calendar Month Name | Fiscal Year | Fiscal Period | Weekday | Weekday Number |
---|---|---|---|---|---|---|---|---|
26/04/2021 | 2201 | 2021 | 4 | April | 2022 | Period 1 | Monday | 1 |
27/04/2021 | 2201 | 2021 | 4 | April | 2022 | Period 1 | Tuesday | 2 |
28/04/2021 | 2201 | 2021 | 4 | April | 2022 | Period 1 | Wednesday | 3 |
29/04/2021 | 2201 | 2021 | 4 | April | 2022 | Period 1 | Thursday | 4 |
30/04/2021 | 2201 | 2021 | 4 | April | 2022 | Period 1 | Friday | 5 |
01/05/2021 | 2201 | 2021 | 5 | May | 2022 | Period 2 | Saturday | 6 |
02/05/2021 | 2202 | 2021 | 5 | May | 2022 | Period 2 | Sunday | 7 |
03/05/2021 | 2202 | 2021 | 5 | May | 2022 | Period 2 | Monday | 1 |
04/05/2021 | 2202 | 2021 | 5 | May | 2022 | Period 2 | Tuesday | 2 |
I created the external file format with the following code:
CREATE EXTERNAL FILE FORMAT csvFile
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = ',',
        STRING_DELIMITER = '"',
        FIRST_ROW = 2,
        USE_TYPE_DEFAULT = TRUE,
        ENCODING = 'UTF8'
    )
);
and the external table as follows:
CREATE EXTERNAL TABLE ext.DateDimension (
    [Date] DATE,
    [Rail Period] INT,
    [Calendar Year] INT,
    [Calendar Month] INT,
    [Calendar Month Name] VARCHAR(9),
    [Fiscal Year] INT,
    [Fiscal Period] VARCHAR(9),
    [Weekday] VARCHAR(9),
    [Weekday Number] INT
)
WITH (
    DATA_SOURCE = [tfwpbstore_ADLSG2],
    LOCATION = '/Generic Datasets/Date Dimension.csv',
    FILE_FORMAT = csvFile
);
However, when I try to SELECT from the external table, I get the following error:
Msg 107090, Level 16, State 1, Line 1
HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopSqlException: Error converting data type VARCHAR to DATETIME.
I'm not sure what is going wrong. I would appreciate any help.
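One way to confirm that the day-first date strings are what triggers the conversion error is to read the column as plain text and convert it explicitly. This is only a diagnostic sketch: the table name ext.DateDimension_raw is hypothetical, and it reuses the data source, location, and csvFile format shown above.

```sql
-- Hypothetical diagnostic table: same file and file format as above,
-- but [Date] is read as text so PolyBase does not try to parse it.
CREATE EXTERNAL TABLE ext.DateDimension_raw (
    [Date] VARCHAR(10),
    [Rail Period] INT,
    [Calendar Year] INT,
    [Calendar Month] INT,
    [Calendar Month Name] VARCHAR(9),
    [Fiscal Year] INT,
    [Fiscal Period] VARCHAR(9),
    [Weekday] VARCHAR(9),
    [Weekday Number] INT
)
WITH (
    DATA_SOURCE = [tfwpbstore_ADLSG2],
    LOCATION = '/Generic Datasets/Date Dimension.csv',
    FILE_FORMAT = csvFile
);

-- CONVERT style 103 interprets the text as dd/mm/yyyy.
SELECT TOP (10)
    [Date] AS RawDate,
    CONVERT(DATE, [Date], 103) AS ParsedDate
FROM ext.DateDimension_raw;
```

If this query returns rows while the typed table fails, the date layout in the file is the likely cause rather than the delimiters or the header row.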
As mentioned in the comments above, I needed to define the date format used in the file format statement, as follows:
CREATE EXTERNAL FILE FORMAT csvFile_ddMMyyyy_fr2
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = ',',
        STRING_DELIMITER = '"',
        DATE_FORMAT = 'dd/MM/yyyy',
        FIRST_ROW = 2,
        USE_TYPE_DEFAULT = TRUE,
        ENCODING = 'UTF8'
    )
);
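A new file format only takes effect once an external table references it. Assuming ext.DateDimension was already created against csvFile as in the question, a minimal sketch is to drop that table and recreate it with the same columns, pointing at csvFile_ddMMyyyy_fr2:

```sql
-- Assumption: ext.DateDimension already exists and still uses csvFile.
-- Dropping an external table removes only the metadata, not the CSV file.
DROP EXTERNAL TABLE ext.DateDimension;

CREATE EXTERNAL TABLE ext.DateDimension (
    [Date] DATE,
    [Rail Period] INT,
    [Calendar Year] INT,
    [Calendar Month] INT,
    [Calendar Month Name] VARCHAR(9),
    [Fiscal Year] INT,
    [Fiscal Period] VARCHAR(9),
    [Weekday] VARCHAR(9),
    [Weekday Number] INT
)
WITH (
    DATA_SOURCE = [tfwpbstore_ADLSG2],
    LOCATION = '/Generic Datasets/Date Dimension.csv',
    FILE_FORMAT = csvFile_ddMMyyyy_fr2  -- the format that declares dd/MM/yyyy
);

-- Quick check that the dates now parse.
SELECT TOP (10) * FROM ext.DateDimension;
```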