Error handling external file: 'Inserting value to batch for column type DATE failed. Invalid argument provided.'

Error handling external file: 'Inserting value to batch for column type DATE failed. Invalid argument provided.'

I am reading parquet files and have found the two columns that cause this error:

SELECT
TOP 100 
POST_SEARCH_DATE,
SEARCH_DATE 

FROM
OPENROWSET(
    BULK 'https://test/refined-parquet/data/v1.0/loaddt=2022-01-01/**',
    FORMAT = 'PARQUET'
    
) AS [result]

These columns contain the value 'undefined'. I tried the following to get rid of the error, but it persists.

SELECT
TOP 100 
POST_SEARCH_DATE,
SEARCH_DATE

FROM
OPENROWSET(
    BULK 'https://test/refined-parquet/data/v1.0/loaddt=2022-01-01/**',
    FORMAT = 'PARQUET'
    
) AS [result]
WHERE cast(POST_SEARCH_DEPARTURE_DATE as varchar(100))!= 'undefined'
and cast (SEARCH_DEPARTURE_DATE as varchar(100)) != 'undefined'

The query above throws the same error:

Error handling external file: 'Inserting value to batch for column type DATE failed. Invalid argument provided.'.

I also tried the following for one of the columns:

SELECT
TOP 100 
POST_SEARCH_DATE,
SEARCH_DATE

FROM
OPENROWSET(
    BULK 'https://test/refined-parquet/data/v1.0/loaddt=2022-01-01/**',
    FORMAT = 'PARQUET'
    
) WITH (POST_SEARCH_DATE VARCHAR(60)) AS [result]

But this throws the following:

Error handling external file: 'Converting value to batch for column POST_SEARCH_DEPARTURE_DATE failed. Invalid argument provided.'

I have been instructed not to remove any rows from this raw file, and I am not sure how to resolve this. Any suggestions are welcome.

The solution is to add a WITH clause to OPENROWSET and declare the data type you want each column returned as (see below):

SELECT
  TOP 100
  -- The select list may only reference columns declared in the WITH clause
  SEARCH_DEPARTURE_DATE,
  POST_SEARCH_DEPARTURE_DATE

FROM
   OPENROWSET(
      BULK 'https://test/refined-parquet/data/v1.0/loaddt=2021-01-01/**',
      FORMAT = 'PARQUET'
   )
   -- Read both columns as text instead of DATE
   WITH (
      [SEARCH_DEPARTURE_DATE] VARCHAR(100) COLLATE Latin1_General_BIN2 3,
      [POST_SEARCH_DEPARTURE_DATE] VARCHAR(100) COLLATE Latin1_General_BIN2 2
   ) AS [result]
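
If the two columns are still needed as dates downstream, one possible follow-up is to convert the text back with TRY_CAST, which returns NULL for values it cannot convert instead of failing the query. This is only a sketch. It assumes TRY_CAST is available in the SQL pool you are querying and that the valid values arrive in a format that converts to DATE (for example yyyy-MM-dd), neither of which is confirmed by the question.

```sql
-- Sketch under assumptions: TRY_CAST is supported by the pool in use, and
-- the non-'undefined' values are strings that convert cleanly to DATE.
SELECT
    TOP 100
    -- 'undefined' (or any other non-date string) becomes NULL here
    TRY_CAST([SEARCH_DEPARTURE_DATE] AS DATE) AS SEARCH_DEPARTURE_DATE,
    TRY_CAST([POST_SEARCH_DEPARTURE_DATE] AS DATE) AS POST_SEARCH_DEPARTURE_DATE
FROM
    OPENROWSET(
        BULK 'https://test/refined-parquet/data/v1.0/loaddt=2021-01-01/**',
        FORMAT = 'PARQUET'
    )
    -- Same column declarations as in the answer: read both columns as text
    WITH (
        [SEARCH_DEPARTURE_DATE] VARCHAR(100) COLLATE Latin1_General_BIN2 3,
        [POST_SEARCH_DEPARTURE_DATE] VARCHAR(100) COLLATE Latin1_General_BIN2 2
    ) AS [result]
```

No rows are dropped this way, which fits the requirement that nothing be removed from the raw file. The rows with unparseable values simply show NULL in the two date columns.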