Hive 日期字符串验证
Hive Date String Validation
我正在尝试检查字符串是否为“YYYYMMDD”的有效日期格式。
我正在使用以下技术。但是对于无效的日期字符串,我得到了有效的日期结果。
我做错了什么?
SELECT'20019999',CASE WHEN unix_timestamp('20019999','YYYYMMDD') > 0 THEN 'Good'ELSE 'Bad'END;
首先,你使用的格式不对
select from_unixtime(unix_timestamp()) as default_format
,from_unixtime(unix_timestamp(),'YYYY-MM-DD') as wrong_format
,from_unixtime(unix_timestamp(),'yyyy-MM-dd') as right_format
;
+----------------------+---------------+---------------+
| default_format | wrong_format | right_format |
+----------------------+---------------+---------------+
| 2017-10-07 04:13:26 | 2017-10-280 | 2017-10-07 |
+----------------------+---------------+---------------+
其次,没有对日期部分范围进行验证。
如果您将日期部分增加 1,它会将您转到第二天。
with t as (select stack(7,'27','28','29','30','31','32','33') as dy)
select t.dy
,from_unixtime(unix_timestamp(concat('2017-02-',t.dy),'yyyy-MM-dd'),'yyyy-MM-dd') as dt
from t
;
+-----+-------------+
| dy | dt |
+-----+-------------+
| 27 | 2017-02-27 |
| 28 | 2017-02-28 |
| 29 | 2017-03-01 |
| 30 | 2017-03-02 |
| 31 | 2017-03-03 |
| 32 | 2017-03-04 |
| 33 | 2017-03-05 |
+-----+-------------+
如果您将月份部分增加 1,它会将您转到下一个月。
with t as (select stack(5,'10','11','12','13','14') as mn)
select t.mn
,from_unixtime(unix_timestamp(concat('2017-',t.mn,'-01'),'yyyy-MM-dd'),'yyyy-MM-dd') as dt
from t
;
+-----+-------------+
| mn | dt |
+-----+-------------+
| 10 | 2017-10-01 |
| 11 | 2017-11-01 |
| 12 | 2017-12-01 |
| 13 | 2018-01-01 |
| 14 | 2018-02-01 |
+-----+-------------+
即使使用 CAST,验证也只针对零件范围而不是日期本身。
select cast('2010-02-32' as date);
+-------+
| _c0 |
+-------+
| NULL |
+-------+
select cast('2010-02-29' as date);
+-------------+
| _c0 |
+-------------+
| 2010-03-01 |
+-------------+
这是实现目标的方法:
with t as (select '20019999' as dt)
select dt
,from_unixtime(unix_timestamp(dt,'yyyyMMdd'),'yyyyMMdd') as double_converted_dt
,case
when from_unixtime(unix_timestamp(dt,'yyyyMMdd'),'yyyyMMdd') = dt
then 'Good'
else 'Bad'
end as dt_status
from t
;
+-----------+----------------------+------------+
| dt | double_converted_dt | dt_status |
+-----------+----------------------+------------+
| 20019999 | 20090607 | Bad |
+-----------+----------------------+------------+
我正在尝试检查字符串是否为“YYYYMMDD”的有效日期格式。
我正在使用以下技术。但是对于无效的日期字符串,我得到了有效的日期结果。
我做错了什么?
SELECT'20019999',CASE WHEN unix_timestamp('20019999','YYYYMMDD') > 0 THEN 'Good'ELSE 'Bad'END;
首先,你使用的格式不对
select from_unixtime(unix_timestamp()) as default_format
,from_unixtime(unix_timestamp(),'YYYY-MM-DD') as wrong_format
,from_unixtime(unix_timestamp(),'yyyy-MM-dd') as right_format
;
+----------------------+---------------+---------------+
| default_format | wrong_format | right_format |
+----------------------+---------------+---------------+
| 2017-10-07 04:13:26 | 2017-10-280 | 2017-10-07 |
+----------------------+---------------+---------------+
其次,没有对日期部分范围进行验证。
如果您将日期部分增加 1,它会将您转到第二天。
with t as (select stack(7,'27','28','29','30','31','32','33') as dy)
select t.dy
,from_unixtime(unix_timestamp(concat('2017-02-',t.dy),'yyyy-MM-dd'),'yyyy-MM-dd') as dt
from t
;
+-----+-------------+
| dy | dt |
+-----+-------------+
| 27 | 2017-02-27 |
| 28 | 2017-02-28 |
| 29 | 2017-03-01 |
| 30 | 2017-03-02 |
| 31 | 2017-03-03 |
| 32 | 2017-03-04 |
| 33 | 2017-03-05 |
+-----+-------------+
如果您将月份部分增加 1,它会将您转到下一个月。
with t as (select stack(5,'10','11','12','13','14') as mn)
select t.mn
,from_unixtime(unix_timestamp(concat('2017-',t.mn,'-01'),'yyyy-MM-dd'),'yyyy-MM-dd') as dt
from t
;
+-----+-------------+
| mn | dt |
+-----+-------------+
| 10 | 2017-10-01 |
| 11 | 2017-11-01 |
| 12 | 2017-12-01 |
| 13 | 2018-01-01 |
| 14 | 2018-02-01 |
+-----+-------------+
即使使用 CAST,验证也只针对零件范围而不是日期本身。
select cast('2010-02-32' as date);
+-------+
| _c0 |
+-------+
| NULL |
+-------+
select cast('2010-02-29' as date);
+-------------+
| _c0 |
+-------------+
| 2010-03-01 |
+-------------+
这是实现目标的方法:
with t as (select '20019999' as dt)
select dt
,from_unixtime(unix_timestamp(dt,'yyyyMMdd'),'yyyyMMdd') as double_converted_dt
,case
when from_unixtime(unix_timestamp(dt,'yyyyMMdd'),'yyyyMMdd') = dt
then 'Good'
else 'Bad'
end as dt_status
from t
;
+-----------+----------------------+------------+
| dt | double_converted_dt | dt_status |
+-----------+----------------------+------------+
| 20019999 | 20090607 | Bad |
+-----------+----------------------+------------+