Finding the best way to pull the last 7 days with Impala when datetime is a string
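I'm trying to work with a dataset we've started ingesting, and of course "devicereceipttime" is stored as a string; I can't convince anyone to change that right now. However, "year", "month", "day", and "hour" are broken out into separate fields as ints. It looks like this: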
devicereceipttime(string) year(int) month(int) day(int) hour(int)
2018-06-19T05:00:06.265Z 2018 6 19 5
2018-06-19T18:53:56.776Z 2018 6 19 6
2018-06-19T02:10:05.252Z 2018 6 19 2
2018-06-19T12:14:01.395Z 2018 6 19 12
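I'm using Impala and want to run a query like the one below, but one that works with the column types above, using either the "devicereceipttime" string value or the y/m/d integers. I want to capture one full week (7 consecutive days), so I'll probably schedule the report to run in CDSW on a Saturday or a Monday.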
Here is the query I used when the datetime string format was "yyyy-mm-dd hh:mm:ss":
select *
from winworkstations_realtime
where devicereceipttime BETWEEN concat(to_date(now() - interval 7 days), " 00:00:00") and concat(to_date(now() - interval 1 days), " 23:59:59")
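Is it better to use the string, or to try to work it out from the separate integer columns?
I came up with this to satisfy the query: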
devicereceipttime BETWEEN concat(to_date(now() - interval 7 days), "T00:00:00.000Z") and concat(to_date(now() - interval 1 days), "T23:59:59.999Z")
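The string comparison above works because every value has the same fixed-width ISO-8601 layout, so lexicographic order matches chronological order. As a rough sanity check outside Impala (for example in a CDSW Python session), the same bounds can be sketched as below. `week_bounds` and `in_window` are hypothetical helper names, and the sketch assumes every "devicereceipttime" value ends in "Z" with millisecond precision, as in the sample rows.

```python
from datetime import date, timedelta


def week_bounds(today):
    """Return (low, high) strings covering the 7 full days before `today`.

    Mirrors the Impala expression: to_date(now() - interval N days)
    concatenated with a fixed time-of-day suffix.
    """
    low = (today - timedelta(days=7)).isoformat() + "T00:00:00.000Z"
    high = (today - timedelta(days=1)).isoformat() + "T23:59:59.999Z"
    return low, high


def in_window(devicereceipttime, today):
    """Plain string comparison, equivalent to SQL BETWEEN low AND high."""
    low, high = week_bounds(today)
    return low <= devicereceipttime <= high


if __name__ == "__main__":
    run_day = date(2018, 6, 25)  # assumed report day, for illustration only
    print(week_bounds(run_day))
    print(in_window("2018-06-19T05:00:06.265Z", run_day))
```

If the strings ever carried more than three fractional digits, a half-open range (`>= low AND < start of today`) would be the safer form, since it does not depend on the "23:59:59.999" suffix.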
select w.destinationhostname, w.destinationusername, w.destinationprocessname, count(*) as count
from winworkstations_realtime w
where w.devicereceipttime BETWEEN concat(to_date(now() - interval 7 days), "T00:00:00.000Z")
                              and concat(to_date(now() - interval 1 days), "T23:59:59.999Z")
  AND w.externalid = "4688"
  AND w.destinationhostname like "T%"
  AND (w.destinationusername not like "%$" AND w.destinationusername not like "LOCAL%" AND w.destinationusername not like "-")
group by w.destinationhostname, w.destinationusername, w.destinationprocessname
order by 1, 2
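On the string-versus-integers question: if "year", "month", and "day" happen to be partition columns (an assumption, the post does not say), a filter on them may let Impala skip whole partitions, which the string filter alone cannot do. Since a 7-day window can cross a month or year boundary, one simple approach is to list the seven dates explicitly. A minimal Python sketch that builds such a predicate for the report script follows; `last_seven_days` and `ymd_predicate` are hypothetical helper names.

```python
from datetime import date, timedelta


def last_seven_days(today):
    """The 7 calendar dates before `today`, oldest first."""
    return [today - timedelta(days=n) for n in range(7, 0, -1)]


def ymd_predicate(today):
    """Build an OR-ed SQL predicate over the integer year/month/day columns.

    Only integers taken from date objects are interpolated, so the
    resulting text contains no user-supplied input.
    """
    clauses = [
        f"(year = {d.year} AND month = {d.month} AND day = {d.day})"
        for d in last_seven_days(today)
    ]
    return "(" + " OR ".join(clauses) + ")"


if __name__ == "__main__":
    # Assumed run day chosen so the window crosses a month boundary.
    print(ymd_predicate(date(2018, 7, 3)))
```

The generated text could be appended to the WHERE clause with AND, alongside the "devicereceipttime" range, so the integer columns narrow the scan and the string range keeps the exact boundaries.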