BigQuery comparing DATE and TIMESTAMP

Here is an example that I use in MySQL. In BigQuery, however, my OnSite timestamp is a DATE while my Documents timestamp is a TIMESTAMP.

BigQuery has trouble with the query below; I get this message:

No matching signature for function DATE for argument types: DATE. Supported signatures: DATE(TIMESTAMP, [STRING]); DATE(DATETIME); DATE(INT64, INT64, INT64) at [8:146]

Does anyone know what I need to change so that the query works when comparing a DATE with a TIMESTAMP?

Schema (MySQL v5.7)

CREATE TABLE OnSite
    (`uid` varchar(55), `worksite_id`  varchar(55), `timestamp` datetime)
;

INSERT INTO OnSite
    (`uid`, `worksite_id`, `timestamp`)
VALUES
  ("u12345", "worksite_1", '2019-01-01'),
  ("u12345", "worksite_1", '2019-01-02'),
  ("u12345", "worksite_1", '2019-01-03'),
  ("u12345", "worksite_1", '2019-01-04'),
  ("u12345", "worksite_1", '2019-01-05'),
  ("u12345", "worksite_1", '2019-01-06'),
  ("u1", "worksite_1", '2019-01-01'),
  ("u1", "worksite_1", '2019-01-02'),
  ("u1", "worksite_1", '2019-01-05'),
  ("u1", "worksite_1", '2019-01-06')

;


CREATE TABLE Documents
    (`document_id` varchar(55), `uid` varchar(55), `worksite_id`  varchar(55), `type` varchar(55), `timestamp` datetime)
;

INSERT INTO Documents
    (`document_id`, `uid`, `worksite_id`, `type`, `timestamp`)

VALUES
  ("1",     "u12345",   "worksite_1", 'work_permit',    '2019-01-01 00:00:00'),
  ("2",     "u12345",   "worksite_2", 'job',            '2019-01-02 00:00:00'),
  ("3",     "u12345",   "worksite_1", 'work_permit',    '2019-01-03 00:00:00'),
  ("4",     "u12345",   "worksite_2", 'job',            '2019-01-04 00:00:00'),
  ("5",     "u12345",   "worksite_1", 'work_permit',    '2019-01-05 00:00:00'),
  ("6",     "u12345",   "worksite_2", 'job',            '2019-01-06 00:00:00'),
  ("7",     "u12345",   "worksite_1", 'work_permit',    '2019-01-07 00:00:00'),
  ("8",     "u12345",   "worksite_2", 'work_permit',    '2019-01-09 00:00:00'),
  ("9",     "u12345",   "worksite_1", 'job',            '2019-01-09 00:00:00'),
  ("10",    "u12345",   "worksite_2", 'work_permit',    '2019-01-09 00:00:00'),
  ("11",    "u12345",   "worksite_1", 'work_permit',    '2019-01-09 00:00:00'),
  ("12",    "u12345",   "worksite_2", 'work_permit',    '2019-01-09 00:00:00'),
  ("13",    "u12345",   "worksite_1", 'job',            '2019-01-09 00:00:00'),
  ("14",    "u12345",   "worksite_2", 'work_permit',    '2019-01-09 00:00:00'),
  ("15",    "u12345",   "worksite_1", 'work_permit',    '2019-01-09 00:00:00')

;

Query #1

SELECT
  IFNULL(OnSite.worksite_id, Documents.worksite_id) as `Worksite`,
  DATE(IFNULL(OnSite.timestamp, Documents.timestamp)) as `Date`,
  COUNT(Documents.worksite_id) as `Users_on_Site`,
  COUNT(DISTINCT OnSite.uid) as `Completed`

FROM OnSite
  LEFT JOIN Documents ON OnSite.worksite_id = Documents.worksite_id AND DATE(OnSite.timestamp) = DATE(Documents.timestamp)
GROUP BY `Date`, `Worksite`;

| Worksite   | Date       | Users_on_Site | Completed |
| ---------- | ---------- | ------------- | --------- |
| worksite_1 | 2019-01-01 | 2             | 2         |
| worksite_1 | 2019-01-02 | 0             | 2         |
| worksite_1 | 2019-01-03 | 1             | 1         |
| worksite_1 | 2019-01-04 | 0             | 1         |
| worksite_1 | 2019-01-05 | 2             | 2         |
| worksite_1 | 2019-01-06 | 0             | 2         |


The BigQuery documentation explains that the DATE function accepts the following inputs:

  1. DATE(year, month, day) : Constructs a DATE from INT64 values representing the year, month, and day.

  2. DATE(timestamp_expression[, timezone]) : Converts a timestamp_expression to a DATE data type. It supports an optional parameter to specify a timezone. If no timezone is specified, the default timezone, UTC, is used.
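As a rough illustration of the signatures listed above (plus the DATE(DATETIME) form that appears in the error message from the question), the following sketch calls each one on literal values. The literals and column aliases are made up for illustration and are not taken from the question's tables.

```sql
#standardSQL
SELECT
  -- DATE(year, month, day): builds a DATE from three INT64 values
  DATE(2019, 1, 2) AS from_parts,                                  -- 2019-01-02
  -- DATE(timestamp_expression): uses UTC when no time zone is given
  DATE(TIMESTAMP '2019-01-02 23:30:00+00') AS from_timestamp_utc,  -- 2019-01-02
  -- DATE(timestamp_expression, timezone): same instant, seen from UTC+9
  DATE(TIMESTAMP '2019-01-02 23:30:00+00', 'Asia/Tokyo') AS from_timestamp_tokyo,  -- 2019-01-03
  -- DATE(DATETIME): drops the time part of a DATETIME
  DATE(DATETIME '2019-01-02 10:30:00') AS from_datetime            -- 2019-01-02
```

None of these signatures takes a DATE argument, which matches the "argument types: DATE" complaint in the error message.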

In your use case, the value you are passing to DATE appears to already be a date/datetime value. For that you can use DATETIME_TRUNC, for example:

DATETIME_TRUNC(IFNULL(OnSite.timestamp, Documents.timestamp), DAY)
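To make the behaviour of DATETIME_TRUNC concrete, here is a minimal sketch on a literal DATETIME (the literal and aliases are illustrative assumptions). Note that truncating to DAY still yields a DATETIME, whereas DATE(...) yields a DATE, so the two are not interchangeable as result types.

```sql
#standardSQL
SELECT
  -- Truncates to midnight of the same day; the result is still a DATETIME
  DATETIME_TRUNC(DATETIME '2019-01-02 13:45:10', DAY) AS truncated,  -- 2019-01-02T00:00:00
  -- Converts to a DATE, discarding the time part
  DATE(DATETIME '2019-01-02 13:45:10') AS as_date                    -- 2019-01-02
```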

The following works in BigQuery Standard SQL:

#standardSQL
SELECT
  IFNULL(OnSite.worksite_id, Documents.worksite_id) AS `Worksite`,
  IFNULL(OnSite.timestamp, DATE(Documents.timestamp)) AS `DATE`,
  COUNT(Documents.worksite_id) AS `Users_on_Site`,
  COUNT(DISTINCT OnSite.uid) AS `Completed`
FROM `project.dataset.OnSite` OnSite
LEFT JOIN `project.dataset.Documents` Documents 
ON OnSite.worksite_id = Documents.worksite_id 
AND OnSite.timestamp = DATE(Documents.timestamp)
GROUP BY `DATE`, `Worksite`

Applied to the sample data from your question:

WITH `project.dataset.OnSite` AS (
  SELECT "u12345" uid, "worksite_1" worksite_id, DATE '2019-01-01' `TIMESTAMP` UNION ALL
  SELECT "u12345", "worksite_1", '2019-01-02' UNION ALL
  SELECT "u12345", "worksite_1", '2019-01-03' UNION ALL
  SELECT "u12345", "worksite_1", '2019-01-04' UNION ALL
  SELECT "u12345", "worksite_1", '2019-01-05' UNION ALL
  SELECT "u12345", "worksite_1", '2019-01-06' UNION ALL
  SELECT "u1", "worksite_1", '2019-01-01' UNION ALL
  SELECT "u1", "worksite_1", '2019-01-02' UNION ALL
  SELECT "u1", "worksite_1", '2019-01-05' UNION ALL
  SELECT "u1", "worksite_1", '2019-01-06' 
), `project.dataset.Documents` AS (
  SELECT "1" document_id,     "u12345" uid,   "worksite_1" worksite_id, 'work_permit' type,    TIMESTAMP '2019-01-01 00:00:00' `TIMESTAMP` UNION ALL
  SELECT "2",     "u12345",   "worksite_2", 'job',            '2019-01-02 00:00:00' UNION ALL
  SELECT "3",     "u12345",   "worksite_1", 'work_permit',    '2019-01-03 00:00:00' UNION ALL
  SELECT "4",     "u12345",   "worksite_2", 'job',            '2019-01-04 00:00:00' UNION ALL
  SELECT "5",     "u12345",   "worksite_1", 'work_permit',    '2019-01-05 00:00:00' UNION ALL
  SELECT "6",     "u12345",   "worksite_2", 'job',            '2019-01-06 00:00:00' UNION ALL
  SELECT "7",     "u12345",   "worksite_1", 'work_permit',    '2019-01-07 00:00:00' UNION ALL
  SELECT "8",     "u12345",   "worksite_2", 'work_permit',    '2019-01-09 00:00:00' UNION ALL
  SELECT "9",     "u12345",   "worksite_1", 'job',            '2019-01-09 00:00:00' UNION ALL
  SELECT "10",    "u12345",   "worksite_2", 'work_permit',    '2019-01-09 00:00:00' UNION ALL
  SELECT "11",    "u12345",   "worksite_1", 'work_permit',    '2019-01-09 00:00:00' UNION ALL
  SELECT "12",    "u12345",   "worksite_2", 'work_permit',    '2019-01-09 00:00:00' UNION ALL
  SELECT "13",    "u12345",   "worksite_1", 'job',            '2019-01-09 00:00:00' UNION ALL
  SELECT "14",    "u12345",   "worksite_2", 'work_permit',    '2019-01-09 00:00:00' UNION ALL
  SELECT "15",    "u12345",   "worksite_1", 'work_permit',    '2019-01-09 00:00:00' 
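)
-- Same query as above, run against the inline sample data
SELECT
  IFNULL(OnSite.worksite_id, Documents.worksite_id) AS `Worksite`,
  IFNULL(OnSite.timestamp, DATE(Documents.timestamp)) AS `DATE`,
  COUNT(Documents.worksite_id) AS `Users_on_Site`,
  COUNT(DISTINCT OnSite.uid) AS `Completed`
FROM `project.dataset.OnSite` OnSite
LEFT JOIN `project.dataset.Documents` Documents
ON OnSite.worksite_id = Documents.worksite_id
AND OnSite.timestamp = DATE(Documents.timestamp)
GROUP BY `DATE`, `Worksite`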

the result is as expected:

Row Worksite    Date        Users_on_Site   Completed    
1   worksite_1  2019-01-01  2               2    
2   worksite_1  2019-01-02  0               2    
3   worksite_1  2019-01-03  1               1    
4   worksite_1  2019-01-04  0               1    
5   worksite_1  2019-01-05  2               2    
6   worksite_1  2019-01-06  0               2    
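One thing worth keeping in mind with a DATE = DATE(TIMESTAMP) join condition is the time zone: DATE(timestamp) uses UTC unless a zone is passed. The sketch below is only an illustration with a made-up timestamp and an assumed zone ('America/Los_Angeles'); whether a zone is needed depends on how the DATE column in OnSite was produced.

```sql
#standardSQL
SELECT
  ts,
  DATE(ts) AS utc_date,                                                 -- 2019-01-01
  DATE(ts, 'America/Los_Angeles') AS local_date,                        -- 2018-12-31
  DATE '2019-01-01' = DATE(ts) AS matches_utc,                          -- true
  DATE '2019-01-01' = DATE(ts, 'America/Los_Angeles') AS matches_local  -- false
FROM UNNEST([TIMESTAMP '2019-01-01 03:00:00+00']) AS ts
```

The sample data in the question only contains midnight UTC timestamps, so the default UTC conversion gives the expected rows there.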

Why not just cast everything and make life easier :-)? All of these should work:

select 
   date(timestamp('2019-01-02')), 
   date(timestamp('2019-01-02 00:00:00')), 
   date(timestamp(null))

So, in your IFNULL expression:

SELECT
  IFNULL(OnSite.worksite_id, Documents.worksite_id) as `Worksite`,
  IFNULL(date(datetime(OnSite.timestamp)),date(datetime(Documents.timestamp))) as `Date`,
  COUNT(Documents.worksite_id) as `Users_on_Site`,
  COUNT(DISTINCT OnSite.uid) as `Completed`
FROM OnSite
  LEFT JOIN Documents ON OnSite.worksite_id = Documents.worksite_id AND DATE(datetime(OnSite.timestamp)) = DATE(datetime(Documents.timestamp))
GROUP BY `Date`, `Worksite`;
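The reason the DATE(DATETIME(...)) wrapper can be applied to both columns is that DATETIME accepts either a DATE or a TIMESTAMP, and DATE accepts the resulting DATETIME. A small sketch on literal values (the literals and aliases are illustrative, not taken from the tables):

```sql
#standardSQL
SELECT
  -- DATE input: DATE -> DATETIME (midnight) -> DATE
  DATE(DATETIME(DATE '2019-01-02')) AS from_date,                      -- 2019-01-02
  -- TIMESTAMP input: TIMESTAMP -> DATETIME (UTC by default) -> DATE
  DATE(DATETIME(TIMESTAMP '2019-01-02 00:00:00+00')) AS from_timestamp, -- 2019-01-02
  -- Both sides are now DATE values, so they can be compared directly
  DATE(DATETIME(DATE '2019-01-02'))
    = DATE(DATETIME(TIMESTAMP '2019-01-02 00:00:00+00')) AS same_day   -- true
```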