Return only last statement from Snowflake SQL query to R

I have a Snowflake SQL query that I am trying to execute from R over an ODBC connection, as shown below:

SET quiet=TRUE;

USE SOMEDATABASE.SOMESCHEMA;

--Select timestamp of last sale per customer
DROP TABLE IF EXISTS sales;
CREATE TEMPORARY TABLE sales(CustomerId VARCHAR(16777216), SaleTS TIMESTAMP_NTZ(9));

INSERT INTO sales
SELECT CustomerId, 
       SaleTS
FROM SALES
WHERE SaleTS>= '2020-11-19 00:00:00'
AND SaleTS <= '2020-11-19 23:59:59.999'
GROUP BY CustomerId;

--Use temp table to get correct row from sales table
SELECT  SUM(SalesDetail.price) as SumPrice,
        COUNT(*) as SoldVolume
FROM sales
LEFT JOIN SALES as SalesDetail
    ON Sales.CustomerId = SalesDetail.CustomerId 
    AND sales.SaleTS = SalesDetail.SaleTS 

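When querying Microsoft SQL Server from R, I usually include set nocount on; at the top of the query so that only the last step is returned to R. That avoids the error Actual statement count 6 did not match the desired statement count 1. The error makes sense: SQL returns 6 components (one for each step in my SQL query) when R expects 1. In Snowflake, there does not seem to be an option that works the same way as set nocount on.

My question is how to avoid the error above. Does anyone have experience executing multi-step Snowflake SQL queries from R? How can I make R receive only the last statement from the ODBC connection? So far I have tried set nocount=TRUE; set echo=FALSE; set message=FALSE; and SET quiet=TRUE;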

Snowflake SQL is expressive enough that the proposed code can be written as a single query:

WITH cte AS (
    SELECT CustomerId, MAX(SaleTS) AS SaleTS  -- an aggregate function is required here
    FROM SALES
    WHERE SaleTS >= '2020-11-19 00:00:00'
    AND SaleTS <= '2020-11-19 23:59:59.999'
    GROUP BY CustomerId
)
SELECT  SUM(SalesDetail.price) AS SumPrice,
        COUNT(*) AS SoldVolume
FROM cte
LEFT JOIN SALES AS SalesDetail
    ON cte.CustomerId = SalesDetail.CustomerId  -- the CTE replaces the temporary table
    AND cte.SaleTS = SalesDetail.SaleTS;

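The original query uses the same name for the table and the temporary table, differing only in case (sales vs. SALES), which is error-prone. Unquoted identifiers in Snowflake are case-insensitive, so the two spellings resolve to the same name.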

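Second: the database and schema can be set when the connection is established, so there is no need for USE inside the script. Alternatively, fully qualified names can be used in the script.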


My guess is that the intent of the query is as follows:

WITH cte AS (
  SELECT *
  FROM SOMEDATABASE.SOMESCHEMA.SALES
  WHERE SaleTS BETWEEN '2020-11-19 00:00:00' AND '2020-11-19 23:59:59.999'
  QUALIFY ROW_NUMBER() OVER(PARTITION BY CustomerId ORDER BY SaleTS DESC) = 1
)
SELECT COUNT(*) AS SoldVolume, SUM(price) as SumPrice
FROM cte;

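If a customer can have two entries with exactly the same SaleTS, use RANK() OVER(...) instead: ROW_NUMBER() keeps only one of the tied rows, chosen arbitrarily, while RANK() keeps all of them.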
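To see the difference on tied timestamps without a Snowflake account, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in. The toy rows and the helper name latest_sales are made up for illustration. SQLite has no QUALIFY clause, so the window filter is moved into an outer WHERE, and window functions need SQLite 3.25 or newer. It only illustrates how the two functions treat ties; it is not a Snowflake implementation.

```python
import sqlite3

# Hypothetical toy data: customer "A" has two sales sharing the latest timestamp.
rows = [
    ("A", "2020-11-19 10:00:00", 5.0),
    ("A", "2020-11-19 18:00:00", 7.0),
    ("A", "2020-11-19 18:00:00", 3.0),  # tie on the latest SaleTS
    ("B", "2020-11-19 09:00:00", 4.0),
]


def latest_sales(window_fn):
    """Return (SoldVolume, SumPrice) for rows where window_fn() = 1 per customer."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE SALES (CustomerId TEXT, SaleTS TEXT, price REAL)")
    con.executemany("INSERT INTO SALES VALUES (?, ?, ?)", rows)
    # SQLite lacks QUALIFY, so the ranking is filtered in an outer query instead.
    query = f"""
        WITH ranked AS (
            SELECT price,
                   {window_fn}() OVER (
                       PARTITION BY CustomerId ORDER BY SaleTS DESC
                   ) AS rn
            FROM SALES
        )
        SELECT COUNT(*), SUM(price) FROM ranked WHERE rn = 1
    """
    result = con.execute(query).fetchone()
    con.close()
    return result


row_number_result = latest_sales("ROW_NUMBER")  # one row per customer
rank_result = latest_sales("RANK")              # tied rows are all kept

print("ROW_NUMBER:", row_number_result)
print("RANK:      ", rank_result)
```

With this toy data, ROW_NUMBER() counts two sales (one per customer, with the tied row for "A" picked arbitrarily), while RANK() counts three and sums all tied prices.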