How to create a SQL Server table from a dplyr pipeline

Because of a bug in dbplyr, copy_to and compute currently do not work with SQL Server connections.

library(dplyr)

connStr <- "driver=ODBC Driver 13 for SQL Server;server=localhost;..."
db <- DBI::dbConnect(odbc::odbc(), .connection_string=connStr)

copy_to(db, mtcars)
#Error: <SQL> 'CREATE TEMPORARY TABLE "mtcars" (
#  "row_names" varchar(255),
#  "mpg" FLOAT,
#  ...
#  nanodbc/nanodbc.cpp:1587: 42000: [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Unknown object type 'TEMPORARY' used in a CREATE, DROP, or ALTER statement. 

# use raw DBI functionality to create table
DBI::dbWriteTable(db, "mtcars", mtcars)

qry <- tbl(db, "mtcars") %>% group_by(am) %>% summarise(m=mean(mpg))

compute(qry)
#Error: <SQL> 'CREATE TEMPORARY TABLE "isrxofsskr" AS SELECT "am" AS "am", "m" AS "m"
#FROM (SELECT "am", AVG("mpg") AS "m"
#FROM "mtcars"
#GROUP BY "am") "htrkkxabrn"'
#  nanodbc/nanodbc.cpp:1587: 42000: [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Unknown object type 'TEMPORARY' used in a CREATE, DROP, or ALTER statement. 

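There is an active PR on the dbplyr repo that fixes this issue, but no indication of when it will be merged (or when it will reach CRAN). In the meantime, how can I create a table from a query without reading the data into R?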

As it turns out, the PR on the dbplyr repo is broken anyway: it pulls the entire table into memory before writing it back.

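Fixing the problem requires defining a couple of MSSQL-specific methods for dbplyr generics. They are listed below. I have also submitted them to the dbplyr repo, so (assuming they work) they should be merged soon.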

#' @export
`db_compute.Microsoft SQL Server` <- function(con, table, sql, temporary=TRUE,
     unique_indexes=list(), indexes=list(), ...)
{
    # check that name has prefixed '##' if temporary
    if(temporary && substr(table, 1, 1) != "#")
        table <- paste0("##", table)

    if(!is.list(indexes))
        indexes <- as.list(indexes)

    if(!is.list(unique_indexes))
        unique_indexes <- as.list(unique_indexes)

    db_save_query(con, sql, table, temporary=temporary)
    db_create_indexes(con, table, unique_indexes, unique=TRUE)
    db_create_indexes(con, table, indexes, unique=FALSE)
    table
}


#' @export
`db_save_query.Microsoft SQL Server` <- function(con, sql, name, temporary=TRUE, ...)
{
    # check that name has prefixed '##' if temporary
    if(temporary && substr(name, 1, 1) != "#")
        name <- paste0("##", name)

    tt_sql <- build_sql("SELECT * INTO ", ident_q(name),
                        " FROM (", sql, ") ", ident_q(name), con=con)

    dbExecute(con, tt_sql)
    name
}
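
To make it easier to see what statement the db_save_query method sends to the server, here is a minimal base-R sketch that builds the same kind of string without a database connection. The helper name sketch_select_into and the table name mtcars_summary are hypothetical, introduced only for illustration; the real method uses build_sql and ident_q rather than paste0.

```r
# Hypothetical helper, for illustration only: mirrors the shape of the
# statement built by the db_save_query method, using plain string pasting.
sketch_select_into <- function(sql, name, temporary = TRUE) {
    # same rule as the method above: a temporary table gets a '##' prefix
    # unless the name already starts with '#'
    if (temporary && substr(name, 1, 1) != "#")
        name <- paste0("##", name)

    # SQL Server creates a table from a query with SELECT ... INTO;
    # the name is reused as the alias of the derived table
    paste0("SELECT * INTO ", name, " FROM (", sql, ") ", name)
}

inner <- 'SELECT "am", AVG("mpg") AS "m" FROM "mtcars" GROUP BY "am"'
stmt <- sketch_select_into(inner, "mtcars_summary")
cat(stmt, "\n")
```

The point of the rewrite is that SQL Server has no CREATE TEMPORARY TABLE ... AS statement; a temporary table is identified by its # or ## name prefix, and SELECT ... INTO materialises the query result on the server without pulling rows into R.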

Note: this may not be resistant to Bobby Tables (SQL injection). Testing is advised.
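
The injection risk comes from ident_q, which marks a name as already quoted, so the table name is inserted into the statement verbatim. One possible mitigation, sketched below, is to quote the name with DBI::dbQuoteIdentifier before it is interpolated. This is an assumption about how the methods could be hardened, not part of the code above; DBI::ANSI() is used here as a stand-in connection so the example runs without a server, and a real SQL Server connection may quote differently.

```r
library(DBI)

# DBI::ANSI() is a dummy connection that applies ANSI quoting rules;
# it stands in for the real SQL Server connection in this sketch.
con <- DBI::ANSI()

# a hostile table name that tries to break out of the identifier
bad_name <- 'x"; DROP TABLE students; --'

# dbQuoteIdentifier wraps the name in double quotes and doubles any
# embedded double quote, so the whole string stays a single identifier
quoted <- DBI::dbQuoteIdentifier(con, bad_name)
print(quoted)
```

If this approach were adopted, the '##' prefix would need to be added before quoting so that it ends up inside the quoted identifier.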