Way to use glue_sql() and avoid paste in dynamic SELECT statement?
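I'm learning how to query SQLite databases from R and am building those queries with glue_sql(). Below is a simplified example of a subquery from my workflow. Is there a way I can create s10_wtX and s20_wtX without using paste0(), as in the code below?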
library(DBI)
library(dplyr)
library(glue)

# example database
set.seed(1)
ps <- data.frame(plot = rep(1:3, each = 4),
                 spp = rep(1:3*10, 2),
                 wtX = rnorm(12, 10, 2) %>% round(1))
con <- dbConnect(RSQLite::SQLite(), "")
dbWriteTable(con, "ps", ps)

# species of interest
our_spp <- c(10, 20)

# for the spp of interest, sum wtX on each plot
sq <- glue_sql(paste0(
  'SELECT ps.plot,\n',
  paste0('SUM(CASE WHEN ps.spp = ', our_spp,
         ' THEN (ps.wtX) END) AS s', our_spp,
         '_wtX',
         collapse = ',\n'), '\n',
  ' FROM ps
  WHERE ps.spp IN ({our_spp*}) -- spp in our sample
  GROUP BY ps.plot'),
  .con = con)

# the result of the query should look like:
dbGetQuery(con, sq)
  plot s10_wtX s20_wtX
1    1    21.9    10.4
2    2    11.0    22.2
3    3     9.4    13.0
In my actual workflow I have more than two species of interest, so I'd rather not write out every line by hand (e.g., SUM(CASE WHEN ps.spp = 10 THEN (ps.wtX) END) AS s10_wtX).
To formalize my comment a bit (even if it isn't what you end up using), here it is in more detail:
out <- DBI::dbGetQuery(con, "
  select ps.plot, ps.spp, sum(ps.wtX) as wtX
  from ps
  where ps.spp in (10,20)
  group by ps.plot, ps.spp")
out
#   plot spp  wtX
# 1    1  10 21.9
# 2    1  20 10.4
# 3    2  10 11.0
# 4    2  20 22.2
# 5    3  10  9.4
# 6    3  20 13.0
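The species list in the query above is typed in by hand. As a minimal sketch (not necessarily what the answerer had in mind), the same long-format query can take its IN list from our_spp through glue_sql()'s {our_spp*} expansion, so no per-species SQL is written out. It rebuilds the question's example table in an in-memory SQLite database and assumes RSQLite is installed; sq_long is a name made up for illustration.

```r
library(DBI)
library(glue)

# Rebuild the example table from the question in an in-memory database
con <- dbConnect(RSQLite::SQLite(), ":memory:")
set.seed(1)
ps <- data.frame(plot = rep(1:3, each = 4),
                 spp  = rep(1:3 * 10, times = 4),
                 wtX  = round(rnorm(12, 10, 2), 1))
dbWriteTable(con, "ps", ps)

our_spp <- c(10, 20)

# {our_spp*} expands the vector into a comma-separated list of values
sq_long <- glue_sql("
  SELECT ps.plot, ps.spp, SUM(ps.wtX) AS wtX
  FROM ps
  WHERE ps.spp IN ({our_spp*})
  GROUP BY ps.plot, ps.spp",
  .con = con)

out <- dbGetQuery(con, sq_long)
dbDisconnect(con)
out
```

With two species and three plots this returns six rows, one per plot and species, which is the same shape as the output above.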
This can easily be reshaped to suit your needs. For example, using tidyr::pivot_wider:
tidyr::pivot_wider(out, id_cols = plot, names_from = "spp", values_from = "wtX")
# # A tibble: 3 x 3
#    plot  `10`  `20`
#   <int> <dbl> <dbl>
# 1     1  21.9  10.4
# 2     2  11    22.2
# 3     3   9.4  13
(The names need cleaning up.)
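One possible way to do that cleanup, sketched here as a suggestion rather than as part of the original answer, is pivot_wider()'s names_glue argument, which builds each new column name from a glue template. The data frame below copies the values from the long-format output shown earlier so the snippet runs on its own; wide is a name made up for illustration.

```r
library(tidyr)

# Long-format result, values copied from the query output shown earlier
out <- data.frame(
  plot = c(1L, 1L, 2L, 2L, 3L, 3L),
  spp  = c(10, 20, 10, 20, 10, 20),
  wtX  = c(21.9, 10.4, 11.0, 22.2, 9.4, 13.0)
)

# names_glue builds each column name from the spp value it came from
wide <- pivot_wider(out,
                    id_cols     = plot,
                    names_from  = spp,
                    values_from = wtX,
                    names_glue  = "s{spp}_wtX")
names(wide)
```

The resulting columns are plot, s10_wtX and s20_wtX, matching the names asked for in the question.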
The OP's original question was:
Is there a way I can create s10_wtX and s20_wtX without using paste0(), as in the code below?
If we want to build the query with glue alone, we can also use glue_collapse:
library(glue)

sq1 <- glue_sql(
  'SELECT ps.plot,',
  glue_collapse(
    glue('SUM(CASE WHEN ps.spp = {our_spp} THEN (ps.wtX) END) AS s{our_spp}_wtX'),
    sep = ",\n"),
  '\nFROM ps\n WHERE ps.spp IN ({our_spp*}) -- spp in our sample\n GROUP BY ps.plot',
  .con = con)

dbGetQuery(con, sq1)
  plot s10_wtX s20_wtX
1    1    21.9    10.4
2    2    11.0    22.2
3    3     9.4    13.0
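To see why this works, it may help to look at the two glue steps on their own, without a database. glue() is vectorized over our_spp, so it returns one column expression per species, and glue_collapse() then joins them into a single string for the SELECT list. The names cols and select_list are made up for this sketch.

```r
library(glue)

our_spp <- c(10, 20)

# glue() is vectorized: one SUM(CASE ...) expression per species
cols <- glue("SUM(CASE WHEN ps.spp = {our_spp} THEN (ps.wtX) END) AS s{our_spp}_wtX")
length(cols)

# glue_collapse() joins the expressions into one string
select_list <- glue_collapse(cols, sep = ",\n")
select_list
```

Adding more species to our_spp adds more expressions to the SELECT list without touching the template.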