SQL query in R using multiple AND statements based on values in a dataframe
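I have done this in the past by using sprintf() to build a SQL statement from a list of IDs chosen by the user. Now I want to build a SQL statement from lists of several fields, drawn from multiple tables, that the user selects, and I am at a loss.
I have a database with multiple tables: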
TBL1
Date   Program  Name      Type    height  width
5/22   E7       Square    angle   5       5
5/22   H9       Circle    smooth  4       4
9/9    E7       Circle    smooth  7       7
10/10  R8       Triangle  angle   10      5
TBL2
Date   Program  Name      Value1  Value2
5/22   E7       Square    5       2.4
5/22   H9       Circle    10      43
9/9    E7       Circle    3.2     9
10/10  R8       Triangle  999     1
TBL3
Type    1  2  3
angle   a  g  h
smooth  b  c  d
I have a dataframe of the rows I want to fetch from the database:
df
Date   Program  Name
5/22   E7       Square
9/9    E7       Circle
10/10  R8       Triangle
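I need to generate the SQL statement dynamically to collect the data I need. That is, for each Date, Program and Name combination in df, I need to get Date, Program, Name, Type, Value1 and Value2.
I tried to work out whether sqlInterpolate can handle this, but it seems it cannot?
I would join TBL1 and TBL3 on Type = Type, then join TBL2 for the values, all of this only for rows whose Date, Program and Name match my df.
Desired output of the SQL query: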
Date   Program  Name      Type    1  2  3  Value1  Value2
5/22   E7       Square    angle   a  g  h  5       2.4
9/9    E7       Circle    smooth  b  c  d  3.2     9
10/10  R8       Triangle  angle   a  g  h  999     1
Any thoughts?
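On the sqlInterpolate question: as far as I know, DBI's sqlInterpolate() fills each placeholder with a single quoted value, so it does not expand a whole dataframe in one call, but it can be applied once per row. The following is only a minimal sketch under that assumption; the table `tab1` is a cut-down, hypothetical stand-in for TBL1, and the names `template` and `per_row` are made up for illustration.

```r
library(DBI)
library(RSQLite)

# Hypothetical stand-in for TBL1 so the sketch runs on its own
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "tab1", data.frame(
  date    = c("5/22", "5/22", "9/9", "10/10"),
  program = c("E7", "H9", "E7", "R8"),
  name    = c("Square", "Circle", "Circle", "Triangle"),
  type    = c("angle", "smooth", "smooth", "angle")
))

# Rows the user asked for
df <- data.frame(
  date    = c("5/22", "9/9", "10/10"),
  program = c("E7", "E7", "R8"),
  name    = c("Square", "Circle", "Triangle")
)

# ?d, ?p and ?n are named placeholders filled in by sqlInterpolate()
template <- "SELECT t1.date, t1.program, t1.name, t1.type
             FROM tab1 t1
             WHERE t1.date = ?d AND t1.program = ?p AND t1.name = ?n"

# One interpolated (and safely quoted) query per row of df
per_row <- lapply(seq_len(nrow(df)), function(i) {
  sqlInterpolate(con, template,
                 d = df$date[i], p = df$program[i], n = df$name[i])
})

# Run each query and stack the results
res <- do.call(rbind, lapply(per_row, function(q) dbGetQuery(con, q)))
dbDisconnect(con)
```

This runs one query per row, which is fine for a handful of rows; for many rows a single statement is likely to be preferable.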
You can split df into rows, build one query per row, collapse the queries with 'UNION ALL', and then fetch the result:
library(RSQLite)
library(DBI)
library(tidyverse)
library(glue)
#>
#> Attaching package: 'glue'
#> The following object is masked from 'package:dplyr':
#>
#>     collapse

# Toy copies of TBL1, TBL2 and TBL3 from the question
t1 <- tibble(date    = c("5/22", "5/22", "9/9", "10/10"),
             program = c("E7", "H9", "E7", "R8"),
             name    = c("Square", "Circle", "Circle", "Triangle"),
             type    = c("angle", "smooth", "smooth", "angle"),
             height  = c(5, 4, 7, 10),
             width   = c(5, 4, 7, 5))

t2 <- tibble(date    = c("5/22", "5/22", "9/9", "10/10"),
             program = c("E7", "H9", "E7", "R8"),
             name    = c("Square", "Circle", "Circle", "Triangle"),
             val1    = c(5, 10, 3.2, 999),
             val2    = c(2.4, 43, 9, 1))

t3 <- tibble(type  = c("angle", "smooth"),
             one   = c("a", "b"),
             two   = c("g", "c"),
             three = c("h", "d"))

# Rows to fetch
df <- tibble(date    = c("5/22", "9/9", "10/10"),
             program = c("E7", "E7", "R8"),
             name    = c("Square", "Circle", "Triangle"))

con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "tab1", t1)
dbWriteTable(con, "tab2", t2)
dbWriteTable(con, "tab3", t3)

# Build one SELECT per row of df
queries <- df %>%
  split(seq_len(nrow(.))) %>%
  map(~{
    d <- .x$date
    p <- .x$program
    n <- .x$name
    # Values are pasted in as single-quoted SQL string literals. That is
    # fine for trusted values but does not protect against SQL injection.
    # tab2 is joined on date, program and name so that two names sharing
    # a date and program cannot be matched to each other's values.
    glue("SELECT t1.date,
                 t1.program,
                 t1.name,
                 t1.type,
                 t2.val1,
                 t2.val2,
                 t3.one,
                 t3.two,
                 t3.three
          FROM tab1 t1
          LEFT JOIN tab2 t2 ON t1.date = t2.date
                           AND t1.program = t2.program
                           AND t1.name = t2.name
          LEFT JOIN tab3 t3 ON t1.type = t3.type
          WHERE t1.date = '{d}'
            AND t1.program = '{p}'
            AND t1.name = '{n}'")
  })

# Combine the per-row queries into a single statement
q <- paste(queries, collapse = "\nUNION ALL\n")
dbGetQuery(con, q)
#>    date program     name   type  val1 val2 one two three
#> 1  5/22      E7   Square  angle   5.0  2.4   a   g     h
#> 2   9/9      E7   Circle smooth   3.2  9.0   b   c     d
#> 3 10/10      R8 Triangle  angle 999.0  1.0   a   g     h
Created on 2020-04-03 by the reprex package (v0.3.0)
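A possible alternative to building one query per row, offered only as a sketch and not as part of the answer above: write df to a temporary table and let the database do the matching with a join, so no values are pasted into the SQL text. The table name `wanted` is hypothetical, and the sketch assumes RSQLite's dbWriteTable() with `temporary = TRUE`; other backends may need a different way to create temporary tables.

```r
library(DBI)
library(RSQLite)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Same toy tables as in the reprex above
dbWriteTable(con, "tab1", data.frame(
  date    = c("5/22", "5/22", "9/9", "10/10"),
  program = c("E7", "H9", "E7", "R8"),
  name    = c("Square", "Circle", "Circle", "Triangle"),
  type    = c("angle", "smooth", "smooth", "angle")
))
dbWriteTable(con, "tab2", data.frame(
  date    = c("5/22", "5/22", "9/9", "10/10"),
  program = c("E7", "H9", "E7", "R8"),
  name    = c("Square", "Circle", "Circle", "Triangle"),
  val1    = c(5, 10, 3.2, 999),
  val2    = c(2.4, 43, 9, 1)
))
dbWriteTable(con, "tab3", data.frame(
  type  = c("angle", "smooth"),
  one   = c("a", "b"),
  two   = c("g", "c"),
  three = c("h", "d")
))

# Rows the user asked for
df <- data.frame(
  date    = c("5/22", "9/9", "10/10"),
  program = c("E7", "E7", "R8"),
  name    = c("Square", "Circle", "Triangle")
)

# "wanted" is a hypothetical name for the temporary lookup table
dbWriteTable(con, "wanted", df, temporary = TRUE)

# The inner join against "wanted" replaces the per-row WHERE clauses
res <- dbGetQuery(con, "
  SELECT t1.date, t1.program, t1.name, t1.type,
         t3.one, t3.two, t3.three,
         t2.val1, t2.val2
  FROM wanted w
  JOIN tab1 t1
    ON t1.date = w.date AND t1.program = w.program AND t1.name = w.name
  LEFT JOIN tab2 t2
    ON t2.date = t1.date AND t2.program = t1.program AND t2.name = t1.name
  LEFT JOIN tab3 t3
    ON t3.type = t1.type
")
dbDisconnect(con)
```

The row order of `res` is whatever the database returns; add an ORDER BY if a particular order matters.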