Case-sensitivity with dplyr + RPostgreSQL string matching

Question

I can't figure out how to apply a case-insensitive filter query to a remote PostgreSQL table using dplyr. To demonstrate:

require(dplyr)
require(stringr)
require(RPostgreSQL)

drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, dbname="mydb", host="localhost", port=5432, user="username")

# create db table
copy_to(con, iris, "iris", temporary = FALSE)

# dplyr remote database table
iris_pg <- tbl(con, "iris")

iris_pg %>% filter(str_detect(Species, 'setosa')) %>% head(3) %>% collect()
# A tibble: 3 x 5
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
*        <dbl>       <dbl>        <dbl>       <dbl> <chr>  
1          5.1         3.5          1.4         0.2 setosa 
2          4.9         3            1.4         0.2 setosa 
3          4.7         3.2          1.3         0.2 setosa

iris_pg %>% filter(str_detect(Species, 'Setosa')) %>% head(3) %>% collect()
# A tibble: 0 x 0

Ignoring case with stringr::fixed('Setosa', ignore_case=TRUE) works when filtering a tibble, but it has no effect on the Postgres table:

iris_pg %>% filter(str_detect(Species, stringr::fixed('SETOSA', ignore_case=TRUE))) %>% head(3) %>% collect()
# A tibble: 0 x 0
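
For comparison, here is a minimal sketch of the local case, where the pattern is evaluated by stringr itself rather than translated to SQL. The name iris_tbl is only an illustrative local copy of the built-in iris data and is not part of the database setup above.

```r
library(dplyr)
library(stringr)

# Local tibble copy of iris (hypothetical name, used only for this comparison).
iris_tbl <- as_tibble(iris)

# Default matching is case-sensitive, so 'Setosa' finds nothing locally either.
misses <- iris_tbl %>% filter(str_detect(Species, "Setosa"))

# With ignore_case = TRUE, stringr matches all the setosa rows.
hits <- iris_tbl %>% filter(str_detect(Species, fixed("SETOSA", ignore_case = TRUE)))

nrow(misses)
nrow(hits)
```

So the modifier itself behaves as documented; the difference only shows up once the expression has to be turned into SQL.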

Does anyone know a workaround?

Answer 1

It doesn't work because, as you can see in the dbplyr source, when using the PostgreSQL backend dbplyr relies on the case-sensitive function STRPOS to translate str_detect into SQL.
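
One way to check the translation yourself, without a live database, is to render the SQL against a simulated backend. This is a minimal sketch: it uses dbplyr's lazy_frame() and simulate_postgres(), neither of which appears in the question, and iris_sim is a made-up stand-in for the real table. The exact SQL text depends on the installed dbplyr version, so it may not show STRPOS on newer releases.

```r
library(dplyr)
library(dbplyr)

# Simulated PostgreSQL table: an assumption for illustration, no connection needed.
iris_sim <- lazy_frame(Species = character(), con = simulate_postgres())

# The same filter as in the question, built lazily.
query <- iris_sim %>% filter(str_detect(Species, "Setosa"))

# Render the SQL that dbplyr would send to PostgreSQL and print it.
sql_text <- as.character(sql_render(query))
cat(sql_text, "\n")
```

On a real connection, piping the query into show_query() instead of collect() prints the same kind of output.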

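Some possible workarounds:

1) filter(str_detect(tolower(myvar), tolower(pattern))) will likely work with any relational database.

2) filter(myvar %~*% pattern) relies on ~*, the PostgreSQL operator for case-insensitive POSIX regular expression matching.

3) filter(myvar %ilike% paste0("%", pattern, "%")) relies on ILIKE, the case-insensitive, Postgres-specific version of the standard LIKE operator.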
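
The three workarounds can be sketched against a simulated PostgreSQL backend to see what SQL each one produces. The names iris_sim, pattern, pattern_lower and pattern_like are hypothetical, and lazy_frame() and simulate_postgres() are used here only so the example runs without a database. As a small variation on the forms above, the pattern is lowercased and wrapped in % wildcards in R before it reaches the query, which should be equivalent for a constant pattern.

```r
library(dplyr)
library(dbplyr)

# Simulated PostgreSQL table (an assumption for illustration; no live database).
iris_sim <- lazy_frame(Species = character(), con = simulate_postgres())

pattern <- "SETOSA"                        # hypothetical search term
pattern_lower <- tolower(pattern)          # lowercased once, in R
pattern_like <- paste0("%", pattern, "%")  # wrapped in LIKE wildcards, in R

# 1) Lowercase the column in SQL and match it against the lowercased pattern.
q_lower <- iris_sim %>% filter(str_detect(tolower(Species), pattern_lower))

# 2) dbplyr passes unknown %op% infix operators through to SQL, giving ~*.
q_regex <- iris_sim %>% filter(Species %~*% pattern)

# 3) The same pass-through mechanism, giving ILIKE.
q_ilike <- iris_sim %>% filter(Species %ilike% pattern_like)

sql_lower <- as.character(sql_render(q_lower))
sql_regex <- as.character(sql_render(q_regex))
sql_ilike <- as.character(sql_render(q_ilike))

# Print the three generated statements for inspection.
cat(sql_lower, sql_regex, sql_ilike, sep = "\n\n")
```

Options 2 and 3 are tied to PostgreSQL, while option 1 trades portability for a LOWER() call on every row, which may matter on large tables without a matching expression index.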
