Why doesn't sqldf function see the condition?

I am using the sqldf function in RStudio for data manipulation. I have run into a problem: sqldf ignores the condition I set in the query. For example, the code

d2 <- sqldf("select a.'n1.0', a.date, a.tot_cap*100/a.'n1.0' as afasf from data a where a.date < '2013-03-01'") 

gives exactly the same result as the code

d2 <- sqldf("select a.'n1.0', a.date, a.tot_cap*100/a.'n1.0' as afasf from data a").

Using single quotes, double quotes, or no quotes at all makes no difference.

Does anyone know where the problem is and how to fix it?

Thanks in advance.

This is the output of dput(data2), where data2 is the first 40 rows of the initial data set.

structure(list(REGN = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), date = structure(c(14761, 
14791, 14822, 14853, 14883, 14914, 14944, 14975, 15006, 15034, 
15065, 15095, 15126, 15156, 15187, 15218, 15248, 15279, 15309, 
15706, 15737, 15765, 15796, 15826, 15857, 15887, 15918, 15949, 
15979, 16010, 16040, 16071, 16102, 16130, 16161, 16191, 16222, 
16252, 16283, 16314), class = "Date"), tot_cap = structure(c(3.29680129097603e-316, 
3.27881038454837e-316, 3.31569035934127e-316, 3.42544269478722e-316, 
3.46449744773999e-316, 3.76857311386686e-316, 3.83375272419446e-316, 
3.89781752479879e-316, 3.88106524094525e-316, 3.91119504385185e-316, 
3.54278081534629e-316, 3.66904833254234e-316, 3.84160896084211e-316, 
3.83063042693901e-316, 3.87586035817945e-316, 3.98758302742101e-316, 
4.04557250040469e-316, 4.19023427922167e-316, 4.27870651560933e-316, 
5.55371435659506e-316, 5.45611929687006e-316, 5.73639651252866e-316, 
5.74625569130456e-316, 5.74576631928235e-316, 5.79667637503333e-316, 
5.86047635645149e-316, 5.66957630781748e-316, 6.16857297583704e-316, 
6.25638237375425e-316, 6.32633446059442e-316, 6.54992194176386e-316, 
6.44650520772079e-316, 6.41496840466797e-316, 6.51734725500882e-316, 
6.52890122716964e-316, 6.49014508749322e-316, 6.47399926921995e-316, 
6.49043322657787e-316, 6.47938176858544e-316, 6.65456810876004e-316
), class = "integer64"), bas_cap = structure(c(0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 5.24913731265079e-316, 5.71723269425777e-316, 
5.71723269425777e-316, 5.72079658738749e-316, 5.71723269425777e-316, 
5.71723269425777e-316, 5.71723269425777e-316, 5.84030513832873e-316
), class = "integer64"), osn_cap = structure(c(0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 5.24913731265079e-316, 5.71723269425777e-316, 
5.71723269425777e-316, 5.72079658738749e-316, 5.71723269425777e-316, 
5.71723269425777e-316, 5.71723269425777e-316, 5.84030513832873e-316
), class = "integer64"), n1.0 = c(NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, 14.13, 13.91, 13.74, 13.25, 
13.11, 13.59, 13.07, 13.06), n1.1 = c(NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 11.65, 12.3, 12.11, 11.75, 
11.65, 12.04, 11.61, 11.53), n1.2 = c(NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 11.65, 12.3, 12.12, 11.77, 
11.66, 12.06, 11.62, 11.54), n1 = c(13.96, 13.66, 15.07, 15.75, 
15.49, 15.87, 15.7, 16.32, 16.17, 16.67, 14.97, 15.06, 15.22, 
15.02, 14.68, 14.42, 13.5, 13.29, 13.1, 13.25, 12.53, 13.28, 
13.35, 13.42, 13.67, 14.14, 13.1, 14.13, 14.13, 14.55, 14.53, 
14.51, NA, NA, NA, NA, NA, NA, NA, NA)), row.names = c(NA, 40L
), class = "data.frame")

There are several problems here:

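  • Single quotes are for quoting constants (string literals). Use double quotes to quote variable (column) names.

  • SQLite has no Date type, so the date column is sent to it as a number (days since the UNIX epoch), and the statement then tries to compare that number with a string. In SQLite a numeric value always compares as less than a text value, so the condition is true for every row and nothing is filtered out.

  • Although it is not wrong, you don't really need the alias a: there is only one data frame, so there is no ambiguity about what is being referred to.

  • The question refers to data2, but the R code uses data. We have changed the R code to use data2.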

For more information, see the links provided in the comments under the question.
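To make the second point concrete, here is a small base R sketch (not part of the original answer) that shows the number a Date value carries. The variable names cutoff and days are only illustrative.

```r
# A Date is stored as the number of days since 1970-01-01.
cutoff <- as.Date("2013-03-01")
days <- as.numeric(cutoff)
print(days)  # 15765

# Converting the number back recovers the same date.
print(as.Date(days, origin = "1970-01-01"))
```

This is the kind of value SQLite receives for the date column, which is why comparing it with the string '2013-03-01' does not behave like a date comparison.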

SQLite

With the default SQLite backend, we preface sqldf with fn$ so that anything between backticks is run in R and the backticked expression is replaced with the result of running it.

library(sqldf)

# the backticked expression is evaluated in R before the query is sent to SQLite
d2 <- fn$sqldf("select \"n1.0\", date, tot_cap * 100 / \"n1.0\" as afasf 
  from data2 
  where date < `as.Date('2013-03-01')`") 
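If you would rather not rely on fn$, one possible alternative (a sketch, not from the original answer) is to compute the numeric cutoff yourself and paste it into the query string with sprintf. The names cutoff and query are hypothetical, and the final call only runs if sqldf is installed and data2 exists.

```r
# Days since 1970-01-01 for the cutoff date, as an integer.
cutoff <- as.integer(as.Date("2013-03-01"))

# Build the query with the numeric cutoff substituted for %d.
query <- sprintf(
  paste('select "n1.0", date, tot_cap * 100 / "n1.0" as afasf',
        'from data2',
        'where date < %d'),
  cutoff
)
print(query)

# Run it only when sqldf and data2 are actually available.
if (requireNamespace("sqldf", quietly = TRUE) && exists("data2")) {
  d2 <- sqldf::sqldf(query)
}
```

Because the comparison is now number against number, SQLite filters the rows as intended.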

H2

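Alternatively, use the H2 backend. (Like SQLite, the H2 database is included in the R driver, so you don't have to install it separately, but you do need to make sure Java is installed. Fortunately that is very easy, since it has an automated installer.) This database does have a date type and is able to compare such dates with properly formatted strings that represent dates.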

library(RH2)  # <----------note

d3 <- fn$sqldf("select \"n1.0\", date, tot_cap * 100 / \"n1.0\" as afasf 
  from data2
  where date < '2013-03-01'") 

If you want to go back to using SQLite, make sure you detach RH2 from the search path.
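A minimal sketch of switching back, assuming RH2 was attached with library(RH2) as above. Setting the sqldf.driver option is an additional, optional step that is not in the original answer; it asks sqldf to use SQLite explicitly.

```r
# Detach RH2 if it is currently on the search path.
if ("package:RH2" %in% search()) {
  detach("package:RH2", unload = TRUE)
}

# Optionally request the SQLite backend explicitly for later sqldf calls.
options(sqldf.driver = "SQLite")
```

After this, sqldf no longer finds RH2 on the search path, and the date column is again passed to SQLite as a number.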