Why are NULL values coerced to a negative integer when using DBI in R?

I am using the following script to connect to a SQLite database from R with DBI:

library(DBI)

db <- "/Path/To/Database/Foo.db"
obsTable <- "obs"
obsQryStr <- paste("select * from", obsTable)

# Open the connection, read the whole table, then close the connection
con <- dbConnect(RSQLite::SQLite(), dbname = db)
importedData <- dbGetQuery(con, obsQryStr)
dbDisconnect(con)

The table in question has many integer-valued columns. Here is the output of head(importedData, 12):

               time subject encounter location temp          hr sbp         dbp rr        spo2 o2Log avpu gcs concern
1  2010-01-01 08:00       2         1        1   NA          97 113          66 12         100     1    A  15       0
2  2010-01-01 08:15       2         1        1 36.2          95 110          62 12         100     1    A  15       0
3  2010-01-01 08:30       2         1        1 36.2          84  90          61 12         100     1    A  15       0
4  2010-01-01 08:45       2         1        1 36.2          80  96          55 12         100     1    A  15       0
5  2010-01-01 09:00       2         1        1 36.2          77  88          51 12         100     0    A  15       0
6  2010-01-01 09:15       2         1        1 36.3          75  91          50 12         100     0    A  15       0
7  2010-01-01 09:30       2         1        1 36.3          76  92          52 12         100     1    A  15       0
8  2010-01-01 10:00       2         1        1 36.4          73  91          52 12         100     0    A  15       0
9  2010-01-01 10:30       2         1        1 36.5          71  91          51 12         100     1    A  15       0
10 2010-01-01 11:30       2         1        1 36.6          69  92          53 12         100     1    A  15       0
11 2010-01-01 12:30       2         1        1 36.6          76 118          63 14         100     1    A  15       0
12 2010-01-01 13:00       2         1        1   NA -2147483648  NA -2147483648 NA -2147483648     1    A  15       0

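As you can see in row 12, in some columns the NULL values have been replaced with -2147483648 instead of NA. Why does this happen, and how can I stop it?

The SQL for the corresponding rows is: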

CREATE TABLE IF NOT EXISTS `sim` (
    `time`  TEXT,
    `subject`   INTEGER,
    `encounter` INTEGER,
    `location`  INTEGER,
    `temp`  REAL,
    `hr`    INTEGER,
    `sbp`   INTEGER,
    `dbp`   INTEGER,
    `rr`    INTEGER,
    `spo2`  INTEGER,
    `o2Log` INTEGER,
    `avpu`  TEXT,
    `gcs`   INTEGER,
    `concern`   INTEGER
);
INSERT INTO `sim` (time,subject,encounter,location,temp,hr,sbp,dbp,rr,spo2,o2Log,avpu,gcs,concern) VALUES ('2010-01-01 08:00',2,1,1,NULL,97,113,66,12,100,1,'A',15,0);
INSERT INTO `sim` (time,subject,encounter,location,temp,hr,sbp,dbp,rr,spo2,o2Log,avpu,gcs,concern) VALUES ('2010-01-01 08:15',2,1,1,36.2,95,110,62,12,100,1,'A',15,0);
INSERT INTO `sim` (time,subject,encounter,location,temp,hr,sbp,dbp,rr,spo2,o2Log,avpu,gcs,concern) VALUES ('2010-01-01 08:30',2,1,1,36.2,84,90,61,12,100,1,'A',15,0);
INSERT INTO `sim` (time,subject,encounter,location,temp,hr,sbp,dbp,rr,spo2,o2Log,avpu,gcs,concern) VALUES ('2010-01-01 08:45',2,1,1,36.2,80,96,55,12,100,1,'A',15,0);
INSERT INTO `sim` (time,subject,encounter,location,temp,hr,sbp,dbp,rr,spo2,o2Log,avpu,gcs,concern) VALUES ('2010-01-01 09:00',2,1,1,36.2,77,88,51,12,100,0,'A',15,0);
INSERT INTO `sim` (time,subject,encounter,location,temp,hr,sbp,dbp,rr,spo2,o2Log,avpu,gcs,concern) VALUES ('2010-01-01 09:15',2,1,1,36.3,75,91,50,12,100,0,'A',15,0);
INSERT INTO `sim` (time,subject,encounter,location,temp,hr,sbp,dbp,rr,spo2,o2Log,avpu,gcs,concern) VALUES ('2010-01-01 09:30',2,1,1,36.3,76,92,52,12,100,1,'A',15,0);
INSERT INTO `sim` (time,subject,encounter,location,temp,hr,sbp,dbp,rr,spo2,o2Log,avpu,gcs,concern) VALUES ('2010-01-01 10:00',2,1,1,36.4,73,91,52,12,100,0,'A',15,0);
INSERT INTO `sim` (time,subject,encounter,location,temp,hr,sbp,dbp,rr,spo2,o2Log,avpu,gcs,concern) VALUES ('2010-01-01 10:30',2,1,1,36.5,71,91,51,12,100,1,'A',15,0);
INSERT INTO `sim` (time,subject,encounter,location,temp,hr,sbp,dbp,rr,spo2,o2Log,avpu,gcs,concern) VALUES ('2010-01-01 11:30',2,1,1,36.6,69,92,53,12,100,1,'A',15,0);
INSERT INTO `sim` (time,subject,encounter,location,temp,hr,sbp,dbp,rr,spo2,o2Log,avpu,gcs,concern) VALUES ('2010-01-01 12:30',2,1,1,36.6,76,118,63,14,100,1,'A',15,0);
INSERT INTO `sim` (time,subject,encounter,location,temp,hr,sbp,dbp,rr,spo2,o2Log,avpu,gcs,concern) VALUES ('2010-01-01 13:00',2,1,1,NULL,NULL,NULL,NULL,NULL,NULL,1,'A',15,0);

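One clue I have is that the affected columns are, for some reason, being converted to numeric from INTEGER, their declared type in the SQLite database. sapply(importedData, class) produces the following output: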

    time     subject   encounter    location        temp          hr         sbp         dbp          rr        spo2 
"character"   "integer"   "integer"   "integer"   "numeric"   "numeric"   "integer"   "numeric"   "integer"   "numeric" 
      o2Log        avpu         gcs     concern 
  "integer" "character"   "integer"   "integer"

This looks like a bug in RSQLite. It was reported here: https://github.com/r-dbi/RSQLite/issues/291

It was fixed in RSQLite 2.1.2: https://github.com/r-dbi/RSQLite/releases/tag/v2.1.2
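
If upgrading is not possible right away, the sketch below may help. Some background first: R stores NA_integer_ as the smallest 32-bit integer, whose numeric value is -2147483648, so the symptom is consistent with integer NAs being copied into a double column without an NA check. That is my reading of the output, not something taken from the RSQLite source. The data frame demo is a hypothetical stand-in for importedData, and has_sentinel and repair_sentinel are illustrative helper names. The repair assumes no column legitimately contains -2147483648.

```r
# NA_integer_ has only the sign bit set: the bit pattern of -2^31 = -2147483648.
na_bits <- which(intToBits(NA_integer_) == as.raw(1))

# Hypothetical stand-in for the last two rows of a few columns of importedData.
demo <- data.frame(
  temp = c(36.6, NA),
  hr   = c(76, -2147483648),    # numeric column carrying the bogus value
  sbp  = c(118L, NA_integer_)   # integer column with a proper NA
)

# TRUE for double columns that contain the sentinel value.
has_sentinel <- function(x, sentinel = -2147483648) {
  is.double(x) && any(x == sentinel, na.rm = TRUE)
}

# Replace the sentinel with NA in every affected double column.
# Assumption: -2147483648 never occurs as a genuine measurement.
repair_sentinel <- function(df, sentinel = -2147483648) {
  for (nm in names(df)) {
    if (has_sentinel(df[[nm]], sentinel)) {
      col <- df[[nm]]
      col[!is.na(col) & col == sentinel] <- NA_real_
      df[[nm]] <- col
    }
  }
  df
}

affected <- names(demo)[vapply(demo, has_sentinel, logical(1))]
repaired <- repair_sentinel(demo)
print(affected)
print(repaired)

# Optional: report whether the installed RSQLite predates the fixed release.
if (requireNamespace("RSQLite", quietly = TRUE) &&
    packageVersion("RSQLite") < "2.1.2") {
  message("RSQLite is older than 2.1.2; consider install.packages(\"RSQLite\")")
}
```

With the real data, repair_sentinel(importedData) would be the equivalent call. The repaired columns stay numeric rather than integer, so convert them with as.integer() afterwards if the storage type matters.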