Storing a vector in a relational database

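My calculations produce results that are stored in vectors. Because the calculations are run repeatedly, I will end up with several vectors to store in my database. Each row of my database table data_storage should hold one result vector.

So far I have found that my table needs a BLOB column and that the vectors have to be serialized, as described in Storing R Objects in a relational database. In that source, David Josipovic's answer seems to fit my case very well, but I cannot get the code right. See the data input in my code...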

EDIT_1: When using dbGetQuery(), dbExecute() or dbBind(), I get this error message: Error in serialize(res_1.v) : argument "connection" is missing, with no default.

What matters to me is how to get the result vectors into the database and how to get them out again. I hope you can help me.

Thank you very much!

My code:

# needed packages for this script 
# install.packages("sqldf")  # install this package if necessary
library(sqldf)

# connection to the database
db=dbConnect(SQLite(), ":memory:")

# creation of the database table
dbSendQuery(conn = db,
    "CREATE TABLE IF NOT EXISTS data_storage
    (ID INTEGER,
    data BLOB,
    PRIMARY KEY (ID))")

# calculations
# ....

# the first result vector
res_1.v=seq(from=1,to=10,by=1)

# the second result vector
res_2.v=seq(from=5,to=7,by=0.1)

# filling the data_storage table with the result vectors (in two rows)
### an error occurs here
dbGetQuery(db, 'INSERT INTO data_storage VALUES (1,:blob)', params =  list(blob = list(serialize(res_1.v))))  # the first row with res_1.v
dbGetQuery(db, 'INSERT INTO data_storage VALUES (2,:blob)', params = list(blob = list(serialize(res_2.v))))  # the second row with res_2.v

# getting the serialized result vector out of the database
# and converting them again to the result vector res_1.v respectively res_2.v
#######################################
### the still missing code sequence ###
#######################################

# close the connection
dbDisconnect(db) 

There may be a problem with your syntax. This RSQLite documentation page uses dbSendStatement to build a prepared statement for executing DML commands:

rs <- dbSendStatement(db, 'INSERT INTO data_storage VALUES (1, :blob)')
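# serialize() needs connection = NULL to return a raw vector;
# the raw vector is wrapped in list() so it is bound as a single BLOB value
dbBind(rs, params = list(blob = list(serialize(res_1.v, NULL))))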
dbGetRowsAffected(rs)
dbClearResult(rs)

This answer assumes that the API knows how to bind a BLOB into the statement correctly.
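The error quoted in EDIT_1 appears to come from base R's serialize() rather than from the database layer: its second argument, connection, has no default. Below is a minimal sketch, using only base R and the res_1.v vector from the question, of how passing NULL makes serialize() return a raw vector that can then be stored as a BLOB. The variable names blob, restored and err are only illustrative.

```r
# Base R only: serialize() requires its second argument, `connection`.
res_1.v <- seq(from = 1, to = 10, by = 1)

# With connection = NULL, serialize() returns the bytes as a raw vector
# instead of writing them to a connection.
blob <- serialize(res_1.v, connection = NULL)
is.raw(blob)                   # TRUE

# unserialize() reverses the step and restores the original vector.
restored <- unserialize(blob)
identical(res_1.v, restored)   # TRUE

# Leaving out the connection argument raises an error; capture its message.
err <- tryCatch(serialize(res_1.v), error = function(e) conditionMessage(e))
print(err)
```

The raw vector in blob is what gets bound to the :blob placeholder, wrapped in list() so that it counts as one value.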

Thanks to the two commenters, Marius and Tim Biegeleisen, and some time spent on trial and error, I found a solution...

The first part of the code is unchanged:

# needed packages for this script 
# install.packages("sqldf")  # install this package if necessary
library(sqldf)

# connection to the database
db=dbConnect(SQLite(), ":memory:")

# creation of the database table
dbSendQuery(conn = db,
    "CREATE TABLE IF NOT EXISTS data_storage
    (ID INTEGER,
    data BLOB,
    PRIMARY KEY (ID))")

# calculations
# ....

# the first result vector
res_1.v=seq(from=1,to=10,by=1)

# the second result vector
res_2.v=seq(from=5,to=7,by=0.1)

Now the second part of the code, where I changed and added some lines and appended the missing ones...

# filling the data_storage table with the result vectors (in two rows)
### use dbExecute() here, as suggested by Marius
### in list(serialize(res_1.v,NULL)) the connection argument NULL is essential:
### it makes serialize() return a raw vector
dbExecute(db, 'INSERT INTO data_storage VALUES (1,:blob)', params = list(blob = list(serialize(res_1.v,NULL))))  # the first row with res_1.v
dbExecute(db, 'INSERT INTO data_storage VALUES (2,:blob)', params = list(blob = list(serialize(res_2.v,NULL))))  # the second row with res_2.v

# reading out the content of table data_storage
dbReadTable(db,"data_storage")

# It's nearly the same - reading out the content of data_storage
dbGetQuery(db,'SELECT * FROM data_storage')
dbGetQuery(db,'SELECT * FROM data_storage')[,1]  # the content of the first column
dbGetQuery(db,'SELECT * FROM data_storage')[,2]  # the content of the second column - this is a BLOB

# get the result vector back with its original entries:
### unlist() the BLOB entry to obtain a raw vector,
### then unserialize() that raw vector
step_1=unlist(dbGetQuery(db,'SELECT * FROM data_storage')[1,2])  # here you must adjust the row index
step_2=unserialize(step_1)

# check for equality
### step_2 is res_1.v after conversion to a BLOB and back again,
### so identical() returns TRUE
identical(res_1.v,step_2)

# close the connection
dbDisconnect(db)
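The store and retrieve steps above can be wrapped in two small functions so that the row index does not have to be adjusted by hand. This is only a sketch: store_vector() and load_vector() are hypothetical helper names, not part of DBI or RSQLite, and it assumes the DBI and RSQLite packages are loaded directly instead of through sqldf.

```r
library(DBI)
library(RSQLite)

db <- dbConnect(SQLite(), ":memory:")
dbExecute(db, "CREATE TABLE IF NOT EXISTS data_storage
               (ID INTEGER PRIMARY KEY, data BLOB)")

# Hypothetical helper: serialize a vector and insert it under the given ID.
store_vector <- function(con, id, x) {
  dbExecute(
    con,
    "INSERT INTO data_storage (ID, data) VALUES (:id, :blob)",
    params = list(id = id, blob = list(serialize(x, NULL)))
  )
}

# Hypothetical helper: fetch the BLOB stored under one ID and unserialize it.
load_vector <- function(con, id) {
  row <- dbGetQuery(con, "SELECT data FROM data_storage WHERE ID = :id",
                    params = list(id = id))
  if (nrow(row) == 0) stop("no row with ID ", id)
  unserialize(row$data[[1]])
}

store_vector(db, 1, seq(from = 1, to = 10, by = 1))
store_vector(db, 2, seq(from = 5, to = 7, by = 0.1))

# Read a single vector back by its ID.
v2 <- load_vector(db, 2)

# Restore every stored vector at once, as a list named by ID.
all_rows <- dbGetQuery(db, "SELECT ID, data FROM data_storage ORDER BY ID")
all_vectors <- lapply(seq_len(nrow(all_rows)),
                      function(i) unserialize(all_rows$data[[i]]))
names(all_vectors) <- all_rows$ID

dbDisconnect(db)
```

Selecting by ID in the WHERE clause avoids reading the whole table just to pick one row, which is what the [1,2] indexing above does.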