Making an iterative query to an SQL database
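I am trying to query a MySQL database from R. The query iterates over a list, so it changes dynamically. Each query, built from one element of the list, usually returns several rows. The database I am using can be downloaded here: http://www.ghtorrent.org/msr14.html
In the end, all results should be collected into a single output that looks like this: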
pull_req_id,user,action,created_at
12359,arthurnn,opened,1380126837
12359,rafaelfranca,discussed,1380127245
12359,arthurnn,discussed,1380127676
...
My current code looks like this, but it does not work:
library(DBI)
library(RMySQL)
m <- dbDriver("MySQL");
con <- dbConnect(m, user='msr14', password='msr14', host='localhost', dbname='msr14');
all_rails_projects <- dbGetQuery(con, 'SELECT * FROM projects WHERE name = "rails";')
all_rails_prs <- dbGetQuery(con, 'SELECT id FROM pull_requests WHERE base_repo_id = 78852;')
out <- nrow(all_rails_prs)
names(out) <- as.list(all_rails_prs)
df <- c('pull_req_id', 'user', 'action', 'created_at')
out <- numeric(length(df))
names(out) <- df
for (i in nrow(all_rails_prs)) {
SQL <- paste("select user, action, created_at from
(
select prh.action as action, prh.created_at as created_at, u.login as user
from pull_request_history prh, users u
where prh.pull_request_id ='", all_rails_prs[i,], "'",
" and prh.actor_id = u.id
union
select ie.action as action, ie.created_at as created_at, u.login as user
from issues i, issue_events ie, users u
where ie.issue_id = i.id
and i.pull_request_id ='", all_rails_prs[i,], "'",
" and ie.actor_id = u.id
union
select 'discussed' as action, ic.created_at as created_at, u.login as user
from issues i, issue_comments ic, users u
where ic.issue_id = i.id
and u.id = ic.user_id
and i.pull_request_id ='", all_rails_prs[i,], "'",
"union
select 'reviewed' as action, prc.created_at as created_at, u.login as user
from pull_request_comments prc, users u
where prc.user_id = u.id
and prc.pull_request_id ='", all_rails_prs[i,], "'",
") as actions
order by created_at;", sep = "")
res <- dbGetQuery(con, SQL)
out[i] <- dbFetch(res, n = -1)
}
This produces the following error message:
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘dbFetch’ for signature ‘"data.frame"’
In addition: Warning message:
In mysqlExecStatement(conn, statement, ...) :
RS-DBI driver warning: (unrecognized MySQL field type 7 in column 2 imported as character)
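I have tried different variations, but they all result in some kind of error, so it seems I am simply not structuring the query correctly. Does anyone have any suggestions?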
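According to the documentation, dbGetQuery calls fetch by default if the query succeeds. res is therefore already a data frame (which is why dbFetch fails with signature "data.frame"), so you can put it directly into out without fetching the records again.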
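To make the difference concrete, here is a minimal sketch comparing the two routes. It assumes the RSQLite package is installed and uses an in-memory SQLite database as a stand-in for the MySQL connection; the events table and its contents are hypothetical toy data, not part of the msr14 schema.

```r
library(DBI)

# Assumption: RSQLite is installed. An in-memory SQLite database stands in
# for the MySQL connection used in the question.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical toy table, not part of the msr14 schema.
dbWriteTable(con, "events", data.frame(
  pull_request_id = c(1L, 1L, 2L),
  action = c("opened", "discussed", "opened"),
  stringsAsFactors = FALSE
))

query <- "SELECT action FROM events WHERE pull_request_id = 1"

# Route 1: dbGetQuery() sends the statement, fetches all rows and
# returns a data frame. Nothing is left to fetch afterwards.
direct <- dbGetQuery(con, query)

# Route 2: dbSendQuery() returns a result object, which is what
# dbFetch() expects. The result has to be cleared when done.
rs <- dbSendQuery(con, query)
manual <- dbFetch(rs, n = -1)
dbClearResult(rs)

dbDisconnect(con)
```

Both direct and manual end up as data frames holding the same rows. Calling dbFetch on direct would reproduce the error from the question, because a data frame is not a result object.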
Also, if you want to store the results in a data frame rather than a list, you could try:
# Get the results; dbGetQuery already returns a data frame
res <- dbGetQuery(con, SQL)
# If it is not NULL, add the pull request id as a column and rbind it to the out data frame
if (!is.null(res)) {
  out <- rbind(out, cbind(pull_req_id = rep(all_rails_prs[i, ], nrow(res)), res))
}
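Your for loop may also be wrong: for (i in nrow(all_rails_prs)) runs only once, with i set to the number of rows. You probably need for (i in 1:nrow(all_rails_prs))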
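Putting both points together, the loop could be sketched roughly as follows. This is only an illustration under stated assumptions: fetch_events is a hypothetical stand-in for dbGetQuery(con, SQL) so the example runs without a database, and the ids, users and actions are made-up toy values. It uses seq_len(nrow(...)) instead of 1:nrow(...), which behaves the same for non-empty input and also copes with zero rows.

```r
# Hypothetical stand-in for dbGetQuery(con, SQL): returns the events of one
# pull request as a data frame, possibly with zero rows. Toy data only.
fetch_events <- function(pr_id) {
  events <- data.frame(
    pull_req_id = c(1L, 1L, 3L),
    user = c("alice", "bob", "alice"),
    action = c("opened", "discussed", "opened"),
    stringsAsFactors = FALSE
  )
  events[events$pull_req_id == pr_id, c("user", "action"), drop = FALSE]
}

# Toy replacement for all_rails_prs; id 2 has no events.
all_prs <- data.frame(id = c(1L, 2L, 3L))

# for (i in nrow(all_prs)) would run once, with i = 3.
# seq_len() visits every row index instead.
results <- vector("list", nrow(all_prs))
for (i in seq_len(nrow(all_prs))) {
  pr_id <- all_prs$id[i]
  res <- fetch_events(pr_id)
  # Store only non-empty results, with the id prepended as a column.
  if (nrow(res) > 0) {
    results[[i]] <- cbind(pull_req_id = pr_id, res)
  }
}

# Combine everything once at the end; NULL entries are ignored by rbind.
out <- do.call(rbind, results)
print(out)
```

Collecting the pieces in a list and calling rbind once avoids growing a data frame inside the loop, which can become slow when there are many pull requests.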