Error with a function to retrieve data from a database
I am trying to fetch a FASTA file from the NCBI website, and I am using the following function:
getncbiseq <- function(accession){
  dbs <- c()
  for (i in 1:numdbs){
    db <- dbs[i]
    choosebank(db)
    resquery <- try(query(".tmpquery", paste("AC=", accession)),silent = TRUE)
    if (!(inherits(resquery, "try-error"))){
      queryname <- "query2"
      thequery <- paste("AC=",accession,sep="")
      query(`queryname`,`thequery`)
      # see if a sequence was retrieved:
      seq <- getSequence(query2$req[[1]])
      closebank()
      return(seq)
    }
    closebank()
  }
  print(paste("ERROR: accession",accession,"was not found"))
}
When I try to retrieve a sequence:
mydata <- getncbiseq("NC_001477")
Error in getSequence(query2$req[[1]]) : object 'query2' not found
Also, is there a better way to shorten this loop?
If I use
query('queryname','the query')
#or
query("queryname","thequery")
I get a different error:
Error in query("queryname", "thequery") :
invalid request:"unknown list at (^): \"(^)thequery\""
I think you meant to assign the result of the query() call to a variable named query2, but forgot to do so. Try this:
if (!(inherits(resquery, "try-error"))) {
  queryname <- "query2"
  thequery <- paste("AC=", accession, sep="")
  query2 <- query(queryname, thequery)
  # see if a sequence was retrieved:
  seq <- getSequence(query2$req[[1]])
  closebank()
  return(seq)
}
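As you mentioned, the rest of your code also has some quirks and issues that could be improved.
Update:
Here is a refactoring that uses sapply over the dbs vector instead of an explicit for loop (R users generally frown on the latter):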
processdbs <- function(x, y) {
  choosebank(x)
  resquery <- try(query(".tmpquery", paste("AC=", y)), silent = TRUE)
  if (!(inherits(resquery, "try-error"))) {
    queryname <- "query2"
    thequery <- paste("AC=", y, sep="")
    query2 <- query(queryname, thequery)
    # see if a sequence was retrieved:
    seq <- getSequence(query2$req[[1]])
    closebank()
    return(seq)
  }
  closebank()
}
getncbiseq <- function(accession) {
  dbs <- c("genbank","refseq","refseqViruses","bacterial")
  result <- sapply(dbs, processdbs, y=accession)
  closebank()
  print(paste("ERROR: accession",accession,"was not found"))
}
You may need to do some extra work to inspect the result vector and determine whether a sequence was retrieved from any of the databases.
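One possible way to do that check is sketched below in base R. It assumes the per-database function is adjusted to return NULL when the accession is not found, and that the results are collected with lapply rather than sapply, since sapply may simplify equal-length results into a matrix. The helper name first_hit and the sample data are hypothetical and are not part of seqinr.

```r
# Hypothetical helper: return the first non-NULL entry from a list of
# per-database results, or raise an error if every entry is NULL.
first_hit <- function(results, accession) {
  hits <- Filter(Negate(is.null), results)
  if (length(hits) == 0) {
    stop("accession ", accession, " was not found")
  }
  hits[[1]]
}

# Stand-in for lapply(dbs, processdbs, y = accession); NULL marks a
# database in which the accession was not found (an assumption).
results <- list(genbank = NULL,
                refseq = c("a", "g", "t", "t"),
                bacterial = NULL)
hit <- first_hit(results, "NC_001477")
```

With a helper like this, getncbiseq could return first_hit(result, accession) instead of printing the error message unconditionally.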
Thanks for your great help. I was stuck on this for a whole day. I finally got the following code running under Windows 10 and R 3.4.0 (32-bit):
getncbiseq <- function(accession)
{
  require("seqinr") # this function requires the SeqinR R package
  # first find which ACNUC database the accession is stored in:
  dbs <- c("genbank","refseq","refseqViruses","bacterial")
  numdbs <- length(dbs)
  for (i in 1:numdbs)
  {
    db <- dbs[i]
    choosebank(db)
    # check if the sequence is in ACNUC database 'db':
    resquery <- try(query(".tmpquery", paste("AC=", accession)), silent = TRUE)
    if (!(inherits(resquery, "try-error"))) {
      queryname <- "query2"
      thequery <- paste("AC=", accession, sep="")
      query2 <- query(queryname, thequery)
      # see if a sequence was retrieved:
      seq <- getSequence(query2$req[[1]])
      closebank()
      return(seq)
    }
    closebank()
  }
  print(paste("ERROR: accession",accession,"was not found"))
}