Issue with parsing publication data from PubMed with Entrez
I am trying to use Entrez to import publication data into a database. The search part works fine, but when I try to parse:
from Bio import Entrez
def create_publication(pmid):
    handle = Entrez.efetch("pubmed", id=pmid, retmode="xml")
    records = Entrez.parse(handle)
    item_data = records.next()
    handle.close()
...I get the following error:
  File "/venv/lib/python2.7/site-packages/Bio/Entrez/Parser.py", line 296, in parse
    raise ValueError("The XML file does not represent a list. Please use Entrez.read instead of Entrez.parse")
ValueError: The XML file does not represent a list. Please use Entrez.read instead of Entrez.parse
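This code was working until a few days ago. Any idea what might be going wrong here?
Also, looking at the source code (http://biopython.org/DIST/docs/api/Bio.Entrez-pysrc.html) and trying to follow the example listed there gives the same error: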
from Bio import Entrez
Entrez.email = "Your.Name.Here@example.org"
handle = Entrez.efetch("pubmed", id="19304878,14630660", retmode="xml")
records = Entrez.parse(handle)
for record in records:
    print(record['MedlineCitation']['Article']['ArticleTitle'])
handle.close()
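As noted in other comments and in the GitHub issue, the problem is caused by an intentional change made by the NCBI Entrez Utilities developers. As Jhird describes in that issue, you can change your code to the following: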
from Bio import Entrez
Entrez.email = "Your.Name.Here@example.org"
handle = Entrez.efetch("pubmed", id="19304878,14630660", retmode="xml")
records = Entrez.read(handle) # Difference here
records = records['PubmedArticle'] # New line here
for record in records:
    print(record['MedlineCitation']['Article']['ArticleTitle'])
handle.close()
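For the single-PMID function from the question, the same change could be sketched roughly as follows. This is only an illustration: the helper name `first_article`, the `email` parameter, and the choice to return the first article (or `None` when none comes back) are assumptions, since the original function body is not shown in full.

```python
def first_article(parsed):
    """Return the first article from a result shaped like Entrez.read() output.

    Hypothetical helper: returns None when the 'PubmedArticle' list is empty.
    """
    articles = parsed["PubmedArticle"]
    return articles[0] if articles else None


def create_publication(pmid, email="Your.Name.Here@example.org"):
    # Imported here so first_article() can be used without Biopython installed.
    from Bio import Entrez

    Entrez.email = email
    handle = Entrez.efetch("pubmed", id=pmid, retmode="xml")
    try:
        # read() instead of parse(): the response is no longer a top-level list
        parsed = Entrez.read(handle)
    finally:
        # Close the handle even if parsing raises
        handle.close()
    return first_article(parsed)
```

Indexing into `parsed['PubmedArticle']` replaces the earlier `records.next()` call, and the `try`/`finally` just makes sure the handle is closed if `Entrez.read` fails.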