Missing ',' in line when Biopython reads a nexus tree

I want to edit a tree in nexus format that I obtained from BEAST2 treeannotator. I usually use the Phylo module from Biopython for this kind of work, but Phylo.read(r"filename.tree", "nexus") raises the following exception:

---------------------------------------------------------------------------
NexusError                                Traceback (most recent call last)
Input In [29], in <cell line: 1>()
----> 1 Phylo.read(r"filename.tree", "nexus")

File ~\miniconda3\lib\site-packages\Bio\Phylo\_io.py:60, in read(file, format, **kwargs)
     58 try:
     59     tree_gen = parse(file, format, **kwargs)
---> 60     tree = next(tree_gen)
     61 except StopIteration:
     62     raise ValueError("There are no trees in this file.") from None

File ~\miniconda3\lib\site-packages\Bio\Phylo\_io.py:49, in parse(file, format, **kwargs)
     34 """Parse a file iteratively, and yield each of the trees it contains.
     35 
     36 If a file only contains one tree, this still returns an iterable object that
   (...)
     46 
     47 """
     48 with File.as_handle(file) as fp:
---> 49     yield from getattr(supported_formats[format], "parse")(fp, **kwargs)

File ~\miniconda3\lib\site-packages\Bio\Phylo\NexusIO.py:40, in parse(handle)
     32 def parse(handle):
     33     """Parse the trees in a Nexus file.
     34 
     35     Uses the old Nexus.Trees parser to extract the trees, converts them back to
   (...)
     38     eventually change Nexus to use the new NewickIO parser directly.)
     39     """
---> 40     nex = Nexus.Nexus(handle)
     42     # NB: Once Nexus.Trees is modified to use Tree.Newick objects, do this:
     43     # return iter(nex.trees)
     44     # Until then, convert the Nexus.Trees.Tree object hierarchy:
     45     def node2clade(nxtree, node):

File ~\miniconda3\lib\site-packages\Bio\Nexus\Nexus.py:668, in Nexus.__init__(self, input)
    665 self.options["gapmode"] = "missing"
    667 if input:
--> 668     self.read(input)
    669 else:
    670     self.read(DEFAULTNEXUS)

File ~\miniconda3\lib\site-packages\Bio\Nexus\Nexus.py:718, in Nexus.read(self, input)
    716     break
    717 if title in KNOWN_NEXUS_BLOCKS:
--> 718     self._parse_nexus_block(title, contents)
    719 else:
    720     self._unknown_nexus_block(title, contents)

File ~\miniconda3\lib\site-packages\Bio\Nexus\Nexus.py:759, in Nexus._parse_nexus_block(self, title, contents)
    757 for line in block.commandlines:
    758     try:
--> 759         getattr(self, "_" + line.command)(line.options)
    760     except AttributeError:
    761         raise NexusError("Unknown command: %s " % line.command) from None

File ~\miniconda3\lib\site-packages\Bio\Nexus\Nexus.py:1144, in Nexus._translate(self, options)
   1142         break
   1143     elif c != ",":
-> 1144         raise NexusError("Missing ',' in line %s." % options)
   1145 except NexusError:
   1146     raise

NexusError: Missing ',' in line 1 AB298157.1_2015_-7.9133750332192605_114.8086828279248, 2 AB298158.1_2007_-8.41698974207…

Calling Nexus.read(Nexus(), input=r"filename.tree") gives the same result. Can anyone help? I can't work out what causes this error, because the nexus file looks correct.

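The reason is that Biopython can't read a nexus tree that is split into a Translate block and a newick tree whose tips refer to the Translate keys. So the file has to be converted beforehand into a form where the full taxon names are written directly into the tree (as below).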

Begin trees;
    tree TREE1 = (((your,tree),(in,(the, newick))),format);
End;
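
The conversion described above can be scripted. What follows is only a minimal sketch in plain Python, not part of Biopython: the function name expand_translate is hypothetical, and it assumes that the Translate command starts on its own line, that taxon names contain no commas, and that each tree sits on a single line (as in typical treeannotator output). Annotations in square brackets are left untouched.

```python
import re


def expand_translate(nexus_text):
    """Return nexus_text with Translate keys replaced by full taxon names."""
    # Locate the Translate command (assumed to start a line and end at the first ';').
    match = re.search(r"(?ims)^[ \t]*translate\b(.*?);[ \t]*\n?", nexus_text)
    if match is None:
        return nexus_text  # nothing to expand

    # Build the key -> name table; assumes names contain no commas.
    table = {}
    for entry in match.group(1).split(","):
        parts = entry.split(None, 1)
        if len(parts) == 2:
            table[parts[0]] = parts[1].strip()

    # Drop the Translate command itself.
    text = nexus_text[:match.start()] + nexus_text[match.end():]

    def swap_label(m):
        # Replace the label if it is a Translate key, otherwise keep it.
        return m.group(1) + table.get(m.group(2), m.group(2))

    def expand_newick(m):
        # Split off [...] comments so BEAST annotations are left untouched.
        pieces = re.split(r"(\[[^\]]*\])", m.group(2))
        for i, piece in enumerate(pieces):
            if not piece.startswith("["):
                # A tip label directly follows '(' or ','.
                pieces[i] = re.sub(
                    r"([(,]\s*)([^\s:,()\[\];]+)", swap_label, piece
                )
        return m.group(1) + "".join(pieces)

    # Rewrite every 'tree NAME = ...' line (each tree assumed to be on one line).
    return re.sub(r"(?im)^([ \t]*tree\s+\S+\s*=\s*)(.*)$", expand_newick, text)
```

The returned text can be written to a new file and passed to Phylo.read(..., "nexus"). Whether that file then parses still depends on the characters in the taxon names themselves.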

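P.S. The newick format allows labels to be enclosed in quotes, and some programs or scripts add them to names that contain ambiguous characters. In later phylogenetic analyses, however, such quotes can cause exceptions, for example in BEAST. I'd advise being careful with this.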