Biopython phylogenetic tree: edit labels in SVG file
I am trying to add full-length labels to my phylogenetic tree using the Python library Biopython and Phylo.draw().
I import my file in Newick format, draw the tree, and then save it:
from Bio import Phylo
import pylab
f = 'path/to/my/file'
tree = Phylo.read(f, 'newick')
tree.ladderize()
Phylo.draw(tree, do_show=False)
pylab.axis("off")
pylab.savefig("tree2.svg",format='svg', bbox_inches='tight', dpi=300)
The problem is that Phylo.draw cuts the labels off at 40 characters.
I can display the full-length labels with this loop:
for leaf in tree.get_terminals():
    print(leaf.name)
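I read the source code and the documentation, but I could not find how this function truncates long labels, or how to replace the truncated labels with the full-length ones. I saw that there are the options label_func=str and branch_labels=None, but I don't know how to use them.
Data you can copy and paste into a text file to test: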
(GSCHCT00002541001_iso_:1.36551607731154023284,((((tr_B6K9U3_B6K9U3_TOXGO_1_4alphaglucanbranching_enzyme_OS=Toxoplasma_gondii_GN=TGVEG_316520_PE=4_SV=1_:0.00000121823973671653,tr_Q5IXJ1_Q5IXJ1_TOXGO_Putative_1_4alphaglucan_branching_enzyme_1_OS=Toxoplasma_gondii_PE=4_SV=1_:0.00000121823973671653)100:0.08309756202092893895,NCLIV_058970___organism=Neospora_caninum_Liverpool___product=GlgB_EC_2_4_1_18_related___location=FR82339249768214985526___length=910___sequence_SO=chromosome___SO=protein_codingLength=910_:0.10501488554905853701)100:2.09488643049910772120,((sp_D2WL32_GLGB3_ARATH_1_4alphaglucanbranching_enzyme_3_chloroplastic/amyloplastic_OS=Arabidopsis_thaliana_GN=SBE3_PE=1_SV=1_:0.00000793499571061110,sp_D2WL323_GLGB3_ARATH_Isoform_3_of_1_4alphaglucanbranching_enzyme_3_chloroplastic/amyloplastic_OS=Arabidopsis_thaliana_GN=SBE3_:0.01062473346874111095)100:1.22585344127521778113,(((((TVAG_276310___Trichomonas_vaginalis_G3___starch_branching_enzyme_II_putative___protein___length=671_:0.26096529219513048270,TVAG_453180___Trichomonas_vaginalis_G3___amylase_putative_=_starch_branching_enzyme___protein___length=671_:0.36958750954860625226)100:0.41842557987087769522,(((Query_116761_1_4alphaglucan_branching_enzyme_Amphimedon_queenslandica__:0.00000121823973671653,jgi_Monbr1_17492_estExt_gwp_gw1_C_30090_1_4alphaglucan_branching_enzyme_Monosiga_brevicollis_MX1__:0.00000121823973671653)100:0.62703546565975865068,gi_602378_gb_AAB64488_1__1_4alphaglucan_branching_enzyme_Saccharomyces_cerevisiae__:0.53686309075793836598)70:0.13386580507453740840,sp_Q555Q9_GLGB_DICDI_1_4alphaglucanbranching_enzyme_OS=Dictyostelium_discoideum_GN=glgB_PE=3_SV=1_:0.47890669361866494702)70:0.10263562958656322066)60:0.09627472451982788115,((DHA2_15823___Giardia_Assemblage_A2_isolate_DH___1_4alphaglucan_branching_enzyme___protein___length=790_:1.38033367488971192572,(gi_403355152_gb_EJY77145_1_putative_1_4alphaglucan_branching_enzyme_from_glycoside_hydrolase_family_GH13_Oxytricha_trifallax__:0.57062195
989872910307,gi_403359242_gb_EJY79278_1__putative_1_4alphaglucan_branching_enzyme_from_glycoside_hydrolase_family_GH13_Oxytricha_trifallax__:0.45721715963625586543)100:0.16040553970790100147)60:0.18422255610153490113,(tr_A8J2H1_A8J2H1_CHLRE_Starch_branching_enzyme_Fragment_OS=Chlamydomonas_reinhardtii_GN=SBE1_PE=4_SV=1_:0.70822871603694392828,((tr_A8HW52_A8HW52_CHLRE_Starch_branching_enzyme_OS=Chlamydomonas_reinhardtii_GN=SBE2_PE=4_SV=1_:0.20373019381459359090,tr_A8IHX1_A8IHX1_CHLRE_Starch_branching_enzyme_OS=Chlamydomonas_reinhardtii_GN=SBE3_PE=4_SV=1_:0.19417255052024848250)100:0.36247563752590050701,(sp_O23647_GLGB1_ARATH_1_4alphaglucanbranching_enzyme_21_chloroplastic/amyloplastic_OS=Arabidopsis_thaliana_GN=SBE2_1_PE=2_SV=1_:0.18306459143375086729,sp_Q9LZS3_GLGB2_ARATH_1_4alphaglucanbranching_enzyme_22_chloroplastic/amyloplastic_OS=Arabidopsis_thaliana_GN=SBE2_2_PE=2_SV=1_:0.19172895913289067504)100:0.26486677810195868865)50:0.16685150542870180734)40:0.17590037403691774487)0:0.06573467470539001711)20:0.08495767144039514940,((GSCHCT00008657001_starch_branching_enzyme_:0.41758330756918182747,evm_model_contig_2064_4_SBE_:0.36023201026789519741)100:0.12831089761745814726,CMH144C_branching_enzyme_:0.31146443745380530954)100:0.11773937724705520191)20:0.14200386638651510407,NCLIV_004200___organism=Neospora_caninum_Liverpool___product=hypothetical_protein___location=FR82338115074771519904_____length=1734___sequence_SO=chromosome___SO=protein_codingLength=1734_:1.56986633442481493539)60:0.27085853102669676939)50:0.60075545485436487869)30:0.54380192798189208592,(gi_403347780_gb_EJY73324_1_Putative_alphaamylase_Oxytricha_trifallax__:3.79816205351230751219,(jgi_Guith1_136858_jgi_Guith1_136858_fgenesh2_pg_22_#_123124488_:1.19666825808863852565,(gi_32398951_emb_CAD98416_1__1_4alphaglucan_branching_enzyme_possible_Cryptosporidium_parvum__:1.19101784665038845645,(NCLIV_063470___organism=Neospora_caninum_Liverpool___product=putative_1_4alphaglucan_branching_enzyme___location=FR8
2339325724112581747_____length=941___sequence_SO=chromosome___SO=protein_codingLength=941_:0.68667357703692166737,((ConsensusfromContig12026snap_maskedConsensusfromContig12026abinitgene0_0mRNA1cds4076/67850_:1.79972121664357276316,ConsensusfromContig6995snap_maskedConsensusfromContig6995abinitgene0_4mRNA1cds527/439144570__:0.92078205301668014648)40:0.21444814008989548926,Cvel_20619___organism=Chromera_velia_CCMP2878___product=Malto-oligosyltrehalose_trehalohydrolase_putative___location=Cvel_scaffold1867_1208-11669_-___length=662___sequence_SO=supercontig___SO=protein_coding_:0.93952895504658817671)20:0.18331222947529093870)70:0.45890345971164558936)80:0.33542172384911861371)100:1.10060499239467945998)30:0.42698796779576941862)100:2.44706655918227600210,sp_Q9M0S5_ISOA3_ARATH_Isoamylase_3_chloroplastic_OS=Arabidopsis_thaliana_GN=ISA3_PE=2_SV=2_:0.91892546724791945856);
Edit: Following the comments, I edited the title to replace "png" with "svg", hid the axes, and added a picture.
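You're almost there! You need to specify the label_func parameter, which is a function. The default, label_func=str, appears to be what shortens the names: as far as I can tell, str() of a clade abbreviates long names, whereas leaf.name holds the complete string. Like this: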
from Bio import Phylo
import pylab
def get_label(leaf):
    # Return the complete name instead of the shortened str(leaf)
    return leaf.name
f = 'path/to/my/file'
tree = Phylo.read(f, 'newick')
tree.ladderize()
Phylo.draw(tree, label_func=get_label, do_show=False)
pylab.axis('off')
pylab.savefig('tree2.svg',format='svg', bbox_inches='tight', dpi=300)
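
Since label_func can be any callable that takes a clade and returns the text to draw, you can also tidy the label up on the way. Here is a minimal sketch; the name tidy_label is hypothetical, stripping the trailing underscores is just an illustration based on the names in the test data, and returning None for unnamed internal nodes assumes that Phylo.draw skips such labels (which is what get_label above already relies on):

```python
def tidy_label(clade):
    """Return the clade's complete name without trailing underscores.

    Hypothetical helper meant to be passed as label_func to Phylo.draw.
    It only needs an object with a .name attribute.
    """
    name = getattr(clade, "name", None)
    if not name:
        # Unnamed internal nodes: return None so that no label is produced
        # (assumption: Phylo.draw does not draw a label that is None).
        return None
    # Keep the full length of the name; only drop trailing underscores.
    return name.rstrip("_")
```

It would be used the same way as get_label: Phylo.draw(tree, label_func=tidy_label, do_show=False).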