How to write this in Python without using the Biopython package
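I would like to write a program that extracts the amino acid sequence of each feature of type "Region" into a separate FASTA file, and lists the amino acid and position of each "Site" with site_type="phosphorylation".
It should not use the Biopython package.
(I already have a Biopython version that does the same thing.)
The file is as follows.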
LOCUS NP_005219 1210 aa linear PRI 15-MAR-2015
DEFINITION epidermal growth factor receptor isoform a precursor [Homo
sapiens].
ACCESSION NP_005219
VERSION NP_005219.2 GI:29725609
DBSOURCE REFSEQ: accession NM_005228.3
KEYWORDS RefSeq.
FEATURES Location/Qualifiers
source 1..1210
/organism="Homo sapiens"
/db_xref="taxon:9606"
/chromosome="7"
/map="7p12"
Protein 1..1210
/product="epidermal growth factor receptor isoform a
precursor"
/EC_number="2.7.10.1"
/note="avian erythroblastic leukemia viral (v-erb-b)
oncogene homolog; cell proliferation-inducing protein 61;
cell growth inhibiting protein 40; proto-oncogene
c-ErbB-1; receptor tyrosine-protein kinase erbB-1"
sig_peptide 1..24
/inference="COORDINATES: ab initio prediction:SignalP:4.0"
/calculated_mol_wt=2283
mat_peptide 25..1210
/product="epidermal growth factor receptor isoform a"
/calculated_mol_wt=132013
Region 57..168
/region_name="Recep_L_domain"
/note="Receptor L domain; pfam01030"
/db_xref="CDD:250307"
Region 75..300
/region_name="Approximate"
/experiment="experimental evidence, no additional details
recorded"
/note="propagated from UniProtKB/Swiss-Prot (P00533.2)"
Region 185..337
/region_name="Furin-like"
/note="Furin-like cysteine rich region; pfam00757"
/db_xref="CDD:250112"
Site 229
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphoserine. {ECO:0000269|PubMed:21487020};
propagated from UniProtKB/Swiss-Prot (P00533.2)"
Region 231..274
/region_name="FU"
/note="Furin-like repeats. Cysteine rich region. Exact
function of the domain is not known. Furin is a
serine-kinase dependent proprotein processor. Other
members of this family include endoproteases and cell
surface receptors; cd00064"
/db_xref="CDD:238021"
Region 361..481
/region_name="Recep_L_domain"
/note="Receptor L domain; pfam01030"
/db_xref="CDD:250307"
Region 390..600
/region_name="Approximate"
/experiment="experimental evidence, no additional details
recorded"
/note="propagated from UniProtKB/Swiss-Prot (P00533.2)"
Region 505..637
/region_name="GF_recep_IV"
/note="Growth factor receptor domain IV; pfam14843"
/db_xref="CDD:258980"
Region 506..559
/region_name="FU"
/note="Furin-like repeats. Cysteine rich region. Exact
function of the domain is not known. Furin is a
serine-kinase dependent proprotein processor. Other
members of this family include endoproteases and cell
surface receptors; cd00064"
/db_xref="CDD:238021"
Region 558..>598
/region_name="FU"
/note="Furin-like repeats. Cysteine rich region. Exact
function of the domain is not known. Furin is a
serine-kinase dependent proprotein processor. Other
members of this family include endoproteases and cell
surface receptors; cd00064"
/db_xref="CDD:238021"
Region 634..677
/region_name="TM_ErbB1"
/note="Transmembrane domain of Epidermal Growth Factor
Receptor or ErbB1, a Protein Tyrosine Kinase; cd12093"
/db_xref="CDD:213054"
Site order(644..646,648..653,656..657)
/site_type="other"
/note="heterodimer interface [polypeptide binding]"
/db_xref="CDD:213054"
Site 646..668
/site_type="transmembrane region"
/experiment="experimental evidence, no additional details
recorded"
/note="propagated from UniProtKB/Swiss-Prot (P00533.2)"
Site 678
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphothreonine, by PKC and PKD/PRKD1.
{ECO:0000269|PubMed:10523301}; propagated from
UniProtKB/Swiss-Prot (P00533.2)"
Region 688..704
/region_name="Important for dimerization, phosphorylation
and activation"
/experiment="experimental evidence, no additional details
recorded"
/note="propagated from UniProtKB/Swiss-Prot (P00533.2)"
Site 693
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphothreonine, by PKD/PRKD1.
{ECO:0000269|PubMed:10523301, ECO:0000269|PubMed:16083266,
ECO:0000269|PubMed:18691976, ECO:0000269|PubMed:20068231,
ECO:0000269|PubMed:3138233}; propagated from
UniProtKB/Swiss-Prot (P00533.2)"
Site 695
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphoserine. {ECO:0000269|PubMed:18691976,
ECO:0000269|PubMed:3138233}; propagated from
UniProtKB/Swiss-Prot (P00533.2)"
Region 704..1016
/region_name="PTKc_EGFR"
/note="Catalytic domain of the Protein Tyrosine Kinase,
Epidermal Growth Factor Receptor; cd05108"
/db_xref="CDD:270683"
Region 712..968
/region_name="Pkinase_Tyr"
/note="Protein tyrosine kinase; pfam07714"
/db_xref="CDD:254379"
Site order(715..717,728..730,794..795,797,804..805,1009..1010)
/site_type="other"
/note="dimer interface [polypeptide binding]"
/db_xref="CDD:270683"
Site order(718..719,722..723,745,791,793,797,841..842,855,
876..880,885,889)
/site_type="active"
/db_xref="CDD:270683"
Site order(718..719,726,743,745,766,790..791,793,841..842,844,
855)
/site_type="other"
/note="ATP binding site [chemical binding]"
/db_xref="CDD:270683"
Site 854..879
/site_type="other"
/note="activation loop (A-loop)"
/db_xref="CDD:270683"
Site order(876..880,885,889)
/site_type="other"
/note="polypeptide substrate binding site [polypeptide
binding]"
/db_xref="CDD:270683"
Site 991
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphoserine. {ECO:0000269|PubMed:16083266,
ECO:0000269|PubMed:18669648, ECO:0000269|PubMed:20068231};
propagated from UniProtKB/Swiss-Prot (P00533.2)"
Site 995
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphoserine. {ECO:0000269|PubMed:18669648};
propagated from UniProtKB/Swiss-Prot (P00533.2)"
Site 998
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphotyrosine, by autocatalysis.
{ECO:0000269|PubMed:18669648,
ECO:0000269|PubMed:19563760}; propagated from
UniProtKB/Swiss-Prot (P00533.2)"
Site 1016
/site_type="other"
/experiment="experimental evidence, no additional details
recorded"
/note="Important for interaction with PIK3C2B; propagated
from UniProtKB/Swiss-Prot (P00533.2)"
Site 1016
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphotyrosine, by autocatalysis.
{ECO:0000269|PubMed:19563760}; propagated from
UniProtKB/Swiss-Prot (P00533.2)"
Site 1026
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphoserine. {ECO:0000269|PubMed:16083266};
propagated from UniProtKB/Swiss-Prot (P00533.2)"
Site 1039
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphoserine. {ECO:0000269|PubMed:18669648};
propagated from UniProtKB/Swiss-Prot (P00533.2)"
Site 1041
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphothreonine. {ECO:0000269|PubMed:18669648};
propagated from UniProtKB/Swiss-Prot (P00533.2)"
Site 1042
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphoserine. {ECO:0000269|PubMed:18669648};
propagated from UniProtKB/Swiss-Prot (P00533.2)"
Site 1064
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphoserine. {ECO:0000269|PubMed:18669648,
ECO:0000269|PubMed:18691976, ECO:0000269|PubMed:20068231};
propagated from UniProtKB/Swiss-Prot (P00533.2)"
Site 1069
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphotyrosine. {ECO:0000305|PubMed:22888118};
propagated from UniProtKB/Swiss-Prot (P00533.2)"
Site 1070
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphoserine. {ECO:0000269|PubMed:3138233};
propagated from UniProtKB/Swiss-Prot (P00533.2)"
Site 1071
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphoserine. {ECO:0000269|PubMed:3138233};
propagated from UniProtKB/Swiss-Prot (P00533.2)"
Site 1081
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphoserine. {ECO:0000269|PubMed:18691976};
propagated from UniProtKB/Swiss-Prot (P00533.2)"
Site 1092
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphotyrosine, by autocatalysis.
{ECO:0000269|PubMed:12873986}; propagated from
UniProtKB/Swiss-Prot (P00533.2)"
Site 1110
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphotyrosine, by autocatalysis.
{ECO:0000269|PubMed:12873986, ECO:0000269|PubMed:2543678};
propagated from UniProtKB/Swiss-Prot (P00533.2)"
Site 1166
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphoserine. {ECO:0000269|PubMed:18669648,
ECO:0000269|PubMed:18691976}; propagated from
UniProtKB/Swiss-Prot (P00533.2)"
Site 1172
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphotyrosine, by autocatalysis.
{ECO:0000269|PubMed:17081983}; propagated from
UniProtKB/Swiss-Prot (P00533.2)"
Site 1197
/site_type="phosphorylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Phosphotyrosine, by autocatalysis.
{ECO:0000269|PubMed:17081983, ECO:0000269|PubMed:18691976,
ECO:0000269|PubMed:19563760, ECO:0000269|PubMed:19836242,
ECO:0000269|PubMed:20068231}; propagated from
UniProtKB/Swiss-Prot (P00533.2)"
Site 1199
/site_type="methylation"
/experiment="experimental evidence, no additional details
recorded"
/note="Omega-N-methylarginine.
{ECO:0000269|PubMed:21258366}; propagated from
UniProtKB/Swiss-Prot (P00533.2)"
CDS 1..1210
/gene="EGFR"
/gene_synonym="ERBB; ERBB1; HER1; mENA; NISBD2; PIG61"
/coded_by="NM_005228.3:247..3879"
/note="isoform a precursor is encoded by transcript
variant 1"
/db_xref="CCDS:CCDS5514.1"
/db_xref="GeneID:1956"
/db_xref="HGNC:HGNC:3236"
/db_xref="MIM:131550"
ORIGIN
1 mrpsgtagaa llallaalcp asraleekkv cqgtsnkltq lgtfedhfls lqrmfnncev
61 vlgnleityv qrnydlsflk tiqevagyvl ialntverip lenlqiirgn myyensyala
121 vlsnydankt glkelpmrnl qeilhgavrf snnpalcnve siqwrdivss dflsnmsmdf
181 qnhlgscqkc dpscpngscw gageencqkl tkiicaqqcs grcrgkspsd cchnqcaagc
241 tgpresdclv crkfrdeatc kdtcpplmly npttyqmdvn pegkysfgat cvkkcprnyv
301 vtdhgscvra cgadsyemee dgvrkckkce gpcrkvcngi gigefkdsls inatnikhfk
361 nctsisgdlh ilpvafrgds fthtppldpq eldilktvke itgflliqaw penrtdlhaf
421 enleiirgrt kqhgqfslav vslnitslgl rslkeisdgd viisgnknlc yantinwkkl
481 fgtsgqktki isnrgensck atgqvchalc spegcwgpep rdcvscrnvs rgrecvdkcn
541 llegeprefv enseciqchp eclpqamnit ctgrgpdnci qcahyidgph cvktcpagvm
601 genntlvwky adaghvchlc hpnctygctg pglegcptng pkipsiatgm vgalllllvv
661 algiglfmrr rhivrkrtlr rllqerelve pltpsgeapn qallrilket efkkikvlgs
721 gafgtvykgl wipegekvki pvaikelrea tspkankeil deayvmasvd nphvcrllgi
781 cltstvqlit qlmpfgclld yvrehkdnig sqyllnwcvq iakgmnyled rrlvhrdlaa
841 rnvlvktpqh vkitdfglak llgaeekeyh aeggkvpikw malesilhri ythqsdvwsy
901 gvtvwelmtf gskpydgipa seissilekg erlpqppict idvymimvkc wmidadsrpk
961 freliiefsk mardpqrylv iqgdermhlp sptdsnfyra lmdeedmddv vdadeylipq
1021 qgffsspsts rtpllsslsa tsnnstvaci drnglqscpi kedsflqrys sdptgalted
1081 siddtflpvp eyinqsvpkr pagsvqnpvy hnqplnpaps rdphyqdphs tavgnpeyln
1141 tvqptcvnst fdspahwaqk gshqisldnp dyqqdffpke akpngifkgs taenaeylrv
1201 apqssefiga
//
I recommend using Biopython.
from Bio import SeqIO

file = "file.gb"
# Read the first record (Python 3); in Python 2 this was SeqIO.parse(open(file), "gb").next()
gb = next(SeqIO.parse(file, "genbank"))
# Keep only the Site features whose site_type is "phosphorylation"
phosphorylation_list = [f for f in gb.features if f.type == "Site" and
                        "phosphorylation" in f.qualifiers["site_type"]]
for f in phosphorylation_list:
    # Print (start, end) of each phosphorylation site
    print((int(f.location.start), int(f.location.end)))
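This gives the following (Biopython reports a 0-based start and an exclusive end, so site 229 appears as (228, 229)):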
(228, 229)
(677, 678)
(692, 693)
(694, 695)
(990, 991)
(994, 995)
(997, 998)
(1015, 1016)
(1025, 1026)
(1038, 1039)
(1040, 1041)
(1041, 1042)
(1063, 1064)
(1068, 1069)
(1069, 1070)
(1070, 1071)
(1080, 1081)
(1091, 1092)
(1109, 1110)
(1165, 1166)
(1171, 1172)
(1196, 1197)
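If Biopython really has to be avoided, the flat file can also be parsed by hand. The following is only a minimal sketch, not a full GenBank parser. It assumes the file on disk keeps the standard GenBank column layout (feature keys indented 5 spaces, locations and qualifiers starting at column 22), that there is a single record, and that only simple locations such as `57..168`, `558..>598` or `229` are needed (compound `order(...)` locations are skipped). The function names and the `region_<start>_<end>.fasta` file naming are made up for illustration.

```python
import os
import re


def parse_genbank(path):
    """Return (features, sequence) for a single-record GenBank protein file."""
    features, seq_parts, section = [], [], None
    with open(path) as handle:
        for raw in handle:
            line = raw.rstrip("\n")
            if line.startswith("FEATURES"):
                section = "features"
            elif line.startswith("ORIGIN"):
                section = "origin"
            elif line.startswith("//"):
                break
            elif section == "features" and line.strip():
                key, text = line[:21].strip(), line[21:].strip()
                if key:  # a key in the first 21 columns starts a new feature
                    features.append({"type": key, "location": text, "qualifiers": ""})
                elif not features:
                    continue  # stray continuation line before any feature
                elif text.startswith("/"):  # a new qualifier
                    features[-1]["qualifiers"] += "\n" + text
                elif features[-1]["qualifiers"]:  # wrapped qualifier value
                    features[-1]["qualifiers"] += " " + text
                else:  # wrapped location, e.g. a long order(...)
                    features[-1]["location"] += text
            elif section == "origin":
                # Drop the position numbers and spaces, keep the residues
                seq_parts.append(re.sub(r"[^A-Za-z]", "", line))
    return features, "".join(seq_parts).upper()


def qualifier(feature, name):
    """Return the quoted value of one qualifier, or None if it is absent."""
    match = re.search(r'/%s="([^"]*)"' % re.escape(name), feature["qualifiers"])
    return match.group(1) if match else None


def span(location):
    """Return 1-based inclusive (start, end) for a simple location, else None."""
    match = re.fullmatch(r"[<>]?(\d+)(?:\.\.[<>]?(\d+))?", location)
    if match is None:
        return None
    start = int(match.group(1))
    return start, int(match.group(2) or start)


def write_region_fastas(features, seq, out_dir="."):
    """Write one FASTA file per Region feature and return the file paths."""
    paths = []
    for feature in features:
        loc = span(feature["location"])
        if feature["type"] != "Region" or loc is None:
            continue
        start, end = loc
        name = qualifier(feature, "region_name") or "unnamed"
        sub = seq[start - 1:end]  # convert 1-based inclusive to a Python slice
        path = os.path.join(out_dir, "region_%d_%d.fasta" % (start, end))
        with open(path, "w") as out:
            out.write(">region_%d_%d %s\n" % (start, end, name))
            for i in range(0, len(sub), 60):  # wrap sequence lines at 60 residues
                out.write(sub[i:i + 60] + "\n")
        paths.append(path)
    return paths


def phosphorylation_sites(features, seq):
    """Return (position, amino acid) for each Site with site_type="phosphorylation"."""
    sites = []
    for feature in features:
        loc = span(feature["location"])
        if (feature["type"] == "Site" and loc is not None
                and qualifier(feature, "site_type") == "phosphorylation"):
            sites.append((loc[0], seq[loc[0] - 1]))
    return sites
```

To use it, call `features, seq = parse_genbank("file.gb")`, then `write_region_fastas(features, seq)` and `phosphorylation_sites(features, seq)`. Positions here are the 1-based numbers from the file, unlike the Biopython tuples above. For the record in the question, the first site should come out as position 229 with residue S, which agrees with its "Phosphoserine" note.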