如何只显示前 20 个条目
How to show only the first 20 entries
这里的 Python 代码得到了我想要的输出。但是,我需要帮助将结果限制在前 20 行。
输入示例如下,
gi|170079688|ref|YP_001729008.1| bifunctional riboflavin kinase/FMN adenylyltransferase [Escherichia coli str. K-12 substr. DH10B]
MKLIRGIHNLSQAPQEGCVLTIGNFDGVHRGHRALLQGLQEEGRKRNLPVMVMLFEPQPLELFATDKAPA
RLTRLREKLRYLAECGVDYVLCVRFDRRFAALTAQNFISDLLVKHLRVKFLAVGDDFRFGAGREGDFLLL
QKAGMEYGFDITSTQTFCEGGVRISSTAVRQALADDNLALAESLLGHPFAISGRVVHGDELGRTIGFPTA
NVPLRRQVSPVKGVYAVEVLGLGEKPLPGVANIGTRPTVAGIRQQLEVHLLDVAMDLYGRHIQVVLRKKI
RNEQRFASLDELKAQIARDELTAREFFGLTKPA
gi|170079689|ref|YP_001729009.1| isoleucyl-tRNA synthetase [Escherichia coli str. K-12 substr. DH10B]
MSDYKSTLNLPETGFPMRGDLAKREPGMLARWTDDDLYGIIRAAKKGKKTFILHDGPPYANGSIHIGHSV
NKILKDIIVKSKGLSGYDSPYVPGWDCHGLPIELKVEQEYGKPGEKFTAAEFRAKCREYAATQVDGQRKD
FIRLGVLGDWSHPYLTMDFKTEANIIRALGKIIGNGHLHKGAKPVHWCVDCRSALAEAEVEYYDKTSPSI
DVAFQAVDQDALKAKFAVSNVNGPISLVIWTTTPWTLPANRAISIAPDFDYALVQIDGQAVILAKDLVES
VMQRIGVTDYTILGTVKGAELELLRFTHPFMGFDVPAILGDHVTLDAGTGAVHTAPGHGPDDYVIGQKYG
LETANPVGPDGTYLPGTYPTLDGVNVFKANDIVVALLQEKGALLHVEKMQHSYPCCWRHKTPIIFRATPQ
WFVSMDQKGLRAQSLKEIKGVQWIPDWGQARIESMVANRPDWCISRQRTWGVPMSLFVHKDTEELHPRTL
ELMEEVAKRVEVDGIQAWWDLDAKEILGDEADQYVKVPDTLDVWFDSGSTHSSVVDVRPEFAGHAADMYL
EGSDQHRGWFMSSLMISTAMKGKAPYRQVLTHGFTVDGQGRKMSKSIGNTVSPQDVMNKLGADILRLWVA
STDYTGEMAVSDEILKRAADSYRRIRNTARFLLANLNGFDPAKDMVKPEEMVVLDRWAVGCAKAAQEDIL
KAYEAYDFHEVVQRLMRFCSVEMGSFYLDIIKDRQYTAKADSVARRSCQTALYHIAEALVRWMAPILSFT
ADEVWGYLPGEREKYVFTGEWYEGLFGLADSEAMNDAFWDELLKVRGEVNKVIEQARADKKVGGSLEAAV
TLYAEPELSAKLTALGDELRFVLLTSGATVADYNDAPADAQQSEVLKGLKVALSKAEGEKCPRCWHYTQD
VGKVAEHAEICGRCVSNVAGDGEKRKFA
gi|170079690|ref|YP_001729010.1| lipoprotein signal peptidase [Escherichia coli str. K-12 substr. DH10B]
MSQSICSTGLRWLWLVVVVLIIDLGSKYLILQNFALGDTVPLFPSLNLHYARNYGAAFSFLADSGGWQRW
FFAGIAIGISVILAVMMYRSKATQKLNNIAYALIIGGALGNLFDRLWHGFVVDMIDFYVGDWHFATFNLA
DTAICVGAALIVLEGFLPSRAKKQ
import re
id = None
header = None
seq = ''
a_file = open('e_coli.faa')
for line in a_file:
m = re.match(">(\S+)\s+(.+)", line.rstrip())
if m:
if id is not None:
print("{0} length:{1} {2}".format(id, len(seq),header))
id, header = m.groups()
seq = ''
else:
seq += line.rstrip()
在最顶部,添加 c = 0
。然后,改变
print("{0} length:{1} {2}".format(id, len(seq),header))
到
if c < 10:
print("{0} length:{1} {2}".format(id, len(seq),header))
c += 1
经过一些调整的结果:
import re
id = None
header = None
seq = ''
with open('e_coli.faa') as a_file:
for line in a_file:
m = re.match(">(\S+)\s+(.+)", line.rstrip())
if m:
if id and c < 20:
print("{0} length:{1} {2}".format(id, len(seq),header))
c += 1
id, header = m.groups()
seq = ''
else:
seq += line.rstrip()
读取文件的前 20 行。
你可以使用 readlines()
:
而不是:
for line in a_file:
使用:
for line in a_file.readlines()[:20]:
这里的 Python 代码得到了我想要的输出。但是,我需要帮助将结果限制在前 20 行。
输入示例如下,
gi|170079688|ref|YP_001729008.1| bifunctional riboflavin kinase/FMN adenylyltransferase [Escherichia coli str. K-12 substr. DH10B] MKLIRGIHNLSQAPQEGCVLTIGNFDGVHRGHRALLQGLQEEGRKRNLPVMVMLFEPQPLELFATDKAPA RLTRLREKLRYLAECGVDYVLCVRFDRRFAALTAQNFISDLLVKHLRVKFLAVGDDFRFGAGREGDFLLL QKAGMEYGFDITSTQTFCEGGVRISSTAVRQALADDNLALAESLLGHPFAISGRVVHGDELGRTIGFPTA NVPLRRQVSPVKGVYAVEVLGLGEKPLPGVANIGTRPTVAGIRQQLEVHLLDVAMDLYGRHIQVVLRKKI RNEQRFASLDELKAQIARDELTAREFFGLTKPA gi|170079689|ref|YP_001729009.1| isoleucyl-tRNA synthetase [Escherichia coli str. K-12 substr. DH10B] MSDYKSTLNLPETGFPMRGDLAKREPGMLARWTDDDLYGIIRAAKKGKKTFILHDGPPYANGSIHIGHSV NKILKDIIVKSKGLSGYDSPYVPGWDCHGLPIELKVEQEYGKPGEKFTAAEFRAKCREYAATQVDGQRKD FIRLGVLGDWSHPYLTMDFKTEANIIRALGKIIGNGHLHKGAKPVHWCVDCRSALAEAEVEYYDKTSPSI DVAFQAVDQDALKAKFAVSNVNGPISLVIWTTTPWTLPANRAISIAPDFDYALVQIDGQAVILAKDLVES VMQRIGVTDYTILGTVKGAELELLRFTHPFMGFDVPAILGDHVTLDAGTGAVHTAPGHGPDDYVIGQKYG LETANPVGPDGTYLPGTYPTLDGVNVFKANDIVVALLQEKGALLHVEKMQHSYPCCWRHKTPIIFRATPQ WFVSMDQKGLRAQSLKEIKGVQWIPDWGQARIESMVANRPDWCISRQRTWGVPMSLFVHKDTEELHPRTL ELMEEVAKRVEVDGIQAWWDLDAKEILGDEADQYVKVPDTLDVWFDSGSTHSSVVDVRPEFAGHAADMYL EGSDQHRGWFMSSLMISTAMKGKAPYRQVLTHGFTVDGQGRKMSKSIGNTVSPQDVMNKLGADILRLWVA STDYTGEMAVSDEILKRAADSYRRIRNTARFLLANLNGFDPAKDMVKPEEMVVLDRWAVGCAKAAQEDIL KAYEAYDFHEVVQRLMRFCSVEMGSFYLDIIKDRQYTAKADSVARRSCQTALYHIAEALVRWMAPILSFT ADEVWGYLPGEREKYVFTGEWYEGLFGLADSEAMNDAFWDELLKVRGEVNKVIEQARADKKVGGSLEAAV TLYAEPELSAKLTALGDELRFVLLTSGATVADYNDAPADAQQSEVLKGLKVALSKAEGEKCPRCWHYTQD VGKVAEHAEICGRCVSNVAGDGEKRKFA gi|170079690|ref|YP_001729010.1| lipoprotein signal peptidase [Escherichia coli str. K-12 substr. DH10B] MSQSICSTGLRWLWLVVVVLIIDLGSKYLILQNFALGDTVPLFPSLNLHYARNYGAAFSFLADSGGWQRW FFAGIAIGISVILAVMMYRSKATQKLNNIAYALIIGGALGNLFDRLWHGFVVDMIDFYVGDWHFATFNLA DTAICVGAALIVLEGFLPSRAKKQ
import re
id = None
header = None
seq = ''
a_file = open('e_coli.faa')
for line in a_file:
m = re.match(">(\S+)\s+(.+)", line.rstrip())
if m:
if id is not None:
print("{0} length:{1} {2}".format(id, len(seq),header))
id, header = m.groups()
seq = ''
else:
seq += line.rstrip()
在最顶部,添加 c = 0
。然后,改变
print("{0} length:{1} {2}".format(id, len(seq),header))
到
if c < 10:
print("{0} length:{1} {2}".format(id, len(seq),header))
c += 1
经过一些调整的结果:
import re
id = None
header = None
seq = ''
with open('e_coli.faa') as a_file:
for line in a_file:
m = re.match(">(\S+)\s+(.+)", line.rstrip())
if m:
if id and c < 20:
print("{0} length:{1} {2}".format(id, len(seq),header))
c += 1
id, header = m.groups()
seq = ''
else:
seq += line.rstrip()
读取文件的前 20 行。
你可以使用 readlines()
:
而不是:
for line in a_file:
使用:
for line in a_file.readlines()[:20]: