How to print out certain lines in a file in Python
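I need some help figuring out how to print only a given number of lines from a .txt file.
I created a function file(x, y) with 2 input parameters: the first, 'x', is the file, and the second, 'y', decides how many lines it will print.
Example:
Let's say the file name is x.txt and the contents of the file are: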
>Sentence 1
I like playing games
>Sentence 2
I like jumping around
>Sentence 3
I like dancing
>Sentence 4
I like swimming
>Sentence 5
I like riding my bike
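What I want is for it to read the file and print out the sentences when I call file("x.txt", 3), so that it prints only the first 3 sentence lines, as in this example output: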
'I like playing games'
'I like jumping around'
'I like dancing'
This is what I have done so far:
def file(x, y):
    file = open(x, 'r')
    g = list(range(y))
    h = [a for i, a in enumerate(file) if i in g]
    return " ' ".join(h)
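I couldn't figure out how to get the program to print the number of lines that the user asks for. So far, when I run the program, this is what I get: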
>Sentence 1
' I like playing games
' >Sentence 2
I only want to print the sentences; I don't want to print the ">Sentence #" part.
Could someone help me with this? Thank you!
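One way to see why the attempt prints the header lines: enumerate numbers every line of the file, headers included, so the first y indices cover headers and sentences alike. The sketch below is only an illustration; it uses an in-memory list of lines in place of the file, and the names first_by_index and first_sentences are hypothetical, not part of the original code.

```python
# A few lines from the sample file, held in memory for illustration.
lines = [
    ">Sentence 1\n",
    "I like playing games\n",
    ">Sentence 2\n",
    "I like jumping around\n",
]


def first_by_index(lines, y):
    # Mirrors the attempt: keep lines whose position is below y,
    # so header lines count towards the limit.
    return [a for i, a in enumerate(lines) if i < y]


def first_sentences(lines, y):
    # Filter out the header lines first, then count only what remains.
    sentences = [a.strip() for a in lines if not a.startswith(">")]
    return sentences[:y]


print(first_by_index(lines, 3))   # includes both ">Sentence" headers
print(first_sentences(lines, 2))  # only the sentence lines
```

Counting after filtering, rather than before, is the key difference.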
A simple native Python solution; I'm assuming that lines which don't start with > are the 'sentence' lines:
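from itertools import islice

def extract_lines(in_file, num):
    with open(in_file) as in_f:
        # Skip header lines (starting with '>') and drop each trailing newline,
        # so that joining with '\n' does not produce blank lines in between
        gen = (line.rstrip('\n') for line in in_f if not line.startswith('>'))
        return '\n'.join(islice(gen, num))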
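To try the filtering approach on the sample from the question, here is a minimal self-contained sketch. It writes the sample content to a temporary file and reads it back; the temporary file and the names SAMPLE, first_sentences and demo are assumptions made for illustration. This variant returns a list with trailing newlines stripped, which is one possible design choice rather than the only one.

```python
import os
import tempfile
from itertools import islice

SAMPLE = """>Sentence 1
I like playing games
>Sentence 2
I like jumping around
>Sentence 3
I like dancing
>Sentence 4
I like swimming
>Sentence 5
I like riding my bike
"""


def first_sentences(path, num):
    """Return the first `num` lines of `path` that do not start with '>'."""
    with open(path) as in_f:
        gen = (line.rstrip("\n") for line in in_f if not line.startswith(">"))
        # Consume the generator while the file is still open.
        return list(islice(gen, num))


def demo():
    # Write the sample to a temporary file, read it back, then clean up.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write(SAMPLE)
        path = tmp.name
    try:
        return first_sentences(path, 3)
    finally:
        os.remove(path)


if __name__ == "__main__":
    for sentence in demo():
        print(f"'{sentence}'")
```

Because islice stops after num items, the rest of the file is never read.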
But if this is actually FASTA format (now it is clear this is true), then I suggest using BioPython instead:
from Bio import SeqIO
from itertools import islice

def extract_lines(in_file, num):
    with open(in_file) as in_f:
        gen = (record.seq for record in SeqIO.parse(in_f, 'fasta'))
        return list(islice(gen, num))
@Chris_Rands's answer is good, but since you asked in the comments for a solution without imports, here is one possibility:
def extract_lines(in_file, num):
    """This function generates the first *num* non-header lines
    from fasta-formatted file *in_file*."""
    nb_outputted_lines = 0
    with open(in_file, "r") as fasta:
        for line in fasta:
            if nb_outputted_lines >= num:
                break  # This interrupts the for loop
            if line[0] != ">":
                yield line.strip()  # strip the trailing '\n'
                nb_outputted_lines += 1
Usage:
for line in extract_lines("x.txt", 3):
    print(line)
    # If you want the quotes:
    #print("'%s'" % line)
    # Or (python 3.6+):
    #print(f"'{line}'")
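The generator approach can also be sketched so that it accepts any iterable of lines, which makes it easy to try without a file on disk. In the example below, io.StringIO stands in for the opened file, and the name iter_sentences is hypothetical. It also shows that asking for more lines than the input contains simply stops at the end of the input.

```python
import io


def iter_sentences(lines, num):
    """Yield up to `num` stripped lines that do not start with '>'."""
    count = 0
    for line in lines:
        if count >= num:
            break
        if not line.startswith(">"):
            yield line.strip()
            count += 1


# An in-memory stand-in for the opened file (assumption for illustration).
sample = io.StringIO(
    ">Sentence 1\nI like playing games\n"
    ">Sentence 2\nI like jumping around\n"
)

# Request 5 sentences although only 2 exist; the generator just runs out.
quoted = [f"'{s}'" for s in iter_sentences(sample, 5)]
print("\n".join(quoted))
```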