Count adaptor occurrences in multiple FASTQ files
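I have 30 fastq files in a folder, and I would like to know which of them contain a specific adaptor (so I can figure out which sample each file actually is).
I wrote a small Biopython script, but it can only look at one file at a time, and I would like to count the occurrences in every file in one go. Can anyone help me improve the script?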
from Bio import SeqIO

# Keep only the reads whose sequence starts with the adaptor
adaptor = (rec for rec in
           SeqIO.parse("file.fastq", "fastq")
           if rec.seq.startswith("TGA"))
count = SeqIO.write(adaptor, "adaptor.fastq", "fastq")
print("Saved %i adaptor" % count)
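Put the file names in a list and run the same filter on each one inside a loop:

from Bio import SeqIO

# List every FASTQ file to check
fnames = ["file.fastq", "file1.fastq", "file2.fastq"]
for fname in fnames:
    # Keep only the reads whose sequence starts with the adaptor
    adaptor = (rec for rec in
               SeqIO.parse(fname, "fastq")
               if rec.seq.startswith("TGA"))
    # Use one output file per input so each iteration does not overwrite the last
    count = SeqIO.write(adaptor, "adaptor_" + fname, "fastq")
    print("Saved %i adaptor in file %s" % (count, fname))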
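If you only need the per-file counts and would rather not type 30 file names, something like the sketch below might do. It is not Biopython: it reads the files with the standard library only and assumes plain four-line FASTQ records (sequence not wrapped across lines). The function names `count_adaptor_reads` and `count_in_folder`, the `*.fastq` pattern and the default folder are my own assumptions for illustration. For wrapped or otherwise unusual files, `SeqIO.parse` as used above is the safer choice.

```python
import glob
import os
from itertools import islice


def count_adaptor_reads(path, adaptor="TGA"):
    """Count reads in a FASTQ file whose sequence starts with `adaptor`.

    Assumes four-line records: header, sequence, '+', quality.
    """
    count = 0
    with open(path) as handle:
        # The sequence is the 2nd line of each record, i.e. the lines at
        # 0-based positions 1, 5, 9, ...
        for seq_line in islice(handle, 1, None, 4):
            if seq_line.startswith(adaptor):
                count += 1
    return count


def count_in_folder(folder=".", pattern="*.fastq", adaptor="TGA"):
    """Return {file name: count} for every file in `folder` matching `pattern`."""
    paths = sorted(glob.glob(os.path.join(folder, pattern)))
    return {os.path.basename(p): count_adaptor_reads(p, adaptor) for p in paths}


if __name__ == "__main__":
    # Hypothetical usage: scan the current folder and print one line per file
    for name, n in count_in_folder().items():
        print("%s: %i reads start with the adaptor" % (name, n))
```

Since nothing is written to disk, this only tells you which files contain the adaptor and how often. Keep the `SeqIO.write` loop if you also want the matching reads saved.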