Calling upon keys within a hash in perl
This is an extension of an earlier question on this site.
I now have the following code:
#!/usr/bin/perl
use strict;
use warnings;
#Print Directory
print "Please provide the directory containing the FASTQ files from your Illumina MiSeq run \n";
my $FASTQ = <STDIN>;
chomp ($FASTQ);
print "Please provide the minimum overlap between the two reads in bp";
my $min = <STDIN>;
chomp ($min);
print "Please provide the maximum overlap between the two reads in bp";
my $max = <STDIN>;
chomp ($max);
print "Now provide the output directory for your merged fastq reads";
my $output = <STDIN>;
chomp ($output);
#Open Directory
my $dir = $FASTQ;
opendir(DIR, $dir) or die "Cannot open $dir: $!";
my @reads = grep { /.fastq/ } readdir DIR;
closedir DIR;
sub parse_fastq_filename {
    # Strip the suffix
    my $filename = shift;
    # Parse Sample-ID_Adapter-Sequence_L001_R1_001
    my($sample_id, $adapter_sequence, $L001, $format, $001) = split /_/, $filename;
    return {
        filename => $filename,
        sample_id => $sample_id,
        adapter_sequence => $adapter_sequence,
        $L001 => $L001,
        format => $format,
        001 => $001
    };
}
# The pairs of files will be stored within the following hash.
my %pairs;
# List just the *.fastq files
for my $filename (@reads) {
    # Parse the filename into a hash reference
    my $fastq = parse_fastq_filename($filename);
    # Put each parsed fastq filename into its pair
    $pairs{ $fastq->{sample_id} }{ $fastq->{format} } = $fastq;
}
for my $sample (values %pairs) {
    # Go through each pair in the sample
    for my $fastq (values %$sample) {
        print "$fastq->{filename} has format $fastq->{format}\n";
    }
}
for my $forward (values %pairs) {
    for my $fastq (values %$forward) {
    }
}
#print the keys within the hash
foreach (keys %pairs){
    print "$_ => $pairs{$_}\n";
}
#place the hash into an array
my @unique = keys %pairs;
print @unique;
#change directory to the user-inputted directory and merge reads
chdir $dir;
`/usr/local/flash/flash @array[0] @array[1] -m $min -M $max -d $output`;
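I have finally managed to pair the forward and reverse fastq files with each other, as the Unix command requires. Now I am stuck on how to access the individual components of each pair.
I considered accessing the hash by taking the keys as user input, but these values are generated within the script, and I don't want to force the user to type them in before running the Unix script.
In addition, all the examples I have looked at already have a fixed number of keys in the hash. Here the number of keys varies with the number of files the user wants to merge.
I know I want a loop so that the command can run multiple times; the FLASH program can accept only one forward and one reverse fastq file at a time. So the forward and reverse fastq reads have to be looped over so that each one is processed.
How do I pull out the specific files I want to use after pairing them into a hash?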
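My best guess is that you want to pair up all the .fastq files in a given directory and call the flash utility once for each pair.
Your own code has a number of problems, and you seem to be keeping more information in the %pairs hash than you need (I think you need only the file name, the sample ID, and the format), so I have written this.
I have also used a simple glob instead of opendir, readdir, closedir, which seems a better choice.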
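To illustrate the difference between the two listing approaches, here is a minimal sketch. The helper names list_fastq_readdir and list_fastq_glob are hypothetical and exist only for this comparison. The sketch assumes the directory name contains no whitespace or glob metacharacters, since glob treats whitespace in its argument as a separator between patterns.

```perl
use strict;
use warnings;
use File::Basename qw/ basename /;

# Hypothetical helper: list the *.fastq files with opendir/readdir.
# readdir returns bare file names. The pattern is escaped and anchored,
# because an unescaped /.fastq/ also matches names such as "myfastq.txt".
sub list_fastq_readdir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @names = sort grep { /\.fastq\z/ } readdir $dh;
    closedir $dh;
    return @names;
}

# Hypothetical helper: the same listing with glob. glob returns paths
# that include the directory, so basename strips it off again.
sub list_fastq_glob {
    my ($dir) = @_;
    my @names = sort map { basename($_) } glob("$dir/*.fastq");
    return @names;
}
```

For a directory of ordinary file names both helpers should return the same list. One difference to keep in mind is that the *.fastq pattern skips names that start with a dot, while the readdir version keeps them.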
I have tested the script below as best I can, and it looks okay.
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw/ basename /;

my $flash = '/usr/local/flash/flash';

print "Please provide the directory containing\n";
print "the fastq files from your Illumina MiSeq run: ";
my $fastq_file_dir = <STDIN>;
chomp $fastq_file_dir;
print "Please provide the minimum overlap between the two reads in bp: ";
my $min_overlap = <STDIN>;
chomp $min_overlap;
print "Please provide the maximum overlap between the two reads in bp: ";
my $max_overlap = <STDIN>;
chomp $max_overlap;
print "Now provide the output directory for your merged fastq reads: ";
my $out_dir = <STDIN>;
chomp $out_dir;

# glob returns paths that include the directory name
my @files = glob "$fastq_file_dir/*.fastq";

# Group the file names by sample ID, then by format (R1 or R2)
my %pairs;
for my $fastq_file ( @files ) {
    my $file = basename $fastq_file;
    my ($sample_id, $format) = (split /_/, $file)[0,3];
    next unless defined $format;    # name has too few fields
    $pairs{ $sample_id }{ $format } = $file;
}
printf "Processing %d pairs of FASTQ files\n\n", scalar keys %pairs;

chdir $fastq_file_dir or die "Cannot change to $fastq_file_dir: $!";

for my $sample ( sort keys %pairs ) {
    my $pair = $pairs{$sample};
    # Hash slice: fetch the forward and reverse file names in one step
    my ($forward, $reverse) = @{$pair}{qw/ R1 R2 /};
    unless ( defined $forward and defined $reverse ) {
        warn "Skipping $sample: R1 or R2 file is missing\n";
        next;
    }
    print "Forward: $forward\n";
    print "Reverse: $reverse\n";
    print "\n";
    my $cmd = qq{$flash $forward $reverse -m $min_overlap -M $max_overlap -d $out_dir};
    system($cmd) == 0 or warn "flash failed for sample $sample\n";
}
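The part of the script that answers the question directly is the hash slice @{$pair}{qw/ R1 R2 /}, which works no matter how many sample keys %pairs ends up holding. As a sketch of that idea in isolation, the following works on a plain list of file names, so it can be tried without any files on disk. The helpers pair_fastq_files, complete_pairs and flash_command are hypothetical names introduced for illustration, and the file names are assumed to follow the Sample-ID_Adapter-Sequence_L001_R1_001 layout from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: group file names by sample ID (field 0) and
# format (field 3, expected to be R1 or R2).
sub pair_fastq_files {
    my @names = @_;
    my %pairs;
    for my $name (@names) {
        my ($sample_id, $format) = (split /_/, $name)[0, 3];
        next unless defined $format;    # name has too few fields
        $pairs{$sample_id}{$format} = $name;
    }
    return \%pairs;
}

# Hypothetical helper: return [sample, forward, reverse] triples for
# every sample that has both an R1 and an R2 file.
sub complete_pairs {
    my ($pairs) = @_;
    my @complete;
    for my $sample (sort keys %$pairs) {
        # Hash slice: two values out of the inner hash in one step
        my ($forward, $reverse) = @{ $pairs->{$sample} }{qw/ R1 R2 /};
        next unless defined $forward && defined $reverse;
        push @complete, [ $sample, $forward, $reverse ];
    }
    return @complete;
}

# Hypothetical helper: build the argument list for one FLASH run.
# The -m, -M and -d flags are the ones already used in the script above.
sub flash_command {
    my ($flash, $forward, $reverse, $min, $max, $out_dir) = @_;
    return ($flash, $forward, $reverse, '-m', $min, '-M', $max, '-d', $out_dir);
}
```

Passing the list from flash_command straight to system, as in system(flash_command(...)), avoids going through the shell, so file names or directories containing spaces do not need quoting. That is a design suggestion, not something the original script does.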