Perl: Search and Replace

Question

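I am trying to improve my script. I want to match the strings in column 4 of input.txt (H1, 2HB, CA, HB3) against dictionary.txt and replace them with the corresponding strings from column 2 of dictionary.txt (H, HB, C, 3HB), using dictionary.txt as the dictionary: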

input.txt

1  N  22  H1  MET
1  H  32 2HB  MET
1  C  40  CA  MET
2  H  35  HB3 ASP

dictionary.txt

MET  H   H1
MET  HB 2HB
MET  C   CA
ASP 3HB  HB3

Desired output

1  N  22  H  MET
1  H  32  HB MET
1  C  40  C  MET
2  H  35 3HB ASP

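I tried to solve this by first matching the residue name in input.txt (MET) against the one in dictionary.txt (MET) and then performing the substitution. This is what I have written so far: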

#!/usr/bin/perl

use strict;
use warnings;

my %dictionary;

open my $dic_fh, '<', 'dictionary.txt' or die "Can't open file: $!";

while (my $ref = <$dic_fh>) {
    chomp $ref;
    my @columns  = split(/\t/, $ref);
    my $res_name = $columns[0];
    my $ref_nuc  = $columns[1];
    $dictionary{$res_name} = {$ref_nuc};

    open my $in_fh, '<', 'input.txt' or die "Can't open file: $!";

    while (my $line = <$in_fh>) {
        chomp $line;
        my @columns = split(/\t/, $line);
        my @name = $columns[3];
        if (my $name eq $res_name) {
            my $line = $_;
            foreach my $res_name (keys %dictionary) {
                $line =~ s/$name/$dictionary{$ref_nuc}/;
            }
            print $line;
        }
    }
}

Answer 1

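The problem appears to be that you assign the single field $columns[3] to the array @name and then expect to find it in $name, which is an entirely separate variable. You even declare $name at the point of comparison, so it is always undefined there.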
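To make the distinction concrete, here is a minimal sketch using the first record from input.txt held in memory. The variable $atom is only an illustrative name for what the comparison was probably meant to use.

```perl
use strict;
use warnings;

# First record of input.txt, already split into fields.
my @columns = ('1', 'N', '22', 'H1', 'MET');

# Assigning a scalar to an array creates a one-element array.
my @name = $columns[3];

# A scalar called $name is a completely separate variable from @name.
# Declaring it with "my" (as the question does inside the "if")
# gives a fresh, undefined value.
my $name;

print "array holds:  $name[0]\n";
print "scalar holds: ", ( defined $name ? $name : 'undef' ), "\n";

# What was probably intended: a plain scalar copy of the field.
my $atom = $columns[3];
print "intended:     $atom\n";
```

With warnings enabled, comparing the undefined $name using eq would also emit a "Use of uninitialized value" warning, which is a useful hint that the wrong variable is being tested.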

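You are also executing the statement

$line =~ s/$name/$dictionary{$ref_nuc}/;

once for every key in the hash. That is unnecessary: it needs to be done only once. It is better to change the value of $columns[3] to $dictionary{$columns[3]} than to do a search and replace on the whole line, because the target string may also appear in other columns that you don't want to modify.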
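The risk of substituting on the whole line can be seen with a small sketch. The record and the C => CX mapping below are hypothetical, chosen only so that the atom name also occurs in another column. They are not taken from the sample files.

```perl
use strict;
use warnings;

# Hypothetical record and mapping (not from the sample data): the
# atom name "C" appears in column 2 as well as column 4.
my $line = join "\t", 3, 'C', 12, 'C', 'GLY';
my %dict = ( C => 'CX' );

# Whole-line substitution: the first match is in column 2,
# so the wrong field is changed.
( my $by_regex = $line ) =~ s/C/$dict{C}/;

# Field replacement: only the fourth column is touched.
my @fields = split /\t/, $line;
$fields[3] = $dict{ $fields[3] };
my $by_field = join "\t", @fields;

print "regex: $by_regex\n";
print "field: $by_field\n";
```

Here the regex version rewrites column 2 and leaves column 4 alone, while the field version changes only column 4.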

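It is very simple to do this by building a dictionary hash and replacing the fourth field of each line of the input file with its dictionary lookup.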

use strict;
use warnings;
use 5.010;
use autodie;

# Build the lookup table: name used in input.txt (column 3) => replacement (column 2)
open my $fh, '<', 'dictionary.txt';
my %dict;
while ( <$fh> ) {
  my ($k, $v) = (split)[2,1];
  $dict{$k} = $v;
}

# Replace the fourth field of every input line with its dictionary entry
open $fh, '<', 'input.txt';
while ( <$fh> ) {
  my @fields = split;
  $fields[3] = $dict{$fields[3]};
  say join "\t", @fields;
}

Output

1   N   22  H   MET
1   H   32  HB  MET
1   C   40  C   MET
2   H   35  3HB ASP
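The code above keys the dictionary on the atom name alone. The question mentions matching the residue name (MET, ASP) first, so if the same atom name could map differently in different residues, a two-level hash may be a better fit. The following is a sketch under that assumption, not a required change. It holds the sample data in memory so that it runs on its own, and it keeps the original atom name when no dictionary entry exists, which is a design choice of this sketch.

```perl
use strict;
use warnings;
use 5.010;

# In-memory copies of the sample files so the sketch is self-contained.
my @dictionary = ( "MET  H   H1", "MET  HB 2HB", "MET  C   CA", "ASP 3HB  HB3" );
my @input      = ( "1  N  22  H1  MET", "1  H  32 2HB  MET",
                   "1  C  40  CA  MET", "2  H  35  HB3 ASP" );

# Two-level hash: residue name, then old atom name => new atom name.
my %dict;
for (@dictionary) {
    my ( $res, $new, $old ) = split;
    $dict{$res}{$old} = $new;
}

my @output;
for (@input) {
    my @fields = split;
    my ( $atom, $res ) = @fields[ 3, 4 ];

    # Fall back to the original atom name when the dictionary has no
    # entry for this residue/atom pair (assumed behaviour).
    $fields[3] = $dict{$res}{$atom} // $atom;
    push @output, join "\t", @fields;
}

say for @output;
```

For the sample data this produces the same four lines as the simpler version. The difference only shows up when an atom name is shared between residues or is missing from the dictionary.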
