Perl: Search for Multiple Keywords with a Regex
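I am searching a file for a list of keywords. I can match whole keywords, but for some keywords I need to match only the first part of a word. For example: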
DES
AES
https:// --- here it should match the word starting with https:// but my code considers the whole word and skips it.
For example, with the keywords above, I want to match only DES, DES, and https:// from the following input:
DES some more words
DESTINY and more...
https://example.domain.com
http://anotherexample.domain.com # note that this line begins with http://, not https://
Here is what I have tried so far:
use warnings;
use strict;
open STDOUT, '>>', "my_stdout_file.txt";
#die qq[Usage: perl $0 <keyword-file> <search-file> <file-name>\n] unless @ARGV == 3;
my $filename = $ARGV[2];
chomp ($filename);
open my $fh, q[<], shift or die $!;   # Problem: all 3 arguments end up being opened; I need to open only 2.
my %keyword = map { chomp; $_ => 1 } <$fh>;
print "$fh\n";
while ( <> ) {
    chomp;
    my @words = split;
    for ( my $i = 0; $i <= $#words; $i++ ) {
        if ( $keyword{^$words[ $i ] } ) {
            print "Keyword Found for file:$filename\n";
            printf qq[$filename Line: %4d\tWord position: %4d\tKeyword: %s\n],
                $., $i, $words[ $i ];
        }
    }
}
close ($fh);
Here is a working solution for what I think you are trying to achieve. Let me know if it isn't:
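use warnings;
use strict;
use feature qw/ say /;

# Read the keywords from the __DATA__ section; only the first
# whitespace-separated token on each line is used as the keyword.
my %keywords;
while (<DATA>) {
    chomp;
    my ($key) = split;
    next unless defined $key;    # skip blank lines
    $keywords{$key} = length $key;
}

open my $in, '<', 'in.txt' or die "Cannot open in.txt: $!";
while (<$in>) {
    chomp;
    my $firstword = (split)[0];
    next unless defined $firstword;    # skip blank lines
    for my $key (keys %keywords) {
        # Anchor at the start of the word and quote the keyword so that
        # it is matched literally rather than as a pattern.
        if ($firstword =~ m/^\Q$key\E/) {
            my $word = substr($firstword, 0, $keywords{$key});
            say $word;
        }
    }
}
close $in;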
__DATA__
Keywords:-
DES
AES
https:// - here it should match the word starting with https:// but my code considers the whole word and skipping it.
The input file (in.txt) contains:
here are some words over multiple
lines
that may or
may not match your keywords:
DES DEA AES SSE
FOO https:
https://example.domain.com
This produces the output:
DES
https://
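The solution above looks only at the first word of each line, while the question loops over every word and reports its position. As a rough sketch of how the two could be combined, the keywords can be compiled into a single anchored alternation and tested against each word. The helper names build_prefix_regex and find_keywords are made up for this illustration, and the input lines are taken from the question rather than read from a file:

```perl
use strict;
use warnings;

# Build one anchored regex from a list of keywords (hypothetical helper).
# quotemeta keeps characters such as "." or "/" literal, and sorting
# longest-first lets a longer keyword win over a shorter one that is
# a prefix of it.
sub build_prefix_regex {
    my @keywords = sort { length($b) <=> length($a) } @_;
    my $alternation = join '|', map { quotemeta($_) } @keywords;
    return qr/^($alternation)/;
}

# Return [position, keyword] pairs for every word on the line that
# starts with one of the keywords (hypothetical helper).
sub find_keywords {
    my ($regex, $line) = @_;
    my @words = split ' ', $line;
    my @hits;
    for my $i (0 .. $#words) {
        push @hits, [ $i, $1 ] if $words[$i] =~ $regex;
    }
    return @hits;
}

my $regex = build_prefix_regex('DES', 'AES', 'https://');

# Sample input copied from the question.
my @lines = (
    'DES some more words',
    'DESTINY and more...',
    'https://example.domain.com',
    'http://anotherexample.domain.com',
);

my $line_no = 0;
for my $line (@lines) {
    $line_no++;
    for my $hit (find_keywords($regex, $line)) {
        printf "Line: %4d\tWord position: %4d\tKeyword: %s\n", $line_no, @$hit;
    }
}
```

With this input the sketch reports DES on lines 1 and 2 and https:// on line 3, and nothing for the http:// line. When reading from files instead, note that while (<>) reads every file name still left in @ARGV, so the keyword file and the label argument would have to be removed from @ARGV (for example with shift and pop) before that loop starts.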