Perl script to remove a line when the next line has a duplicate word
Input:
DFF_2 : dff_0_2 port map(READY_c => READY_c, CT0 =>CT0);
\DFF_0\ : dff_0 port map(un1_CT1 => un1_CT1, CT2 =>CT2);
DFF_10 : dff_0_10 port map(MRVQN1 => MRVQN1, un1_CT2_1 =>GSMC_un1_CT2_1);
DFF_1 : dff_0_1 port map(un1_CT2_1 =>GSMC_un1_CT2_1);
DFF_1 : dff_0_1 port map(un1_CT2_1 =>un1_CT2_1);
Expected output 1:
DFF_2 : dff_0_2 port map(READY_c => READY_c, CT0 =>CT0);
\DFF_0\ : dff_0 port map(un1_CT1 => un1_CT1, CT2 =>CT2);
DFF_10 : dff_0_10 port map(MRVQN1 => MRVQN1, un1_CT2_1 =>GSMC_un1_CT2_1);
DFF_1 : dff_0_1 port map(un1_CT2_1 =>un1_CT2_1);
Expected output 2 (order does not matter, but the updated line must be the one that is kept):
DFF_1 : dff_0_1 port map(un1_CT2_1 =>un1_CT2_1);
DFF_10 : dff_0_10 port map(MRVQN1 => MRVQN1, un1_CT2_1 =>GSMC_un1_CT2_1);
\DFF_0\ : dff_0 port map(un1_CT1 => un1_CT1, CT2 =>CT2);
DFF_2 : dff_0_2 port map(READY_c => READY_c, CT0 =>CT0);
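For this kind of case I cannot use an ordinary duplicate-line removal Perl script, because the string word8 has been updated with the new string word10. I tried reversing the contents and then removing the lines with a duplicate word, but my code does not achieve it.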
open (IN, "<input.txt") or die;
open (OUT, ">output.txt") or die;
my @reverse = reverse <IN>;
foreach (@reverse){
print OUT $_; }
close (IN);
close (OUT);
Output:
DFF_1 : dff_0_1 port map(un1_CT2_1 =>un1_CT2_1);
DFF_1 : dff_0_1 port map(un1_CT2_1 =>GSMC_un1_CT2_1);
DFF_10 : dff_0_10 port map(MRVQN1 => MRVQN1, un1_CT2_1 =>GSMC_un1_CT2_1);
\DFF_0\ : dff_0 port map(un1_CT1 => un1_CT1, CT2 =>CT2);
DFF_2 : dff_0_2 port map(READY_c => READY_c, CT0 =>CT0);
open (IN1, "<output.txt") or die;
open (OUT1, ">output1.txt") or die;
while (<IN1>){
my $save = "" if /(.+)\s\:/;
next if /$save\s/;
print OUT1 $_;}
close (IN1);
close (OUT1);
But it does not generate the output file as expected. Please help me.
Try this regex:
((line\d+)\s*:.*\n)
How it works:
( # Capture line to be removed
(line\d+) # Capture Line Name / Number (Group #2)
\s* # Optional Whitespace
: # : (Colon)
.* # Line Data
\n # Newline Character at end of Line
)
# Next line starts with this Line Name (stored in Group #2)
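The last step in the breakdown, "next line starts with this line name", can be expressed as a lookahead with a backreference. The following is only a minimal sketch under stated assumptions, not necessarily the pattern the answer intended: it assumes names are whitespace-free tokens, so \S+ stands in for line\d+ (which would not match names such as DFF_1), it uses a single capture group, and drop_superseded is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the original answer): delete
# every line whose immediate successor starts with the same name.
sub drop_superseded {
    my ($text) = @_;
    # Group 1 captures the name before the colon. The lookahead requires the
    # next line to begin with that same name, optional blanks, and a colon,
    # so DFF_1 does not count as a duplicate of DFF_10.
    $text =~ s/^(\S+)[ \t]*:.*\n(?=\1[ \t]*:)//mg;
    return $text;
}

my $input = join '',
    "DFF_10 : dff_0_10 port map(MRVQN1 => MRVQN1, un1_CT2_1 =>GSMC_un1_CT2_1);\n",
    "DFF_1 : dff_0_1 port map(un1_CT2_1 =>GSMC_un1_CT2_1);\n",
    "DFF_1 : dff_0_1 port map(un1_CT2_1 =>un1_CT2_1);\n";

print drop_superseded($input);
```

With this input the sketch prints the DFF_10 line followed by the last DFF_1 line. It only removes a line when the duplicate name is on the very next line, and it needs the whole file in one string.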
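Do it with a hash.
While iterating over the lines, split each line at the first " :" using the pattern ^.+?\K\s: where
^ anchors the match at the beginning of the line,
+? makes .+ non-greedy, so it stops at the first " :",
\K keeps the name matched so far out of the separator, so split does not remove it.
Then store the two pieces in $first and $second. Use the $first value as the hash key: a later line with the same name overwrites the earlier one, which removes the duplicate. Finally, once the unique values are stored in %hash, format the entries for output using map.
use strict;
use warnings;

open my $fh, "<", "one.txt" or die "Cannot open one.txt: $!";
my %hash;
while (<$fh>)
{
    # Split once at the first " :"; \K keeps the name out of the separator.
    my ($first, $second) = split(/^.+?\K\s:/);
    next unless defined $second;    # skip lines without a " :" separator
    # A later line with the same name overwrites the earlier one.
    $hash{$first} = " :$second";
}
close $fh;
# Join each name with its stored remainder (hash order is not preserved).
my @ar = map { $_ . $hash{$_} } keys %hash;
print @ar;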
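The hash approach above yields expected output 2, since Perl does not preserve hash key order. If expected output 1 is wanted, one option is to remember the order in which names first appear. The following is a minimal sketch under that assumption; dedupe_keep_last, %latest and @order are hypothetical names introduced here, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the original answer): keep the
# last line seen for each name, emitted where the name first appeared.
sub dedupe_keep_last {
    my @lines = @_;
    my (%latest, @order);
    for my $line (@lines) {
        # The name is everything before the first whitespace-plus-colon.
        my ($name) = $line =~ /^(.+?)\s:/;
        next unless defined $name;    # lines without a "name :" prefix are skipped
        push @order, $name unless exists $latest{$name};
        $latest{$name} = $line;       # a later line replaces the earlier one
    }
    return map { $latest{$_} } @order;
}

my $text = <<'END';
DFF_2 : dff_0_2 port map(READY_c => READY_c, CT0 =>CT0);
\DFF_0\ : dff_0 port map(un1_CT1 => un1_CT1, CT2 =>CT2);
DFF_10 : dff_0_10 port map(MRVQN1 => MRVQN1, un1_CT2_1 =>GSMC_un1_CT2_1);
DFF_1 : dff_0_1 port map(un1_CT2_1 =>GSMC_un1_CT2_1);
DFF_1 : dff_0_1 port map(un1_CT2_1 =>un1_CT2_1);
END

# split /^/m breaks the text into lines and keeps each trailing newline.
print dedupe_keep_last(split /^/m, $text);
```

Unlike the regex approach, this also handles duplicates that are not on adjacent lines, at the cost of holding one line per name in memory.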