Perl: finding the common lines between text files in a directory

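I found code that reads all the text files in a directory, but I don't know how to find the lines they have in common. Please help me write the code, or point me to the areas I need to explore further. There is a lot to learn and time is short.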

use strict;
use warnings;
use English;

# Forward slashes also work on Windows and avoid backslash quoting issues in glob
my $dir = 'C:/Perl_Example/Data';

foreach my $fp (glob("$dir/*.txt"))
{
  # Print the file name as a header
  printf "%s\n", $fp;

  # Open each file in the directory for reading
  open my $fh, "<", $fp or die "can't open '$fp' for reading: $OS_ERROR";

  while (<$fh>)
  {
    # Print the file content, indented under the header
    printf "  %s", $_;
  }

  close $fh or die "can't close '$fp': $OS_ERROR";
}

Here is one way to "find what they have in common". Of course, in Perl there is more than one way to do it :)

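#!/usr/bin/perl
use strict;
use warnings;

# Map each line to the list of files it occurs in
my %h;

for my $file (@ARGV) {
    open(my $fh, '<', $file) or die "$file: $!\n";
    my %seen;    # lines already recorded for this file
    while (<$fh>) {
        chomp;
        # Record the file once per line, even if the line repeats within it
        push @{$h{$_}}, $file unless $seen{$_}++;
    }
    close $fh;
}

# Report every line that occurs in more than one file
for (sort keys %h) {
    if (@{$h{$_}} > 1) {
        print "line <$_>\n";
        print "  occurs in ", join(", ", @{$h{$_}}), "\n";
    }
}

exit 0;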

Now the test files, named {1,2,3}:

% cat 1
PING YA.RU (87.250.250.242): 56 data bytes
64 bytes from 87.250.250.242: icmp_seq=0 ttl=249 time=14.615 ms
64 bytes from 87.250.250.242: icmp_seq=1 ttl=249 time=14.943 ms
64 bytes from 87.250.250.242: icmp_seq=2 ttl=249 time=14.381 ms
64 bytes from 87.250.250.242: icmp_seq=3 ttl=249 time=14.852 ms
64 bytes from 87.250.250.242: icmp_seq=4 ttl=249 time=14.791 ms
% cat 2
PING YA.RU (87.250.250.242): 56 data bytes
64 bytes from 87.250.250.242: icmp_seq=0 ttl=249 time=14.615 ms
64 bytes from 87.250.250.242: icmp_seq=3 ttl=249 time=14.852 ms
64 bytes from 87.250.250.242: icmp_seq=4 ttl=249 time=14.791 ms
% cat 3
64 bytes from 87.250.250.242: icmp_seq=3 ttl=249 time=14.852 ms
64 bytes from 87.250.250.242: icmp_seq=4 ttl=249 time=14.791 ms

And a test run of the script:

% ./try.pl 1 2 3
line <64 bytes from 87.250.250.242: icmp_seq=0 ttl=249 time=14.615 ms>
  occurs in 1, 2
line <64 bytes from 87.250.250.242: icmp_seq=3 ttl=249 time=14.852 ms>
  occurs in 1, 2, 3
line <64 bytes from 87.250.250.242: icmp_seq=4 ttl=249 time=14.791 ms>
  occurs in 1, 2, 3
line <PING YA.RU (87.250.250.242): 56 data bytes>
  occurs in 1, 2
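
If "common" should instead mean lines that appear in every file, and the files should come from a directory as in the question rather than from the command line, a minimal sketch might look like the following. The helper name lines_in_all_files and the default directory '.' are assumptions made for illustration, and the directory path is assumed to contain no spaces, since glob splits its pattern on whitespace.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the lines that occur in every one of the given files.
sub lines_in_all_files {
    my @files = @_;
    my %count;    # line => number of distinct files that contain it

    for my $file (@files) {
        open(my $fh, '<', $file) or die "can't open '$file': $!\n";
        my %seen;    # lines already counted for this file
        while (my $line = <$fh>) {
            chomp $line;
            # Count each line at most once per file
            $count{$line}++ unless $seen{$line}++;
        }
        close $fh;
    }

    # Keep only the lines whose file count equals the number of files
    my @common = grep { $count{$_} == @files } sort keys %count;
    return @common;
}

# Directory comes from the command line; falling back to '.' is an assumption.
my $dir   = @ARGV ? $ARGV[0] : '.';
my @files = glob("$dir/*.txt");

print "$_\n" for lines_in_all_files(@files);
```

Counting distinct files per line keeps memory proportional to the number of unique lines. If the three sample files above carried a .txt extension, only the icmp_seq=3 and icmp_seq=4 lines would qualify, since those are the only lines present in all three.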