Combination of two columns from text file and finding count with reference to third column in Perl

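I am trying to read the file in.txt with Perl and generate an output file, out.txt. I tried using hashes, but I am not getting the exact output.

Is there a way to do this in Perl?

The goal is to combine two columns (Account and Template) and count the rows for each combination based on the third column (Active).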

in.txt

Template,Account,Active
123456,123,N
123456,456,Y
321478,456,Y
123456,123,N
321478,456,Y

out.txt

Account,Template,Active,NotActive
123,123456,0,2
456,321478,2,0
456,123456,1,0

This is not a Perl solution, but it works with awk:

AWK one-liner:

awk 'BEGIN{FS=OFS=",";print "Account,Template,Active,NotActive"}NR>1{if($3=="Y"){a[$2 FS $1]++}else{b[$2 FS $1]++}}END{for(i in a){print i OFS a[i] OFS b[i]+0}for(u in b){if(b[u] && !a[u]){print u OFS a[u]+0 OFS b[u]}}}' input_file | sort -n

AWK script:

# BEGIN rule(s)

BEGIN {
        FS = OFS = "," #defines input/output field separator as ,
        print "Account,Template,Active,NotActive" #print the header
}

# Rule(s)

NR > 1 { # from the 2nd line of the file (skip the header)
        if ($3 == "Y") { # if the 3rd field is Y
                a[$2 FS $1]++ # increment the active count, indexed by "Account,Template"
        } else {
                b[$2 FS $1]++ # otherwise increment the not-active count for the same key
        }
}

# END rule(s)

END {
        for (i in a) { # loop on all values of the arrays and print the content
                print i OFS a[i] OFS (b[i] + 0)
        }
        for (u in b) {
                if (b[u] && ! a[u]) { # do the same with the nonactive array and avoid double printing
                        print u OFS (a[u] + 0) OFS b[u]
                }
        }
} #pipe the output to a numerical sort to perform the proper ordering of the output
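
The script can also be saved to a file and run with `awk -f`, with the result piped to `sort` as the closing comment describes. The following is only a sketch under a few assumptions: `count.awk` is a hypothetical file name, the rules are a compacted copy of the script above, the data is the sample `in.txt` from the question, and explicit comma-separated sort keys are used in place of the plain `sort -n` so that the ordering does not depend on how the locale treats commas inside numbers.

```shell
# Recreate the sample data from the question.
cat > in.txt <<'EOF'
Template,Account,Active
123456,123,N
123456,456,Y
321478,456,Y
123456,123,N
321478,456,Y
EOF

# count.awk is a hypothetical file name; the rules mirror the script above.
cat > count.awk <<'EOF'
BEGIN { FS = OFS = ","; print "Account,Template,Active,NotActive" }
NR > 1 { if ($3 == "Y") a[$2 FS $1]++; else b[$2 FS $1]++ }
END {
    # keys that have at least one active row
    for (i in a) print i, a[i], b[i] + 0
    # keys that only have not-active rows
    for (u in b) if (!(u in a)) print u, 0, b[u]
}
EOF

# Sort numerically by Account, then by Template. The header's first field
# is non-numeric, so it compares as 0 and stays on the first line.
awk -f count.awk in.txt | sort -t, -k1,1n -k2,2n > out.txt
cat out.txt
```

Keeping the rules in a file makes them easier to edit than the one-liner, and the sort step stays outside awk because the `for (i in a)` loop visits keys in an unspecified order.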

Demo:

Input:

$ cat input_file 
Template,Account,Active
123456,123,N
123456,456,Y
321478,456,Y
123456,123,N
321478,456,Y
123457,125,N
123457,125,Y

Output:

Account,Template,Active,NotActive
123,123456,0,2
125,123457,1,1
456,123456,1,0
456,321478,2,0

Perl attempt using two hashes:

my $filename = 'input.txt';
my %yhash;
my %nhash;
if (open(my $ifh, '<:encoding(UTF-8)', $filename)) {
    while (my $row = <$ifh>) {
    next if ($row =~ /^#/m);
    chomp $row;
    my @values = split(',',$row);
    my $value = join '',@values ;
    my $lastchar = substr $value , -1;
    my $firstval = substr $value ,0,9;
    if ($lastchar eq "N"){
              $nhash{$firstval}++;   # ++ creates the key on first use, so no exists() check is needed
    }elsif($lastchar eq "Y"){
              $yhash{$firstval}++;
    }else{
             print "nothin\n";

    }
    }
    close $ifh;
    } else {
    warn "Could not open file '$filename' $!";
    }


   open(FH, '>', 'out.txt') or die $!;
   print FH "Account,Template,Active,NotActive\n";
   while (my ($key, $value) = each(%nhash)) {
    my $account = substr $key ,6,3;
    my $template = substr $key ,0,6;
    my $active = "0";
    my $notactive = "$value";
    print FH "$account,$template,$active,$notactive \n";
   }
  while (my ($key, $value) = each(%yhash)) {
    my $account = substr $key ,6,3;
    my $template = substr $key ,0,6;
    my $active = "$value";
    my $notactive = "0";
    print FH "$account,$template,$active,$notactive \n";
  }
  close (FH);

This also works:

use strict;
use warnings;

my %data;
open my $fh, "<", "in.txt" or die $!;
while (my $line = <$fh>) {
    chomp $line;
    next if($line =~ /Account/);
    my @line = split ',', $line;
    $data{$line[1]}{$line[0]}{'Y'} = 0 if(!defined $data{$line[1]}{$line[0]}{'Y'});
    $data{$line[1]}{$line[0]}{'N'} = 0 if(!defined $data{$line[1]}{$line[0]}{'N'});
    $data{$line[1]}{$line[0]}{$line[2]} ++;
}
close $fh;
open my $FH, ">", "out.txt" or die $!;
    print $FH "Account,Template,Active,NotActive\n";
    foreach my $key (sort keys %data) {
        foreach my $key2 (sort keys %{$data{$key}}) {
            print $FH "$key,$key2,$data{$key}{$key2}{'Y'},$data{$key}{$key2}{'N'}\n";
        }
    }
close $FH;

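You can also replace these two lines

$data{$line[1]}{$line[0]}{'Y'} = 0 if(!defined $data{$line[1]}{$line[0]}{'Y'});
$data{$line[1]}{$line[0]}{'N'} = 0 if(!defined $data{$line[1]}{$line[0]}{'N'});

with this single line, which uses the defined-or assignment operator (available since Perl 5.10):
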
$data{$line[1]}{$line[0]}{$_} //= 0 foreach ('Y', 'N');