Combination of two columns from text file and finding count with reference to third column in Perl

Question

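I am trying to read the file in.txt with Perl and generate the output file out.txt. I tried using a hash, but I did not get the correct output.

Is there a way to do this in Perl?

I want to combine the first two columns (Template and Account) and, for each combination, count the rows by the value of the third column (Active: Y or N).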

in.txt

Template,Account,Active
123456,123,N
123456,456,Y
321478,456,Y
123456,123,N
321478,456,Y

out.txt

Account,Template,Active,NotActive
123,123456,0,2
456,321478,2,0
456,123456,1,0

Answer 1

This is not a Perl solution, but it works with awk:

AWK one-liner:

awk 'BEGIN{FS=OFS=",";print "Account,Template,Active,NotActive"}NR>1{if($3=="Y"){a[$2 FS $1]++}else{b[$2 FS $1]++}}END{for(i in a){print i OFS a[i] OFS b[i]+0}for(u in b){if(b[u] && !a[u]){print u OFS a[u]+0 OFS b[u]}}}' input_file | sort -n

AWK script:

# BEGIN rule(s)

BEGIN {
        FS = OFS = "," #defines input/output field separator as ,
        print "Account,Template,Active,NotActive" #print the header
}

# Rule(s)

NR > 1 { # from the 2nd line of the file
        if ($3 == "Y") { # if the 3rd field is Y
                a[$2 FS $1]++ # increment the array indexed by $2 FS $1 (Account,Template)
        } else {
                b[$2 FS $1]++ # do the same for N with the other array
        }
}

# END rule(s)

END {
        for (i in a) { # loop on all values of the arrays and print the content
                print i OFS a[i] OFS (b[i] + 0)
        }
        for (u in b) {
                if (b[u] && ! a[u]) { # do the same with the nonactive array and avoid double printing
                        print u OFS (a[u] + 0) OFS b[u]
                }
        }
} #pipe the output to a numerical sort to perform the proper ordering of the output

Demo:

Input:

$ cat input_file 
Template,Account,Active
123456,123,N
123456,456,Y
321478,456,Y
123456,123,N
321478,456,Y
123457,125,N
123457,125,Y

Output:

Account,Template,Active,NotActive
123,123456,0,2
125,123457,1,1
456,123456,1,0
456,321478,2,0
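
Since the question asked for Perl, the same two-counter logic can be written roughly as follows. This is only a minimal sketch, assuming comma-separated input with exactly one header line; the sub name `count_pairs` and the inline sample rows are illustrative and not part of the original answer. As in the awk script, any value other than Y is counted as not active.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the input lines (header first) and returns
# the report lines, mirroring the two arrays a and b of the awk script.
sub count_pairs {
    my @lines = @_;
    my (%active, %inactive);
    shift @lines;                            # drop the header, like NR > 1
    for my $line (@lines) {
        chomp $line;
        my ($template, $account, $flag) = split /,/, $line;
        my $key = "$account,$template";      # same order as the output columns
        if ($flag eq 'Y') { $active{$key}++ } else { $inactive{$key}++ }
    }
    # Union of both key sets, so a pair with Y and N rows is printed once.
    my %keys = map { $_ => 1 } keys(%active), keys(%inactive);
    my @out = ('Account,Template,Active,NotActive');
    for my $key (sort keys %keys) {          # string sort, not numeric
        push @out, join ',', $key, $active{$key} // 0, $inactive{$key} // 0;
    }
    return @out;
}

# Sample rows taken from the demo input above.
my @sample = (
    "Template,Account,Active\n",
    "123456,123,N\n", "123456,456,Y\n", "321478,456,Y\n",
    "123456,123,N\n", "321478,456,Y\n",
    "123457,125,N\n", "123457,125,Y\n",
);
print "$_\n" for count_pairs(@sample);
```

The `//` operator needs Perl 5.10 or later. The plain string sort happens to give the same order as `sort -n` here because all IDs have the same width; IDs of mixed width would need a numeric comparison.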

Answer 2

use strict;
use warnings;

my $filename = 'input.txt';
my %yhash;    # "Template,Account" => number of rows with Active = Y
my %nhash;    # "Template,Account" => number of rows with Active = N
if (open(my $ifh, '<:encoding(UTF-8)', $filename)) {
    while (my $row = <$ifh>) {
        next if $. == 1;                           # skip the header line
        next if $row =~ /^#/ or $row !~ /\S/;      # skip comments and blank lines
        chomp $row;
        # Split on the delimiter instead of relying on fixed-width substr offsets
        my ($template, $account, $active) = split /,/, $row;
        my $key = "$template,$account";
        if ($active eq 'N') {
            $nhash{$key}++;
        } elsif ($active eq 'Y') {
            $yhash{$key}++;
        } else {
            warn "Skipping row with unexpected Active value: $row\n";
        }
    }
    close $ifh;
} else {
    warn "Could not open file '$filename' $!";
}

open(my $ofh, '>', 'out.txt') or die $!;
print $ofh "Account,Template,Active,NotActive\n";
# Walk the union of both hashes so a pair with Y and N rows gets a single line
my %seen = map { $_ => 1 } keys(%yhash), keys(%nhash);
for my $key (sort keys %seen) {
    my ($template, $account) = split /,/, $key;
    my $active    = $yhash{$key} // 0;
    my $notactive = $nhash{$key} // 0;
    print $ofh "$account,$template,$active,$notactive\n";
}
close($ofh);

Answer 3

This also works:

use strict;
use warnings;

my %data;
open my $fh, "<", "in.txt" or die $!;
while (my $line = <$fh>) {
    chomp $line;
    next if($line =~ /Account/);
    my @line = split ',', $line;
    $data{$line[1]}{$line[0]}{'Y'} = 0 if(!defined $data{$line[1]}{$line[0]}{'Y'});
    $data{$line[1]}{$line[0]}{'N'} = 0 if(!defined $data{$line[1]}{$line[0]}{'N'});
    $data{$line[1]}{$line[0]}{$line[2]} ++;
}
close $fh;
open my $FH, ">", "out.txt" or die $!;
    print $FH "Account,Template,Active,NotActive\n";
    foreach my $key (sort keys %data) {
        foreach my $key2 (sort keys %{$data{$key}}) {
            print $FH "$key,$key2,$data{$key}{$key2}{'Y'},$data{$key}{$key2}{'N'}\n";
        }
    }
close $FH;

You can also replace these two lines

$data{$line[1]}{$line[0]}{'Y'} = 0 if(!defined $data{$line[1]}{$line[0]}{'Y'});
$data{$line[1]}{$line[0]}{'N'} = 0 if(!defined $data{$line[1]}{$line[0]}{'N'});

with

$data{$line[1]}{$line[0]}{$_} //= 0 foreach ('Y', 'N');
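
Either form of the default is needed because `++` only creates the counter that actually occurs in a row; for a pair that has only Y rows, the N slot would stay undefined and print as an empty field (with a warning under `use warnings`). A small sketch of the difference, using one made-up row and assuming Perl 5.10 or later for `//=`:

```perl
use strict;
use warnings;

my %data;
my @line = ('123456', '456', 'Y');    # made-up row: Template, Account, Active

# Without defaults, only the 'Y' counter springs into existence.
$data{$line[1]}{$line[0]}{$line[2]}++;
print defined $data{456}{123456}{N} ? "N defined\n" : "N undefined\n";

# //= assigns 0 only where the slot is still undefined,
# so the existing Y count is left untouched.
$data{$line[1]}{$line[0]}{$_} //= 0 foreach ('Y', 'N');
print "Y=$data{456}{123456}{Y} N=$data{456}{123456}{N}\n";
```

The order does not matter: setting the defaults before or after the increment gives the same final counts.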

regex

perl

hash-of-hashes