Combination of two columns from text file and finding count with reference to third column in Perl
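I am trying to read the file in.txt with Perl and generate the output file out.txt. I tried using hashes, but did not get the exact output.
Is there a way to do this in Perl?
I need to combine two columns and count the rows for each combination based on the third column.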
in.txt
Template,Account,Active
123456,123,N
123456,456,Y
321478,456,Y
123456,123,N
321478,456,Y
out.txt
Account,Template,Active,NotActive
123,123456,0,2
456,321478,2,0
456,123456,1,0
This is not a Perl solution, but it works with awk:
AWK one-liner:
awk 'BEGIN{FS=OFS=",";print "Account,Template,Active,NotActive"}NR>1{if($3=="Y"){a[$2 FS $1]++}else{b[$2 FS $1]++}}END{for(i in a){print i OFS a[i] OFS b[i]+0}for(u in b){if(b[u] && !a[u]){print u OFS a[u]+0 OFS b[u]}}}' input_file | sort -n
AWK script:
# BEGIN rule(s)
BEGIN {
    FS = OFS = ","    # set the input/output field separator to a comma
    print "Account,Template,Active,NotActive"    # print the header
}
# Rule(s)
NR > 1 {    # from the 2nd line of the file onward
    if ($3 == "Y") {    # if the 3rd field is Y
        a[$2 FS $1]++    # increment the "active" counter keyed by Account,Template
    } else {
        b[$2 FS $1]++    # otherwise do the same in the "not active" array
    }
}
# END rule(s)
END {
    for (i in a) {    # print every key that has at least one active record
        print i OFS a[i] OFS (b[i] + 0)
    }
    for (u in b) {
        if (b[u] && ! a[u]) {    # print keys that only have non-active records, avoiding double printing
            print u OFS (a[u] + 0) OFS b[u]
        }
    }
}
# pipe the output to a numerical sort (sort -n) to get the proper ordering
Demo:
Input:
$ cat input_file
Template,Account,Active
123456,123,N
123456,456,Y
321478,456,Y
123456,123,N
321478,456,Y
123457,125,N
123457,125,Y
Output:
Account,Template,Active,NotActive
123,123456,0,2
125,123457,1,1
456,123456,1,0
456,321478,2,0
# Perl version using one hash per status (Y and N)
use strict;
use warnings;

my $filename = 'in.txt';
my (%yhash, %nhash, %seen);

open(my $ifh, '<:encoding(UTF-8)', $filename)
    or die "Could not open file '$filename': $!";
while (my $row = <$ifh>) {
    next if $. == 1 || $row =~ /^#/;    # skip the header line and comment lines
    chomp $row;
    my ($template, $account, $active) = split /,/, $row;
    next unless defined $active;        # skip blank or malformed lines
    my $key = "$account,$template";     # output order is Account,Template
    $seen{$key} = 1;
    if    ($active eq 'N') { $nhash{$key}++ }
    elsif ($active eq 'Y') { $yhash{$key}++ }
    else                   { warn "Unexpected Active value on line $.: $row\n" }
}
close $ifh;

open(my $ofh, '>', 'out.txt') or die "Could not open file 'out.txt': $!";
print $ofh "Account,Template,Active,NotActive\n";
# One row per Account,Template pair, even when it has both Y and N records
for my $key (sort keys %seen) {
    my $active    = $yhash{$key} // 0;
    my $notactive = $nhash{$key} // 0;
    print $ofh "$key,$active,$notactive\n";
}
close $ofh;
This also works:
use strict;
use warnings;
my %data;
open my $fh, "<", "in.txt" or die $!;
while (my $line = <$fh>) {
    chomp $line;
    next if($line =~ /Account/);
    my @line = split ',', $line;
    $data{$line[1]}{$line[0]}{'Y'} = 0 if(!defined $data{$line[1]}{$line[0]}{'Y'});
    $data{$line[1]}{$line[0]}{'N'} = 0 if(!defined $data{$line[1]}{$line[0]}{'N'});
    $data{$line[1]}{$line[0]}{$line[2]} ++;
}
close $fh;
open my $FH, ">", "out.txt" or die $!;
print $FH "Account,Template,Active,NotActive\n";
foreach my $key (sort keys %data) {
    foreach my $key2 (sort keys %{$data{$key}}) {
        print $FH "$key,$key2,$data{$key}{$key2}{'Y'},$data{$key}{$key2}{'N'}\n";
    }
}
close $FH;
You can also replace these two lines
$data{$line[1]}{$line[0]}{'Y'} = 0 if(!defined $data{$line[1]}{$line[0]}{'Y'});
$data{$line[1]}{$line[0]}{'N'} = 0 if(!defined $data{$line[1]}{$line[0]}{'N'});
with
$data{$line[1]}{$line[0]}{$_} //= 0 foreach ('Y', 'N');