Output differs between the terminal window and the txt file?

Question

I am trying to merge two files with Perl.

My code so far:

 my $hash_ref;
 open (my $I_fh, "<", "File1.txt") or die $!;

 my $line = <$I_fh>;
 while ($line = <$I_fh>) {
     chomp $line;
     my @cols = split ("\t", $line);
     my $key = $cols[1];
     $hash_ref -> {$key} = \@cols;
 }
 close $I_fh;

 open (my $O_fh, "<", "File2.txt") or die $!;
 while ($line = <$O_fh>) {
     chomp $line;
     my @cols = split ("\t", $line);
     my $key = shift (@cols);
     push (@{$hash_ref -> {$key}}, @cols);
 }
 close $O_fh;

 open (my $out, ">", "merged.txt") or die $!;

 foreach my $key (sort keys %$hash_ref) {
     my $row = join ("\t", @{$hash_ref -> {$key}});
     print $out "$key\t$row\n";
 }
 close $out;

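I have been using print and Dumper to check each step. In the terminal window everything looks fine. In my output file (merged.txt), however, the format is different. I want to merge the two files by adding more columns, not by adding more rows. How can I fix the code?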

  File 1.txt:  
  Index    Name    Column1   Column2  
   1        A1                  AB      
   2        A2                  CD   
   3        B1                  EF    
   4        B2                  GH   


    File 2.txt:   
    Name  Type  
     A1     1  
     A2     1   
     B1     2   
     B2     1    

   Merged file:  

   A1   1   AB    
        1     
   A2   2   CD  
        1      
   B1   3   EF  
        2      
   B2   4   GH   
        1      

Wanted file:  
Name  Type  Column2  

  A1   1   AB    
  A2   1   CD   
  B1   2   EF   
  B2   1   GH

Answer 1

Assuming the files are sorted on the Name column, this is really easy to do with the join(1) program:

$ join --header -t $'\t' -o 2.1,2.2,1.4 -1 2 -2 1 file1.tsv file2.tsv
Name    Type    Column2
A1  1   AB
A2  1   CD
B1  2   EF
B2  1   GH

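The --header option is a GNU extension that leaves the first line of each file out of the join and treats it as column headers instead. -t sets the column separator, -o controls which columns are included in the output (a list of FILE.COLUMN specifiers), and -1 and -2 select the column to join on in each of the two files.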

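If they are not sorted, or if you are set on using Perl, your code is already very close. Apart from the typos and such, you are printing out every column rather than just the ones your wanted output suggests you care about. Consider: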

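#!/usr/bin/perl
use warnings;
use strict;
use feature qw/say/;
use autodie;

# Maps each Name to a list of rows (array refs), in the order they were read.
my %names;

# Read a tab-separated file and store each row under the name in column $idx.
sub read_file {
  my ($file, $idx) = @_;
  open my $in, "<", $file;
  my $header = <$in>; # Skip the header line
  while (<$in>) {
    chomp;
    my @F = split /\t/;
    push @{$names{$F[$idx]}}, \@F;
  }
}

read_file "file1.tsv", 1; # Name is the second column
read_file "file2.tsv", 0; # Name is the first column

say "Name\tType\tColumn2";
for my $n (sort keys %names) {
  my $row = $names{$n};
  # $row->[0] is the file1 row, $row->[1] is the file2 row
  say "$n\t$row->[1][1]\t$row->[0][3]";
}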
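
To make the indexing in the last say line easier to follow, here is a small sketch of what %names holds for a single name. The rows are made-up in-memory data standing in for the two files, and the sketch assumes, as the code above does, that every name appears once in each file and that file 1 is read first:

```perl
use warnings;
use strict;

# Made-up rows standing in for the A1 line of each file.
my %names;
push @{ $names{A1} }, [ 1, "A1", "", "AB" ];  # row from file 1 (read first)
push @{ $names{A1} }, [ "A1", 1 ];            # row from file 2 (read second)

my $row     = $names{A1};
my $type    = $row->[1][1];  # second row stored (file 2), field 1: Type
my $column2 = $row->[0][3];  # first row stored (file 1), field 3: Column2

print "A1\t$type\t$column2\n";
```

If a name were missing from one of the files, these lookups would hit undef, so real data may need a check before printing.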

I also suspect that your odd output comes from running your program on data files with Windows-style line endings while your OS uses Unix-style line endings.
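
If that is what is happening, the reason is that chomp removes only the trailing "\n" (the default value of $/), so every line keeps a stray "\r" at the end of its last field. A minimal sketch of the difference, using a made-up sample line and a hypothetical helper named strip_eol (neither is from the question):

```perl
use warnings;
use strict;

# Hypothetical helper: remove a trailing LF or CRLF from a line.
sub strip_eol {
  my ($line) = @_;
  $line =~ s/\r?\n\z//;
  return $line;
}

# Made-up sample line with a Windows-style (CRLF) ending.
my $raw = "1\tA1\t\tAB\r\n";

# chomp removes only "\n", so the last field still ends in "\r".
my $chomped = $raw;
chomp $chomped;
my @with_cr = split /\t/, $chomped;

# Stripping the "\r" as well leaves clean fields.
my @clean = split /\t/, strip_eol($raw);

printf "chomp only: last field has %d characters\n", length $with_cr[-1];
printf "strip_eol:  last field has %d characters\n", length $clean[-1];
```

With chomp alone the last field is "AB\r" (3 characters); with the carriage return stripped it is "AB" (2 characters). Replacing chomp in the reading loops with a substitution like this one, or converting the data files to Unix line endings beforehand, should keep the "\r" out of the merged rows.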

Tags: perl, hashref, perl-hash