Unexpected output from Perl Dumper

Question

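I was trying to assign one hash to another when I ran into this unexpected behavior.

I print the Dumper output to check that the hash is built correctly.

Data::Dumper gives the expected output when I iterate over the hash, but it shows something unexpected when I print the whole hash.

Please see the snippets below. Any insight would be a great help.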

use Data::Dumper;

my (@aBugs) = (111,222,333);
my $phBugsRec;
my $phProfiles;
$phProfiles->{profiles} = { 'profile1' => 'default1' };

Building the final hash:

foreach my $pBugNo(@aBugs){
    $phBugsRec->{bugAttributes}{$pBugNo}{totalEffort} = 0;
    $phBugsRec->{bugAttributes}{$pBugNo}{profiles}    = $phProfiles->{profiles};
}

If I dump the whole hash, I do not get the output I expected:

print '<pre>'.Dumper($phBugsRec).'</pre>';

$VAR1 = {
    'bugAttributes' => {
        '333' => {
            'totalEffort' => 0,
            'profiles' => {
                'profile1' => 'default1'
            }
        },
        '111' => {
            'totalEffort' => 0,
            'profiles' => $VAR1->{'bugAttributes'}{'333'}{'profiles'}
        },
        '222' => {
            'totalEffort' => 0,
            'profiles' => $VAR1->{'bugAttributes'}{'333'}{'profiles'}
        }
    }
};

But when I iterate over the hash, I get the expected output:

foreach (sort keys %{ $phBugsRec->{bugAttributes} }){
    print '<pre>'.$_.':'.Dumper($phBugsRec->{bugAttributes}{$_}).'</pre>';
}

111:$VAR1 = {
  'totalEffort' => 0,
  'profiles' => {
    'profile1' => 'default1'
  }
};
222:$VAR1 = {
  'totalEffort' => 0,
  'profiles' => {
    'profile1' => 'default1'
  }
};
333:$VAR1 = {
  'totalEffort' => 0,
  'profiles' => {
    'profile1' => 'default1'
  }
};

Answer 1

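As already noted, this is not wrong, though I agree it can be unexpected. It happens because you use the same reference several times in your data structure, which follows from how Perl's references work.

A short overview of references in Perl

References are explained in perlref, perlreftut, perldsc and perllol.

Whenever you have a multi-level data structure in Perl, every level after the first is stored as a reference. The -> operator dereferences them, and Perl turns them back into hashes or arrays. When you write $foo->{bar}->{baz} to get at the inner value, you are essentially walking the data structure.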
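To make the walk concrete, here is a minimal sketch. The variable names and the value 123 are illustrative only. It inspects each level with ref and reaches the inner value with and without the optional arrow between subscripts.

```perl
use strict;
use warnings;

# Hypothetical data, used only for illustration.
my $foo = { bar => { baz => 123 } };

# $foo holds a reference to a hash; its 'bar' value is again a hash reference.
my $outer_type = ref $foo;           # 'HASH'
my $inner_type = ref $foo->{bar};    # 'HASH'

# The arrow between adjacent subscripts is optional, so both forms
# walk the same path and reach the same value.
my $long  = $foo->{bar}->{baz};
my $short = $foo->{bar}{baz};

print "$outer_type $inner_type $long $short\n";
```

This should print "HASH HASH 123 123".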

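If you assign $foo->{bar}->{baz} = 123 directly, Perl creates all of those references for you automatically (this is called autovivification). But you can also create references yourself: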

my @numbers = (42, 23, 1337);
my $ref = \@numbers;

print Dumper $ref;

__END__
$VAR1 = [ 42, 23, 1337 ]

This is a single reference to that array. If you use it more than once in the same data structure, the dump will show that.

my $hash = {
    foo => $ref,
    bar => $ref,
};

print Dumper $hash;

__END__

$VAR1 = {
      'foo' => [
                 42,
                 23,
                 1337
               ],
      'bar' => $VAR1->{'foo'}
};

Looks just like your example, right? Let's try something else. If you print a reference as a string, Perl shows you its type and address.

print "$ref";

__END__
ARRAY(0x25df7b0)

We have all seen this, and we all thought something was badly broken the first time we saw it. Let's go back to the $hash from above.

say $hash->{foo};
say $hash->{bar};

__END__
ARRAY(0x16257b0)
ARRAY(0x16257b0)

As you can see, it is the same address, because it is the same data structure.
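A practical consequence of the shared address is that a change made through one key is visible through the other. The sketch below reuses @numbers, $ref and $hash from above and compares addresses with refaddr from the core Scalar::Util module; the pushed value 99 is arbitrary.

```perl
use strict;
use warnings;
use Scalar::Util 'refaddr';

my @numbers = (42, 23, 1337);
my $ref     = \@numbers;
my $hash    = { foo => $ref, bar => $ref };

# Both slots hold the very same reference, so their addresses match.
my $same = refaddr($hash->{foo}) == refaddr($hash->{bar}) ? 1 : 0;

# A change made through one key is therefore visible through the other,
# and in the original array as well.
push @{ $hash->{foo} }, 99;
my $bar_last = $hash->{bar}[-1];
my $count    = scalar @numbers;

# A shallow copy of the array breaks the sharing for that slot.
$hash->{bar} = [ @{ $hash->{bar} } ];
my $still_same = refaddr($hash->{foo}) == refaddr($hash->{bar}) ? 1 : 0;

print "same=$same last=$bar_last count=$count still_same=$still_same\n";
```

This should print "same=1 last=99 count=4 still_same=0". If each bug in the question is meant to have its own independent profiles hash, the same idea applies there: assigning { %{ $phProfiles->{profiles} } } instead of $phProfiles->{profiles} stores a shallow copy per bug, and Data::Dumper then has nothing to cross-reference.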

Other Perl serializers

This is what your data structure looks like with Data::Dump.

do {
  my $a = {
    bugAttributes => {
      111 => { profiles => { profile1 => "default1" }, totalEffort => 0 },
      222 => { profiles => 'fix', totalEffort => 0 },
      333 => { profiles => 'fix', totalEffort => 0 },
    },
  };
  $a->{bugAttributes}{222}{profiles} = $a->{bugAttributes}{111}{profiles};
  $a->{bugAttributes}{333}{profiles} = $a->{bugAttributes}{111}{profiles};
  $a;
}
1

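Data::Dump is meant to produce output that is human-readable and can be fed back into Perl. It is a bit more concise than Data::Dumper. You can see that it also shows values that are used more than once in your data structure.

This is what Data::Printer does with it.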

\ {
    bugAttributes   {
        111   {
            profiles      {
                profile1   "default1"
            },
            totalEffort   0
        },
        222   {
            profiles      var{bugAttributes}{111}{profiles},
            totalEffort   0
        },
        333   {
            profiles      var{bugAttributes}{111}{profiles},
            totalEffort   0
        }
    }
}

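Data::Printer is meant for humans only. You cannot run its output as code; it is designed to be easy to read. Again, it also shows that something inside the data structure is being reused.

The conclusion from all of this is that those serializers do this because it is not easy to show that something is being reused. Not even when you say it in Perl.

Why you cannot see the whole data structure

If Perl ignored the fact that parts of the data structure are reused, the serialization would not be reversible. Reading it back in would give you something different. That is certainly not what you would want.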
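The reversibility argument can be checked with a small round trip. This is only a sketch: it assumes you set $Data::Dumper::Purity, which the Data::Dumper documentation describes as emitting the extra statements needed to recreate nested references, and it evaluates the dump with a string eval, which is only safe for dumps you produced yourself.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Purity   = 1;  # emit extra statements that restore shared references
$Data::Dumper::Sortkeys = 1;  # stable key order, so the dump is reproducible

my $ref  = [ 42, 23, 1337 ];
my $hash = { foo => $ref, bar => $ref };

my $dumped = Dumper($hash);

# Dumper names the top-level value $VAR1, so declare it before evaluating the dump.
my $VAR1;
eval $dumped;
die "could not read the dump back: $@" if $@;

# Comparing two references numerically compares their addresses.
my $shared_after = $VAR1->{foo} == $VAR1->{bar} ? 1 : 0;

print $dumped;
print "shared after round trip: $shared_after\n";
```

After the round trip, foo and bar in the restored structure point at one array again, which is exactly the information the cross-reference notation carries.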

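Serialization without reuse

To show that your data is not actually lost, and that this is really just a way of displaying (and porting) reuse inside the data structure, I converted it to JSON using the JSON module. JSON is a portable format that can be used with Perl, but it is not Perl.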

use feature 'say';
use JSON;
say JSON->new->pretty->encode($phBugsRec);

Here is the result. It looks more like what you expected.

{
   "bugAttributes" : {
      "333" : {
         "profiles" : {
            "profile1" : "default1"
         },
         "totalEffort" : 0
      },
      "111" : {
         "totalEffort" : 0,
         "profiles" : {
            "profile1" : "default1"
         }
      },
      "222" : {
         "profiles" : {
            "profile1" : "default1"
         },
         "totalEffort" : 0
      }
   }
}

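That is because JSON is a portable format. It is meant for moving data around. There is an agreement on what it can contain, and reusing data is not part of it. Not every language that implements reading and writing JSON supports reusing parts of a data structure¹.

If we converted it to XML, the shared part would also be printed in full each time. YAML is a partial exception: the format has anchors and aliases for repeated nodes, so a YAML emitter may use those instead.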
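The flip side of a portable format is that the sharing does not survive a round trip. The following sketch uses the core JSON::PP module as a stand-in for the JSON module (an assumption made so the example needs nothing from CPAN) and a reduced version of the question's data.

```perl
use strict;
use warnings;
use JSON::PP;   # assumption: the core JSON::PP module stands in for JSON

my $shared = { profile1 => 'default1' };
my $data   = { 111 => { profiles => $shared }, 222 => { profiles => $shared } };

# Comparing two references numerically compares their addresses.
my $shared_before = $data->{111}{profiles} == $data->{222}{profiles} ? 1 : 0;

# Encoding writes the shared hash out once per occurrence...
my $json = JSON::PP->new->canonical->encode($data);

# ...so decoding builds two separate hashes with equal contents.
my $copy = JSON::PP->new->decode($json);
my $shared_after = $copy->{111}{profiles} == $copy->{222}{profiles} ? 1 : 0;

print "$json\n";
print "shared before: $shared_before, after: $shared_after\n";
```

The last line should read "shared before: 1, after: 0": the values are all there, but the decoded structure no longer records that the two profiles entries were one and the same hash.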

¹ I have no evidence for this, but it gets the point across.

Answer 2

Use

$Data::Dumper::Deepcopy = 1;
print Dumper($phBugsRec);

From the docs:

$Data::Dumper::Deepcopy or $OBJ->Deepcopy([NEWVAL])

Can be set to a boolean value to enable deep copies of structures. Cross-referencing will then only be done when absolutely essential (i.e., to break reference cycles). Default is 0.

The output is then:

$VAR1 = {
  'bugAttributes' => {
       '222' => {
          'totalEffort' => 0,
          'profiles' => {
                          'profile1' => 'default1'
                        }
                },
       '333' => {
          'profiles' => {
                          'profile1' => 'default1'
                        },
          'totalEffort' => 0
                },
       '111' => {
          'profiles' => {
                          'profile1' => 'default1'
                        },
          'totalEffort' => 0
                }
     }
};
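Setting $Data::Dumper::Deepcopy changes every later Dumper call in the program. The quoted docs also mention the $OBJ->Deepcopy form, so a scoped alternative might look like the sketch below. The variable names are illustrative, and Sortkeys and Indent are added only to make the output stable and compact.

```perl
use strict;
use warnings;
use Data::Dumper;

# Rebuild a structure like the one in the question: one shared profiles hash.
my $profiles = { profile1 => 'default1' };
my %attributes;
$attributes{$_} = { totalEffort => 0, profiles => $profiles } for 111, 222, 333;
my $bugs_rec = { bugAttributes => \%attributes };

# Configure a single dumper object instead of the package-wide variables.
my $dumper = Data::Dumper->new([$bugs_rec]);
$dumper->Deepcopy(1)->Sortkeys(1)->Indent(1);
my $out = $dumper->Dump;

# Global settings are untouched, so a plain Dumper call still cross-references.
my $default_out = Dumper($bugs_rec);

print $out;
```

With this approach the deep-copied form is printed for this one dump only, and the rest of the program keeps the default cross-referencing behavior.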
