"fuse" 不同指针指向的位置有什么好方法?

What's a good way to "fuse" the locations pointed to by different pointers?

我有很多指向内存中不同(或相同)位置的指针。我想实现一种机制,允许我们 "fuse" 给定指针子集指向的位置。

我现在正在使用 perl 5.6.1,但我愿意接受其他语言的实现。我在 perl 中想出了以下愚蠢的实现:

my $ref1 = ;
my $ref2 = ;
print "${$ref1} : ${$ref2}\n"; # <-- prints 1 : 2

fuse($ref1, $ref2);          # <-- Make $ref2 point to same location as $ref1
print "${$ref1} : ${$ref2}\n"; # <-- prints 1 : 1 (which is correct)

sub fuse
{
    ${$_[1]} = ${$_[0]}; 
}

但是当我们不得不多次融合时,这并没有像预期的那样工作:

my $ref1 = ;
my $ref2 = ;
my $ref3 = ;
print "${$ref1} : ${$ref2} : ${$ref3}\n"; # <-- prints 1 : 2 : 3

fuse($ref1, $ref2);                     # <-- Make $ref2 point to same location as $ref1
print "${$ref1} : ${$ref2} : ${$ref3}\n"; # <-- prints 1 : 1 : 3 (which is correct)

fuse($ref3, $ref1);                     # <-- Make $ref1 point to same location as $ref3
print "${$ref1} : ${$ref2} : ${$ref3}\n"; # <-- prints 3 : 1 : 3 ($ref2 is useless now)

sub fuse
{
    ${$_[1]} = ${$_[0]}; 
}

在上面的示例中,我希望所有三个变量 $ref1$ref2$ref3 最终指向包含 3 的位置。

有没有一种好方法可以完成这个 "fusing" 而无需手动重新分配我们想要更改其所指对象的每个指针?

上下文:
我正在尝试模拟一个电路(有电线)。当两个节点通过导线连接时,两个节点的属性之一(比方说电压)变得相同。当这些节点之一连接到第三个节点(用电线)时,所有三个节点的电压变得相同,无论它们之前的值是什么,并且只要连接存在就继续保持相同。

我尝试用谷歌搜索 HDL 如何实现连线失败了(我可能不知道该 google 是什么)。

在偶然发现这个名为 disjoint-set data structure 的奇妙事物之前,我几乎放弃了,它似乎是为解决这个确切问题而发明的。以下是我使用的代码:

use Scalar::Util qw( weaken );

my $ref1 = {}; $ref1->{voltage} = 1; weaken( $ref1->{parent} = $ref1 );
my $ref2 = {}; $ref2->{voltage} = 2; weaken( $ref2->{parent} = $ref2 );
my $ref3 = {}; $ref3->{voltage} = 3; weaken( $ref3->{parent} = $ref3 );
my $ref4 = {}; $ref4->{voltage} = 4; weaken( $ref4->{parent} = $ref4 );

print "@{[map(get_vol($_), ($ref1, $ref2, $ref3, $ref4))]}\n";
# Above line print 1 2 3 4

fuse($ref1, $ref2); # <-- Second argument gets set to first
print "@{[map(get_vol($_), ($ref1, $ref2, $ref3, $ref4))]}\n";
# Above line print 1 1 3 4

fuse($ref4, $ref3);
set_vol($ref3, 5);
print "@{[map(get_vol($_), ($ref1, $ref2, $ref3, $ref4))]}\n";
# Above line print 1 1 5 5

fuse($ref2, $ref3);
set_vol($ref3, 7);
print "@{[map(get_vol($_), ($ref1, $ref2, $ref3, $ref4))]}\n";
# Above line print 7 7 7 7


sub fuse
{
    my ($node1, $node2) = ($_[0], $_[1]);
    $node2 = $node2->{parent} while ($node2->{parent} != $node2);
    $node2->{parent} = $node1;
}

sub get_vol
{
    my $node = shift;
    $node = $node->{parent} while ($node != $node->{parent});
    return $node->{voltage};
}

sub set_vol
{
    my $node = shift;
    $node = $node->{parent} while ($node != $node->{parent});
    $node->{voltage} = shift;
}

在此之后,使用 set_vol 设置任何 $ref 将反映在所有其他 $refget_vol 输出中。

显然,我们可以在读取和设置电压方面添加其他优化,以便我们在读取或写入某些节点时不必遍历整个树。


Update:下面使用了上面的简单原理,但是没有使用weaken就避免了内存泄漏,并且它优化了电压查找(所以只有第一个保险丝后查找 "slow").

package Wire;

use strict;
use warnings qw( all );

sub new {
   my ($class, %args) = @_;
   my $voltage = $args{voltage} // 0;
   my $self = bless({}, $class);
   $self->{voltage_indirect_chain} = { next => undef, value => $voltage };
   return $self;
}

sub _tail {
   my ($self) = @_;
   $self->{voltage_indirect_chain} = $self->{voltage_indirect_chain}{next}
      while $self->{voltage_indirect_chain}{next};

   return $self->{voltage_indirect_chain};
}

sub get_voltage { $_[0]->_tail()->{value} }
sub set_voltage { $_[0]->_tail()->{value} = $_[1]; }

sub fuse {
   my ($self, $new) = @_;
   my $tail = $self->_tail();
   delete $tail->{value};
   $tail->{next} = $new->_tail();
}

1;

这个基本实现依赖于一个 class 属性,所有不相交的 "fused" 节点组,由它们的值作为键。每次融合都会根据需要更新和合并它们。

use warnings;
use strict;
use feature 'say';
use FindBin qw($RealBin);
use lib $RealBin;         # to load from ./
#use Data::Dump qw(dd);

use Nodes;

my $n1 = Nodes->new(volt => 10);
my $n2 = Nodes->new(volt => 20);
my $n3 = Nodes->new(volt => 30);
my $n4 = Nodes->new(volt => 40);

say "\nFuse n1 with (set to) n3:";
$n1->fuse_with($n3);  # n1 is now at same voltage as n3
say "\tvoltage for node ", $_->label, " is: ", $_->volt
    for ($n1, $n2, $n3, $n4);

say "\nFuse n4 with (set to) n2:";
$n4->fuse_with($n2);  # n4 is now same as n2
say "\tvoltage for node ", $_->label, " is: ", $_->volt
    for ($n1, $n2, $n3, $n4);

say "\nFuse n1 with (set to) n4:";
$n1->fuse_with($n4);  # n1 is now same as n4, and so are n2 and n3
say "\tvoltage for node ", $_->label, " is: ", $_->volt
    for ($n1, $n2, $n3, $n4);

# dd \%Nodes::Fused;

Nodes.pm

package Nodes;

use warnings;
use strict;
use feature 'say';    
#use Data::Dump qw(dd);

our $Label = 0;
our %Fused;   # disjoint groups ( value => { label => node, ... }, ... )

sub new {
    my ($class, %args) = @_;
    my $self = { _volt => $args{volt}, _label => ++$Label };  
    say "New node: volt = ", $self->{_volt}, ", label = ", $self->{_label};
    $Fused{$self->{_volt}} = { $self->{_label} => $self };
    return bless $self, $class;
}

sub volt {
    my ($self, $val) = @_; 
    $self->{_volt} = $val if $val;
    return $self->{_volt};
}

sub label { return $_[0]->{_label} }

sub fuse_with {
    my ($self, $node) = @_; 
    # Retrieve groups that have $self or $node
    my %groups = map { 
        ( exists $Fused{$_}->{$self->{_label}} or
          exists $Fused{$_}->{$node->label} )
            ? ($_ => $Fused{$_}) : ()  
    } keys %Fused;
    # Add these nodes if they are in no groups, or
    # Remove %groups from %Fused, fuse them into new one, update voltage
    if (not keys %groups) {
        $Fused{$node->volt}->{$_->label} = $_  for ($self, $node);
        $self->{_volt} = $node->volt;
    }   
    else {
        delete $Fused{$_} for keys %groups;
        $Fused{$node->volt} = { map { %{$groups{$_}} } keys %groups };
        $Fused{$node->volt}->{$node->label}    //= $node;  #/
        $Fused{$node->volt}->{$self->{_label}} //= $self;  #/
        $Fused{$node->volt}->{$_}->{_volt} = $node->volt  
            for keys %{$Fused{$node->volt}};
    }
    # dd \%Fused;
}   

sub cleanup {
    my ($self, $voltage) = @_;
    if ($voltage) {  # new voltage (and label) for the fused group
        $Fused{$voltage} = $Fused{$self->{_volt}};
        delete $Fused{$self->{_volt}};
        $Fused{$voltage}->{$_}->{_volt} = $voltage
            for keys %{$Fused{$voltage}};
    }
    $self->DESTROY;
}

# Must be called manually, via cleanup(), when object leaves scope
sub DESTROY {
    my ($self) = @_;
    return if ${^GLOBAL_PHASE} eq 'DESTRUCT';
    delete $Fused{$_}->{$self->{_label}}  for keys %Fused;
}       

return 1;

这会打印

New node: volt = 10, label = 1
New node: volt = 20, label = 2
New node: volt = 30, label = 3
New node: volt = 40, label = 4

Fuse n1 with (set to) n3:
        voltage for node 1 is: 30
        voltage for node 2 is: 20
        voltage for node 3 is: 30
        voltage for node 4 is: 40

Fuse n4 with (set to) n2:
        voltage for node 1 is: 30
        voltage for node 2 is: 20
        voltage for node 3 is: 30
        voltage for node 4 is: 20

Fuse n1 with (set to) n4:
        voltage for node 1 is: 20
        voltage for node 2 is: 20
        voltage for node 3 is: 20
        voltage for node 4 is: 20

取消注释(并添加)%Nodes::Fused 的印刷品以了解如何跟踪 "fused" 个组。

这种方法带有以下需求:如果要销毁对象(超出范围),则需要显式调用析构函数。为此提供了cleanup()方法

{ # lexical will go out of scope while the object is in fused groups
    my $n5 = Node->new(volt => 500);
    $n2->fuse_with($n5);
    $n5->cleanup(25);    # with new voltage for the group (optional)
}

原因正是方便的 class 属性,它保留对对象的引用,因此不会自动调用析构函数。

另一种方法是在每个对象中包含 "fused" 列表。如果有很多节点并且融合通常是昂贵的,因为每个对象必须重新处理整个列表,O(N2)。这是对电路建模时可能出现的情况,因此我保留了 class 属性。

更多评论

  • 这完成了它所需要的,但它缺少点点滴滴

  • 它依赖于一个class属性,如果涉及到这不是最干净的设计。它纠缠对象,创建一个全局实体,原则上反对对象的独立性

  • 缺少一些基本方法,特别是 "unfuse" 节点和独立设置新值(并在需要时更新所有融合节点)

  • 需要检查。需要一些低级优化

我相信

  • 您希望能够将一个融合集的任何部分与另一个融合集的任何部分融合。
  • 您希望能够设置值,以便更新融合集的每个部分。

这意味着以下程序定义了预期的行为:

use strict;
use warnings qw( all );
use feature qw( say );
use FindBin qw( $RealBin );
use lib $RealBin;

use Wire qw( );

my $o1 = Wire->new( voltage => 1 );
my $o2 = Wire->new( voltage => 2 );
my $o3 = Wire->new( voltage => 3 );
my $o4 = Wire->new( voltage => 4 );
say join " ", map $_->get_voltage(), $o1, $o2, $o3, $o4;  # 1 2 3 4

$o2->fuse($o1);
$o3->fuse($o4);
$o1->fuse($o3);
say join " ", map $_->get_voltage(), $o1, $o2, $o3, $o4;  # 4 4 4 4

$o1->set_voltage(5);
say join " ", map $_->get_voltage(), $o1, $o2, $o3, $o4;  # 5 5 5 5

$o3->set_voltage(6);
say join " ", map $_->get_voltage(), $o1, $o2, $o3, $o4;  # 6 6 6 6

这个 class 实现了:

package Wire;

use strict;
use warnings qw( all );

sub new {
   my ($class, %args) = @_;
   my $voltage = $args{voltage} // 0;
   my $self = bless({}, $class);
   $self->{shared_voltage} = { value => $voltage, backrefs => [] };
   push @{ $self->{shared_voltage}{backrefs} }, \( $self->{shared_voltage} );
   return $self;
}

sub get_voltage { $_[0]{shared_voltage}{value} }
sub set_voltage { $_[0]{shared_voltage}{value} = $_[1]; }

sub fuse {
   my ($self, $new) = @_;
   my $old_sv = $self->{shared_voltage};  my $old_sv_br = $old_sv->{backrefs};
   my $new_sv = $new->{shared_voltage};   my $new_sv_br = $new_sv->{backrefs};
   for my $backref (@$old_sv_br) {
      $$backref = $new_sv;
      push @$new_sv_br, $backref;
   }
}

sub DESTROY {
   my ($self) = @_;
   @{ $self->{shared_voltage}{backrefs} } =
      grep { $_ != \( $self->{shared_voltage} ) }
         @{ $self->{shared_voltage}{backrefs} };
}

1;

结果是通过将对融合节点的引用列表与共享值一起存储来实现的。这与 Perl 中的 Copy-on-Write 字符串使用的方法相同。融合结构如下所示:

+-$o1--+             +-Wire----------------+
| Ref -------------->| +-shared_voltage--+ |               +-anon hash------+
+------+   +---------->| Reference      ------------------>| +-value------+ |
           |         | +-----------------+ |   / / /       | | 4          | |
           |         +---------------------+   | | |       | +-backrefs---+ |
           |                                   | | |       | | Reference -------+
           |                                   | | |       | +------------+ |   |
+-$o2--+   |         +-Wire----------------+   | | |       +----------------+   |
| Ref -----(-------->| +-shared_voltage--+ |   | | |                            |
+------+   | +-------->| Reference      -------+ | |   +------------------------+
           | |       | +-----------------+ |     | |   |
           | |       +---------------------+     | |   |   +-anon array-----+
           | |                                   | |   +-->| +-0----------+ |
           | |                                   | |       | | Reference -------------+
+-$o3--+   | |       +-Wire----------------+     | |       | +-1----------+ |         |
| Ref -----(-(------>| +-shared_voltage--+ |     | |       | | Reference -----------+ |
+------+   | | +------>| Reference      ---------+ |       | +-2----------+ |       | |
           | | |     | +-----------------+ |       |       | | Reference ---------+ | |
           | | |     +---------------------+       |       | +-3----------+ |     | | |
           | | |                                   |       | | Reference -------+ | | |
           | | |                                   |       | +------------+ |   | | | |
+-$o4--+   | | |     +-Wire----------------+       |       +----------------+   | | | |
| Ref -----(-(-(---->| +-shared_voltage--+ |       |                            | | | |
+------+   | | | +---->| Reference      -----------+                            | | | |
           | | | |   | +-----------------+ |                                    | | | |
           | | | |   +---------------------+                                    | | | |
           | | | |                                                              | | | |
           | | | |                                                              | | | |
           | | | +--------------------------------------------------------------+ | | |
           | | +------------------------------------------------------------------+ | |
           | +----------------------------------------------------------------------+ |
           +--------------------------------------------------------------------------+

(backrefs 的顺序未准确表示。)

我想你会发现这在实践中比 快得多。就像你的一样,融合是 O(N)。但是,获取和设置电压的时间复杂度为 O(1) 而不是 O(N)。虽然在我的对象销毁是 O(N) 而不是 O(1),但可以通过使用散列而不是 backrefs 的数组来使其成为 O(1)。就是说。作为数组,它实际上可能更快。这就是 Perl 对 CoW 字符串所做的。 N 是熔断器的大小(在我们的测试用例中为 4)。