How do I break down common elements in a hash of arrays in Perl?

I am trying to find any intersections between the elements of a hash of arrays in Perl.

For example:

my %test = (
                  Lot1 => [ "A","B","C"],
                  Lot2 => [ "A","B","C"],
                  Lot3 => ["C"],
                  Lot4 => ["E","F"],
            );

The result I would like is

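I thought this could be done with a recursive function that moves through the arrays: if it finds an intersection between two arrays, it calls itself recursively with that intersection and the next array. The stopping condition would be running out of arrays.

Once the function exits, I would have to iterate through the hash to find the arrays that contain those values.

Does this sound like a good approach? I have been struggling with the code, but I was planning to use List::Compare to determine the intersection.

Thank you.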

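Array::Utils has an intersect operation that lets you test the intersection of two arrays. But that is only a starting point for what you are trying to do.

So I think you need to invert the lookup first: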
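For reference, the pairwise step can also be sketched in core Perl without any module. The helper name intersect_pair below is hypothetical (it is not part of Array::Utils or List::Compare), and it assumes the elements are plain strings.

```perl
use strict;
use warnings;

# Hypothetical helper: return the elements of @$right that also occur in @$left.
# Assumes elements are plain strings; duplicates in @$right are reported once.
sub intersect_pair {
    my ( $left, $right ) = @_;
    my %in_left = map { $_ => 1 } @$left;    # lookup table for the left array
    my %seen;
    return grep { $in_left{$_} && !$seen{$_}++ } @$right;
}

my @common = intersect_pair( [ "A", "B", "C" ], ["C"] );
print "@common\n";    # prints "C"
```

Comparing every pair of arrays this way still leaves the work of combining the pairwise results, which is why inverting the lookup is the simpler route.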

my %member_of;

foreach my $key ( sort keys %test ) { 
    foreach my $element  ( @{$test{$key}} ) { 
         push ( @{$member_of{$element}}, $key ); 
    }
}
print Dumper \%member_of;

This gives:

$VAR1 = {
          'A' => [
                   'Lot1',
                   'Lot2'
                 ],
          'F' => [
                   'Lot4'
                 ],
          'B' => [
                   'Lot1',
                   'Lot2'
                 ],
          'E' => [
                   'Lot4'
                 ],
          'C' => [
                   'Lot1',
                   'Lot2',
                   'Lot3'
                 ]
        };

Then collapse it into a set of keys:

my %new_set;
foreach my $element ( keys %member_of ) {
    my $set = join( ",", @{ $member_of{$element} } );
    push( @{ $new_set{$set} }, $element );
}
print Dumper \%new_set;

This gives:

$VAR1 = {
          'Lot1,Lot2,Lot3' => [
                                'C'
                              ],
          'Lot1,Lot2' => [
                           'A',
                           'B'
                         ],
          'Lot4' => [
                      'E',
                      'F'
                    ]
        };

Putting it all together:

#!/usr/bin/env perl

use strict;
use warnings;
use Data::Dumper;

my %test = (
    Lot1 => [ "A", "B", "C" ],
    Lot2 => [ "A", "B", "C" ],
    Lot3 => ["C"],
    Lot4 => [ "E", "F" ],
);

my %member_of;
foreach my $key ( sort keys %test ) {
    foreach my $element ( @{ $test{$key} } ) {
        push( @{ $member_of{$element} }, $key );
    }
}

my %new_set;
foreach my $element ( sort keys %member_of ) {
    my $set = join( ",", @{ $member_of{$element} } );
    push( @{ $new_set{$set} }, $element );
}

foreach my $set ( sort keys %new_set ) {
    print "$set contains: ", join( ",", @{ $new_set{$set} } ), "\n";
}

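I don't think there is a more efficient way to solve it, because you are comparing each array against every other array and forming a new compound key from the result.

This gives you: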

Lot1,Lot2 contains: A,B
Lot1,Lot2,Lot3 contains: C
Lot4 contains: E,F

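This can be done with two simple hash transformations:

  • Build a hash that lists all the lots each item is in

  • Convert that into a hash that lists all the items for each combination of lots

Then dump the final hash in a convenient form.

Here is the code.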

use strict;
use warnings 'all';
use feature 'say';

my %test = (
    Lot1 => [ "A", "B", "C" ],
    Lot2 => [ "A", "B", "C" ],
    Lot3 => ["C"],
    Lot4 => [ "E", "F" ],
);

my %items;

for my $lot ( keys %test ) {
    for my $item ( @{ $test{$lot} } ) {
        push @{ $items{$item} }, $lot;
    }
}

my %lots;

for my $item ( sort keys %items ) {
    my $lots = join '!', sort @{ $items{$item} };
    push @{ $lots{$lots} }, $item;
}

for my $lots ( sort keys %lots ) {

    my @lots = split /!/, $lots;
    my $items = join '', @{ $lots{$lots} };

    $lots = join ', ', @lots;
    $lots =~ s/.*\K,/ and/;

    printf "%s %s %s\n", $lots, @lots > 1 ? 'have' : 'has', $items;
}

Output

Lot1 and Lot2 have AB
Lot1, Lot2 and Lot3 have C
Lot4 has EF

It builds an %items hash that looks like this

{
  A => ["Lot2", "Lot1"],
  B => ["Lot2", "Lot1"],
  C => ["Lot2", "Lot3", "Lot1"],
  E => ["Lot4"],
  F => ["Lot4"],
}

and then a %lots hash that looks like this

{
  "Lot1!Lot2" => ["A", "B"],
  "Lot1!Lot2!Lot3" => ["C"],
  "Lot4" => ["E", "F"],
}
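
If splitting the joined key apart again is inconvenient, the same two transformations can return structured records instead. This is only a sketch: the sub name group_items_by_lots and the record layout are assumptions, not part of the code above, and it assumes lot names never contain a NUL character, which is used internally as the key separator.

```perl
use strict;
use warnings;

# Hypothetical helper: group items by the exact set of lots that contain them.
# Returns a list of records { lots => [...], items => [...] }.
# Assumes lot names never contain a NUL character (used as the key separator).
sub group_items_by_lots {
    my ($lots_to_items) = @_;

    # Step 1: invert the lookup, item => { lot => 1, ... }
    my %lots_of;
    for my $lot ( keys %$lots_to_items ) {
        $lots_of{$_}{$lot} = 1 for @{ $lots_to_items->{$lot} };
    }

    # Step 2: group items under a canonical (sorted) lot-set key
    my %group;
    for my $item ( sort keys %lots_of ) {
        my @lots = sort keys %{ $lots_of{$item} };
        my $key  = join "\0", @lots;
        $group{$key} ||= { lots => \@lots, items => [] };
        push @{ $group{$key}{items} }, $item;
    }

    return map { $group{$_} } sort keys %group;
}

my %test = (
    Lot1 => [ "A", "B", "C" ],
    Lot2 => [ "A", "B", "C" ],
    Lot3 => ["C"],
    Lot4 => [ "E", "F" ],
);

for my $g ( group_items_by_lots( \%test ) ) {
    printf "%s: %s\n", join( ",", @{ $g->{lots} } ), join( ",", @{ $g->{items} } );
}
```

Each record keeps the lots as a real array, so the caller can format them however it likes without parsing a string key.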