Unable to get the highest version of a TRADE in Perl

I have just started learning Perl and I am stuck. The input source XML file is:

<STATEMENT>
     <TRADE origin = "BANK", ref="1",version="1">
      <EVENT type="PRO">
       <EVENTNAR key = "USE" val = "MY"/>
       <EVENTNAR key = "USEE" val = "MYY"/>
      </EVENT>
     </TRADE>
     <TRADE origin = "BANK", ref="1",version="2">
      <EVENT type="PRO">
       <EVENTNAR key = "USE" val = "MYY"/>
       <EVENTNAR key = "USEE" val = "MYY"/>
      </EVENT>
     </TRADE>
     <TRADE origin = "BANK", ref="2",version="1">
      <EVENT type="PRO">
       <EVENTNAR key = "USE" val = "MY"/>
       <EVENTNAR key = "USEE" val = "MYY"/>
      </EVENT>
         <TRADE origin = "BANK" ref="1",version="1">
           <EVENT type="PRO">
              <EVENTNAR key = "USE" val = "MY"/>
              <EVENTNAR key = "USEE" val = "MYY"/>
           </EVENT>
         </TRADE>
       </TRADE>
    <STATEMENT>

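Now I need to filter the trades using the following conditions, all of which must hold (AND):

  1. Only trades with origin = "BANK"

  2. TRADE must have an <EVENT> whose "type" attribute = 'PRO'

  3. TRADE must have an <EVENTNAR> whose "key" attribute = "USE"

  4. TRADE must have an <EVENTNAR> whose "value" attribute = "MY"

  5. There can be multiple <EVENTNAR> elements under <TRADE><EVENT>; at least one of them must be valid.

  6. All sub-trades, i.e. a TRADE inside a TRADE, should be removed.

  7. Most important: only the highest version number for a given ref should be kept (this is the part that does not work).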
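
The origin, EVENT type and EVENTNAR key/value conditions can be read as a single predicate over one trade. Below is a minimal sketch in plain Perl, not the asker's code: it assumes each TRADE has already been parsed into a nested hash (the `events` and `eventnars` layout and the name `trade_matches` are hypothetical), and it uses the attribute name `val` as it appears in the sample XML.

```perl
use strict;
use warnings;

# Hypothetical in-memory form of one TRADE: its attributes plus a list of
# events, each holding its type and a list of EVENTNAR key/val pairs.
sub trade_matches {
    my ($trade) = @_;
    return 0 unless ( $trade->{origin} // '' ) eq 'BANK';
    for my $event ( @{ $trade->{events} || [] } ) {
        next unless ( $event->{type} // '' ) eq 'PRO';
        # One qualifying EVENTNAR under a PRO event is enough.
        for my $nar ( @{ $event->{eventnars} || [] } ) {
            return 1
                if ( $nar->{key} // '' ) eq 'USE'
                && ( $nar->{val} // '' ) eq 'MY';
        }
    }
    return 0;
}

# The first two trades of the sample input, nested trades already dropped.
my @trades = (
    { origin => 'BANK', ref => 1, version => 1,
      events => [ { type => 'PRO', eventnars => [
          { key => 'USE',  val => 'MY'  },
          { key => 'USEE', val => 'MYY' } ] } ] },
    { origin => 'BANK', ref => 1, version => 2,
      events => [ { type => 'PRO', eventnars => [
          { key => 'USE',  val => 'MYY' },
          { key => 'USEE', val => 'MYY' } ] } ] },
);

my @kept = grep { trade_matches($_) } @trades;
print scalar(@kept), " trade(s) pass the filter\n";
```

With the sample data as posted, the version 2 trade carries val="MYY" for key "USE", so under this reading only the version 1 trade passes the predicate; whether that matches the intended data is worth checking.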

Expected output:

 <STATEMENT>
      <TRADE origin = "BANK", ref="1",version="2">(higher version)
        <EVENT type="PRO">
           <EVENTNAR key = "USE" val = "MYY"/>
           <EVENTNAR key = "USEE" val = "MYY"/>
        </EVENT>
      </TRADE>
      <TRADE origin = "BANK", ref="2",version="1">
        <EVENT type="PRO">
          <EVENTNAR key = "USE" val = "MY"/>
          <EVENTNAR key = "USEE" val = "MYY"/>
        </EVENT>
      </TRADE>
    <STATEMENT>

Here is my code:

use strict;
use warnings;
use XML::Twig;
use Tie::File;

my $SOURCEFILE     = $ARGV[0];
my $FILELOCATIONIN = $ARGV[1];

open( my $out, '>:utf8', 'out.xml' )
    or die "cannot create output file out.xml: $!";

my $twig = XML::Twig->new(
    pretty_print  => 'indented',
    twig_handlers => {
        'TRADE'       => \&TRADE_HANDLER,
        'TRADE/TRADE' => \&DEL_TRADE,
    },
    att_accessors => [qw/ ref version /],
);

my %max_version;

$twig->parsefile( $FILELOCATIONIN . '/' . $SOURCEFILE . '.xml' );

for my $trade ( $twig->root->children('TRADE') ) {
    my ( $ref, $version ) = ( $trade->ref, $trade->version );
    if ( $version eq $max_version{$ref} ) {
        $trade->flush($out);
    }
}

sub DEL_TRADE {
    my ( $twig, $TRADE ) = @_;
    $TRADE->delete($TRADE);
    #$twig->purge();
}

sub TRADE_HANDLER {
    my ( $twig, $trade ) = @_;

    my $org = $trade->att('origin');

    if (
        $org eq "BANK"
        && grep {
            grep { $_->att('key') eq 'USE' and $_->att('value') eq 'MY' }
                $_->children('EVENTNAR')
        } $trade->children('EVENT[@type="PRO"]')
      )
    {
        my ( $ref, $version ) = ( $trade->ref, $trade->version );

        unless ( exists $max_version{$ref} and $max_version{$ref} >= $version ) {
            $max_version{$ref} = $version;
        }
    }
    else {
        $twig->purge();
    }

    return;
}

My output is:

<STATEMENT>
      <TRADE origin = "BANK", ref="1",version="1">(this shouldn't come )
         <EVENT type="PRO">
          <EVENTNAR key = "USE" val = "MY"/>
          <EVENTNAR key = "USEE" val = "MYY"/>
         </EVENT>
       </TRADE>
       <TRADE origin = "BANK", ref="1",version="2">
        <EVENT type="PRO">
          <EVENTNAR key = "USE" val = "MYY"/>
          <EVENTNAR key = "USEE" val = "MYY"/>
         </EVENT>
       </TRADE>
    <TRADE origin = "BANK", ref="2",version="1">
        <EVENT type="PRO">
          <EVENTNAR key = "USE" val = "MY"/>
          <EVENTNAR key = "USEE" val = "MYY"/>
        </EVENT>
      </TRADE>
     </STATEMENT>

As you can see, the logic for keeping only the highest version of a given ref is not working.

Any suggestions would be appreciated.

Using XML::XSH2, after fixing the input so that it is well-formed XML:

open file.xml ;
rm //TRADE/TRADE ;
$l = //TRADE[@origin='BANK'][EVENT[@type='PRO'][EVENTNAR[@key='USE'][@val='MY']]] ;
$h := hash @ref $l ;
for my $ref in { keys %$h } {
    $trades = xsh:lookup('h', $ref);
    ls $trades[@version=xsh:max($trades/@version)] ;
} | cat > output1.xml ;

For very large files, you can try the streaming interface:

$h = { {} } ;
stream :f file.xml :F /dev/null select TRADE {
    rm TRADE ;
    if (@origin='BANK'
        and EVENT[@type='PRO'][EVENTNAR[@key='USE'][@val='MY']]
       ) {
        $ref = @ref ;
        $record = xsh:lookup('h', $ref)/@version ;
        perl { $record ||= -1 } ;
        if (@version > $record) {
            $here = . ;
            perl { $h->{$ref} = $here } ;
        }
    }
} ;

create STATEMENT ;
for my $trade in { values %$h } mv $trade into STATEMENT ;
save :f output2.xml ;

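On MSWin, you have to use NUL instead of /dev/null. The program can still be memory hungry, because it needs to remember the whole output. If that is too much, you have to change it to process the file twice: in the first run, it remembers the maximal version for each ref; in the second run, it produces the output.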

$h = { {} } ;
stream :f file.xml :F /dev/null select TRADE {
    rm TRADE ;
    if (@origin='BANK' 
        and EVENT[@type='PRO'][EVENTNAR[@key='USE'][@val='MY']]
    ) {
        $ref = @ref ;
        $record = xsh:lookup('h', $ref) ;
        perl { $record ||= -1 } ;
        if (@version > $record) {
            $record = @version ;
            perl { $h->{$ref} = $record } ;
        }
    }
} ;

stream :f file.xml :F output3.xml select TRADE {
    rm TRADE ;
    if not(@origin = 'BANK'
           and EVENT[@type='PRO'][EVENTNAR[@key='USE'][@val='MY']]
           and xsh:lookup('h', @ref) = @version
    ) rm . ;
} ;

If the combination of version + ref is unique, the final "if not" condition can be simplified.
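
The two-pass idea is independent of XSH2. Below is a minimal sketch of it in plain Perl, assuming the trades that pass the filter are available as simple (ref, version) records; `read_trades` is a hypothetical stand-in for re-reading the file, and versions are assumed to be integers so that numeric comparison applies.

```perl
use strict;
use warnings;

# Hypothetical stand-in for one read of the file: yields the (ref, version)
# records of the trades that already passed the filter, in file order.
sub read_trades {
    return (
        { ref => '1', version => 1 },
        { ref => '1', version => 2 },
        { ref => '2', version => 1 },
    );
}

# Pass 1: remember only the highest version per ref. Memory grows with the
# number of distinct refs, not with the size of the file.
my %max_version;
for my $trade ( read_trades() ) {
    my ( $ref, $version ) = @{$trade}{qw(ref version)};
    $max_version{$ref} = $version
        if !exists $max_version{$ref} || $version > $max_version{$ref};
}

# Pass 2: read again and emit a trade only if it carries the winning version.
my @output;
for my $trade ( read_trades() ) {
    push @output, "ref=$trade->{ref} version=$trade->{version}"
        if $trade->{version} == $max_version{ $trade->{ref} };
}

print "$_\n" for @output;
```

If two trades could share the same ref and version, the second pass would emit both, which is why uniqueness of that combination matters.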