Unable to get highest version of the TRADE in Perl
I have just started learning Perl and I am stuck. The input source XML file is:
<STATEMENT>
  <TRADE origin = "BANK", ref="1",version="1">
    <EVENT type="PRO">
      <EVENTNAR key = "USE" val = "MY"/>
      <EVENTNAR key = "USEE" val = "MYY"/>
    </EVENT>
  </TRADE>
  <TRADE origin = "BANK", ref="1",version="2">
    <EVENT type="PRO">
      <EVENTNAR key = "USE" val = "MYY"/>
      <EVENTNAR key = "USEE" val = "MYY"/>
    </EVENT>
  </TRADE>
  <TRADE origin = "BANK", ref="2",version="1">
    <EVENT type="PRO">
      <EVENTNAR key = "USE" val = "MY"/>
      <EVENTNAR key = "USEE" val = "MYY"/>
    </EVENT>
    <TRADE origin = "BANK" ref="1",version="1">
      <EVENT type="PRO">
        <EVENTNAR key = "USE" val = "MY"/>
        <EVENTNAR key = "USEE" val = "MYY"/>
      </EVENT>
    </TRADE>
  </TRADE>
<STATEMENT>
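Now I need to filter the trades using the following 'AND' conditions:
Only trades with origin = "BANK".
The TRADE should have an <EVENT> whose "type" attribute = 'PRO'.
The TRADE should have an <EVENTNAR> whose "key" attribute = "USE".
The TRADE should have an <EVENTNAR> whose "value" attribute = "MY".
There can be multiple <EVENTNAR> elements under the <EVENT> of a <TRADE>. At least one <EVENTNAR> should be valid.
All sub-trades, i.e. a TRADE nested inside a TRADE, should be deleted.
Most importantly: only the highest version number for a given ref should be taken (this is the part that is not working).
Expected output: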
<STATEMENT>
  <TRADE origin = "BANK", ref="1",version="2">(higher version)
    <EVENT type="PRO">
      <EVENTNAR key = "USE" val = "MYY"/>
      <EVENTNAR key = "USEE" val = "MYY"/>
    </EVENT>
  </TRADE>
  <TRADE origin = "BANK", ref="2",version="1">
    <EVENT type="PRO">
      <EVENTNAR key = "USE" val = "MY"/>
      <EVENTNAR key = "USEE" val = "MYY"/>
    </EVENT>
  </TRADE>
<STATEMENT>
Below is my code:
use strict;
use warnings;
use XML::Twig;
use Tie::File;

my $SOURCEFILE=$ARGV[0];
my $FILELOCATIONIN=$ARGV[1];
open( my $out, '>:utf8', 'out.xml') or die "cannot create output file out.xml: $!";
my $twig = XML::Twig->new( pretty_print => 'indented',
                           twig_handlers => { 'TRADE'=>\&TRADE_HANDLER,
                                              'TRADE/TRADE' => \&DEL_TRADE},
                           att_accessors => [ qw/ ref version / ],
                         );
my %max_version;
$twig->parsefile($FILELOCATIONIN.'/'.$SOURCEFILE.'.xml');
for my $trade ($twig->root->children('TRADE')) {
    my ($ref, $version) = ($trade->ref, $trade->version);
    if ($version eq $max_version{$ref})
    {
        $trade->flush($out);
    }
}

sub DEL_TRADE{
    my ( $twig, $TRADE ) = @_;
    $TRADE->delete($TRADE);
    #$twig->purge();
}

sub TRADE_HANDLER {
    my ( $twig, $trade ) = @_;
    my $org = $trade->att('origin');
    if ($org eq "BANK" && grep {grep {$_->att('key') eq 'USE' and $_->att('value') eq 'MY'}
                                $_->children('EVENTNAR')} $trade->children('EVENT[@type="PRO"]') )
    {
        my ($ref, $version) = ($trade->ref, $trade->version);
        unless (exists $max_version{$ref} and $max_version{$ref} >= $version) {
            $max_version{$ref} = $version;}
    }
    else
    {
        $twig->purge();
    }
    return ;
}
My output is:
<STATEMENT>
  <TRADE origin = "BANK", ref="1",version="1">(this shouldn't come )
    <EVENT type="PRO">
      <EVENTNAR key = "USE" val = "MY"/>
      <EVENTNAR key = "USEE" val = "MYY"/>
    </EVENT>
  </TRADE>
  <TRADE origin = "BANK", ref="1",version="2">
    <EVENT type="PRO">
      <EVENTNAR key = "USE" val = "MYY"/>
      <EVENTNAR key = "USEE" val = "MYY"/>
    </EVENT>
  </TRADE>
  <TRADE origin = "BANK", ref="2",version="1">
    <EVENT type="PRO">
      <EVENTNAR key = "USE" val = "MY"/>
      <EVENTNAR key = "USEE" val = "MYY"/>
    </EVENT>
  </TRADE>
</STATEMENT>
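As can be seen, the logic for taking the highest version of a given ref is not working.
Any suggestions would be appreciated.
Using XML::XSH2, after fixing the input: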
open file.xml ;
rm //TRADE/TRADE ;
$l = //TRADE[@origin='BANK'][EVENT[@type='PRO'][EVENTNAR[@key='USE'][@val='MY']]] ;
$h := hash @ref $l ;
for my $ref in { keys %$h } {
    $trades = xsh:lookup('h', $ref);
    ls $trades[@version=xsh:max($trades/@version)] ;
} | cat > output1.xml ;
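For comparison, the same in-memory idea can be sketched in Perl with XML::Twig, the module used in the question. This is only a rough illustration, not a drop-in fix: it assumes XML::Twig is installed and that the input has been corrected to well-formed XML, the sample data is invented for the example, and %best is a hypothetical name. It reads the val attribute, which is what the sample XML uses, and compares versions numerically.

```perl
use strict;
use warnings;
use XML::Twig;

# Invented, well-formed sample data (assumption: the real input was fixed first).
my $xml = <<'XML';
<STATEMENT>
  <TRADE origin="BANK" ref="1" version="1">
    <EVENT type="PRO"><EVENTNAR key="USE" val="MY"/></EVENT>
  </TRADE>
  <TRADE origin="BANK" ref="1" version="2">
    <EVENT type="PRO"><EVENTNAR key="USE" val="MY"/></EVENT>
  </TRADE>
  <TRADE origin="BANK" ref="2" version="1">
    <EVENT type="PRO"><EVENTNAR key="USE" val="MY"/></EVENT>
    <TRADE origin="BANK" ref="1" version="3">
      <EVENT type="PRO"><EVENTNAR key="USE" val="MY"/></EVENT>
    </TRADE>
  </TRADE>
</STATEMENT>
XML

my $twig = XML::Twig->new;
$twig->parse($xml);    # whole document is held in memory

my %best;              # hypothetical name: ref => TRADE element with the highest version
for my $trade ( $twig->root->children('TRADE') ) {
    # Drop sub-trades (TRADE inside TRADE) before anything else.
    $_->delete for $trade->children('TRADE');

    next unless ( $trade->att('origin') // '' ) eq 'BANK';

    # At least one EVENTNAR with key="USE" and val="MY" under an EVENT of type PRO.
    my $matches = grep {
        ( $_->att('key') // '' ) eq 'USE' && ( $_->att('val') // '' ) eq 'MY'
    } map { $_->children('EVENTNAR') }
      grep { ( $_->att('type') // '' ) eq 'PRO' } $trade->children('EVENT');
    next unless $matches;

    # Keep the trade only if its version is numerically higher than the stored one.
    my $ref = $trade->att('ref');
    $best{$ref} = $trade
        if !$best{$ref} || $trade->att('version') > $best{$ref}->att('version');
}

print "<STATEMENT>\n";
print $best{$_}->sprint, "\n" for sort keys %best;
print "</STATEMENT>\n";
```

Each selected element is printed individually with sprint, so earlier siblings that were not selected are never written out.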
For very large files, you can try the streaming interface:
$h = { {} } ;
stream :f file.xml :F /dev/null select TRADE {
    rm TRADE ;
    if (@origin='BANK'
        and EVENT[@type='PRO'][EVENTNAR[@key='USE'][@val='MY']]
    ) {
        $ref = @ref ;
        $record = xsh:lookup('h', $ref)/@version ;
        perl { $record ||= -1 } ;
        if (@version > $record) {
            $here = . ;
            perl { $h->{$ref} = $here } ;
        }
    }
} ;
create STATEMENT ;
for my $trade in { values %$h } mv $trade into STATEMENT ;
save :f output2.xml ;
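On MSWin, you have to use NUL instead of /dev/null. The program can still be memory hungry, because it needs to remember the whole output. If that is too much, you have to change it to process the file twice: in the first run, it remembers the maximal version for each ref; in the second run, it produces the output.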
$h = { {} } ;
stream :f file.xml :F /dev/null select TRADE {
    rm TRADE ;
    if (@origin='BANK'
        and EVENT[@type='PRO'][EVENTNAR[@key='USE'][@val='MY']]
    ) {
        $ref = @ref ;
        $record = xsh:lookup('h', $ref) ;
        perl { $record ||= -1 } ;
        if (@version > $record) {
            $record = @version ;
            perl { $h->{$ref} = $record } ;
        }
    }
} ;
stream :f file.xml :F output3.xml select TRADE {
    rm TRADE ;
    if not(@origin = 'BANK'
           and EVENT[@type='PRO'][EVENTNAR[@key='USE'][@val='MY']]
           and xsh:lookup('h', @ref) = @version
    ) rm . ;
} ;
If the combination of version + ref is unique, the final "if not" condition can be simplified.
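The two-pass idea itself does not depend on XSH2. As a minimal sketch in plain Perl, assuming the trades that pass the origin/EVENT/EVENTNAR filter have already been reduced to simple records (the @trades data and the %max_version and @kept names are made up for illustration), it looks roughly like this:

```perl
use strict;
use warnings;

# Hypothetical records standing in for top-level TRADE elements that
# already passed the origin/EVENT/EVENTNAR filter.
my @trades = (
    { ref => '1', version => 1 },
    { ref => '1', version => 2 },
    { ref => '2', version => 1 },
    { ref => '1', version => 10 },
);

# Pass 1: remember only the highest version per ref.
# The comparison is numeric (>), so 10 ranks above 2.
my %max_version;
for my $t (@trades) {
    my ( $ref, $version ) = @{$t}{qw(ref version)};
    $max_version{$ref} = $version
        if !exists $max_version{$ref} || $version > $max_version{$ref};
}

# Pass 2: keep a trade only if its version equals the remembered maximum.
my @kept = grep { $_->{version} == $max_version{ $_->{ref} } } @trades;

print "ref=$_->{ref} version=$_->{version}\n" for @kept;
```

Only the small %max_version hash has to stay in memory between the passes, which is the point of reading the file twice.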