Split Meta Type Data With A Regex
I have an array that stores data like this:
<WebPage>
<Action>Action Goes Here 1</Action>
<SystemData>SystemData Goes Here 1</SystemData>
<PageSatausData>PageSatausData Goes Here 1</PageSatausData>
<PageNameData>PageNameData Goes Here 1</PageNameData>
<TitleData>TitleData Goes Here 1</TitleData>
<KeywordData>KeywordData Goes Here 1</KeywordData>
<DescriptionData>DescriptionData Goes Here 1</DescriptionData>
<HeaderData>HeaderData Goes Here 1</HeaderData>
<BodyData>BodyData Goes Here 1</BodyData>
<FooterData>FooterData Goes Here 1</FooterData>
</WebPage>
<WebPage>
<Action>Action Goes Here 2</Action>
<SystemData>SystemData Goes Here 2</SystemData>
<PageSatausData>PageSatausData Goes Here 2</PageSatausData>
<PageNameData>PageNameData Goes Here 2</PageNameData>
<TitleData>TitleData Goes Here 2</TitleData>
<KeywordData>KeywordData Goes Here 2</KeywordData>
<DescriptionData>DescriptionData Goes Here 2</DescriptionData>
<HeaderData>HeaderData Goes Here 2</HeaderData>
<BodyData>BodyData Goes Here 2</BodyData>
<FooterData>FooterData Goes Here 2</FooterData>
</WebPage>
What I want to do is loop over it and assign each value to a variable, like so:
foreach my $Line (@Meta_Content) {
my($Var1,$Var2,$Var3,$Var4,$Var5,$Var6,$Var7,$Var8,$Var9,$Var10) = split (/\>\</,$Line,10);
print "Result: $Var1,$Var2,$Var3,$Var4,$Var5,$Var6,$Var7,$Var8,$Var9,$Var10<br>";
}
I am aware of the XML modules, but unfortunately in this case I need a regex to do the job, so modules are not an option.
Here is one way:
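#!/usr/bin/perl
use strict; use warnings; use Data::Dumper;
my $hash;
while (<DATA>) {
    if ( /<WebPage>/ ) {
        $hash = {};            # start a new record
    }
    elsif ( /<\/WebPage>/ ) {
        print Dumper $hash;    # record complete, dump it
    }
    elsif ( /^<(.+)>(.+)<\/\1>\s*/ ) {
        $hash->{$1} = $2;      # tag name => text; \1 matches the same tag name
    }
}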
__DATA__
<WebPage>
<Action>Action Goes Here 1</Action>
<SystemData>SystemData Goes Here 1</SystemData>
<PageSatausData>PageSatausData Goes Here 1</PageSatausData>
<PageNameData>PageNameData Goes Here 1</PageNameData>
<TitleData>TitleData Goes Here 1</TitleData>
<KeywordData>KeywordData Goes Here 1</KeywordData>
<DescriptionData>DescriptionData Goes Here 1</DescriptionData>
<HeaderData>HeaderData Goes Here 1</HeaderData>
<BodyData>BodyData Goes Here 1</BodyData>
<FooterData>FooterData Goes Here 1</FooterData>
</WebPage>
<WebPage>
<Action>Action Goes Here 2</Action>
<SystemData>SystemData Goes Here 2</SystemData>
<PageSatausData>PageSatausData Goes Here 2</PageSatausData>
<PageNameData>PageNameData Goes Here 2</PageNameData>
<TitleData>TitleData Goes Here 2</TitleData>
<KeywordData>KeywordData Goes Here 2</KeywordData>
<DescriptionData>DescriptionData Goes Here 2</DescriptionData>
<HeaderData>HeaderData Goes Here 2</HeaderData>
<BodyData>BodyData Goes Here 2</BodyData>
<FooterData>FooterData Goes Here 2</FooterData>
</WebPage>
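If the lines are already in an array such as @Meta_Content from the question, rather than coming from a filehandle, the same idea might be adapted roughly as below. This is only a sketch, assuming each array element holds one line of the data; the sub name parse_pages, the @pages array, and the shortened sample input are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn a list of lines into a list of hash refs,
# one per <WebPage>...</WebPage> block.
sub parse_pages {
    my @lines = @_;
    my ( @pages, $page );
    for my $line (@lines) {
        if ( $line =~ /<WebPage>/ ) {
            $page = {};                     # start a new record
        }
        elsif ( $line =~ /<\/WebPage>/ ) {
            push @pages, $page if $page;    # record is complete
            undef $page;
        }
        elsif ( $page and $line =~ /^\s*<([^>]+)>(.*)<\/\1>\s*$/ ) {
            $page->{$1} = $2;               # tag name => text
        }
    }
    return @pages;
}

# Assumed sample input, one line per element
my @Meta_Content = (
    '<WebPage>',
    '<Action>Action Goes Here 1</Action>',
    '<TitleData>TitleData Goes Here 1</TitleData>',
    '</WebPage>',
    '<WebPage>',
    '<Action>Action Goes Here 2</Action>',
    '<TitleData>TitleData Goes Here 2</TitleData>',
    '</WebPage>',
);

my @pages = parse_pages(@Meta_Content);
print "$_->{Action} / $_->{TitleData}\n" for @pages;
```

Keeping every record in @pages, instead of dumping each one as it closes, lets you look fields up by name later, for example $pages[1]{TitleData}.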
Instead of this:
my($Var1,$Var2,$Var3,$Var4,$Var5,$Var6,$Var7,$Var8,$Var9,$Var10) = split (/\>\</,$Line,10);
print "Result: $Var1,$Var2,$Var3,$Var4,$Var5,$Var6,$Var7,$Var8,$Var9,$Var10<br>";
You can write this:
my @pieces = split (/\>\</,$Line,10);
my $str = join ',', @pieces;
print "Results: $str <br>";
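And if you need to refer to individual items, then instead of $Var1 you can write $pieces[0], instead of $Var2 you can write $pieces[1], and so on.
See how much more concise that is? Beginners in every language try what you did. The rule is: if you find yourself writing variable names that differ only by a number, you should be storing the data in an array.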
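The same rule can be applied to the regex itself: a global match in list context returns all of its captures as one list, so the values land in an array without any numbered variables. A rough sketch, assuming a whole record is available as a single string with no nested tags; $record, @values and %field are illustrative names, not anything from the question.

```perl
use strict;
use warnings;

# Assumed: one whole record in a single string, with no nested tags
my $record = '<Action>Go</Action><TitleData>Home</TitleData><BodyData>Hello</BodyData>';

# /g in list context returns every capture: only the text values here
my @values = $record =~ /<\w+>([^<]*)<\/\w+>/g;

# Two captures per match give (tag, text) pairs, which fill a hash directly
my %field = $record =~ /<(\w+)>([^<]*)<\/\1>/g;

print join( ',', @values ), "\n";
print "$field{TitleData}\n";
```

The array keeps the values in document order, while the hash lets you ask for a value by its tag name; which one fits better depends on how the values are used afterwards.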