Parsing XML in Perl

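I want to parse this XML using Perl. The XML shown here is only part of a larger nested XML document. I have tried the usual parsers, and most of them return a hash that is hard to read and makes the child nodes hard to access.

I want to get the elements and read all of their attribute values.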

<?xml version="1.0" encoding="utf-8" standalone="no"?>
<TR name="App.exe" total="573" errors="1" failures="2" not-run="4" inconclusive="2" ignored="4" skipped="0" invalid="0" date="2015-01-12" time="17:43:59">
  <environment version="2" cversion="44" os-version="Microsoft" platform="Win32NT" cwd="" machine-name="" user="me" user-domain="domain" />
  <culture-info current-culture="en-US" current-uiculture="en-US" />
  <TS type="Assembly" name="App.exe" executed="True" result="Failure" success="False" time="22" asserts="0">
    <RS>
      <TS type="Namespace" name="MyAPP" executed="True" result="Failure" success="False" time="2335.164" asserts="0">
        <RS>
          <TS type="Namespace" name="Project" executed="True" result="Failure" success="False" time="2335.164" asserts="0">
            <RS>
              <TS type="Namespace" name="Website" executed="True" result="Failure" success="False" time="2335.164" asserts="0">
                <RS>
                  <TS type="Namespace" name="Service" executed="True" result="Failure" success="False" time="2335.163" asserts="0">
                    <RS>
                      <TS type="SetUpFixture" name="Tests" executed="True" result="Failure" success="False" time="2335.163" asserts="0">
                        <RS>
                          <TS type="Namespace" name="tempt" executed="True" result="Success" success="True" time="8.935" asserts="0">
                            <RS>
                              <TS type="ParameterizedFixture" name="TempAPI" executed="True" result="Success" success="True" time="8.935" asserts="0">
                                <RS>
                                  <TS type="TestFixture" name="Admin" executed="True" result="Success" success="True" time="3.306" asserts="2">
                                    <RS>
                                      <TC name="testName1" executed="True" result="Success" success="True" time="0.352" asserts="0" />
                                      <TC name="testName2" executed="True" result="Success" success="True" time="0.005" asserts="0" />
                                    </RS>
                                  </TS>
                                  <TS type="TestFixture" name="Client" executed="True" result="Success" success="True" time="2.620" asserts="1">
                                    <RS>
                                      <TC name="testName3" executed="True" result="Success" success="True" time="0.319" asserts="0" />
                                      <TC name="testName4" executed="True" result="Success" success="True" time="0.000" asserts="0" />
                                    </RS>
                                  </TS>
                                  <TS type="TestFixture" name="Employee" executed="True" result="Success" success="True" time="3.007" asserts="1">
                                    <RS>
                                      <TC name="testName5" executed="True" result="Success" success="True" time="0.290" asserts="0" />
                                      <TC name="testName6" executed="True" result="Success" success="True" time="0.000" asserts="0" />
                                    </RS>
                                  </TS>
                                </RS>
                              </TS>
                            </RS>
                          </TS>
                        </RS>
                      </TS>
                    </RS>
                  </TS>
                </RS>
              </TS>
            </RS>
          </TS>
        </RS>
      </TS>
    </RS>
  </TS>
</TR>

I tried the following. As I said, it produces hash output that is hard to read and hard to pull details out of.

use XML::Simple;                  # XMLin
use Data::Dumper;                 # Dumper
use File::Slurp qw(write_file);   # write_file

my $list = XMLin('F:\Sample.xml', KeepRoot => 1);

#print $list->{TR}{TS}{name};
print Dumper($list);
write_file 'F:\mydump.log', Dumper($list);

I need suggestions for a parser whose output is easier to read than a hash.

With XML::Simple I get the following format:

$VAR1 = {
          'TR' => {
                  'failures' => '2',
                  'TS' => {
                          'asserts' => '0',
                          'success' => 'False',
                          'time' => '22',
                          'name' => 'App.exe',
                          'executed' => 'True',
                          'type' => 'Assembly',
                          'RS' => {
                                  'TS' => {
                                          'asserts' => '0',
                                          'success' => 'False',
                                          'time' => '2335.164',
                                          'name' => 'MyAPP',
                                          'executed' => 'True',
                                          'type' => 'Namespace',
                                          'RS' => {
                                                  'TS' => {
                                                          'asserts' => '0',
                                                          'success' => 'False',
                                                          'time' => '2335.164',
                                                          'name' => 'Project',
                                                          'executed' => 'True',
                                                          'type' => 'Namespace',
                                                          'RS' => {
                                                                  'TS' => {
                                                                          'asserts' => '0',
                                                                          'success' => 'False',
                                                                          'time' => '2335.164',
                                                                          'name' => 'Web',
                                                                          'executed' => 'True',
                                                                          'type' => 'Namespace',
                                                                          'RS' => {
                                                                                  'TS' => {
                                                                                          'asserts' => '0',
                                                                                          'success' => 'False',
                                                                                          'time' => '2335.163',
                                                                                          'name' => 'Server',
                                                                                          'executed' => 'True',
                                                                                          'type' => 'Namespace',
                                                                                          'RS' => {
                                                                                                  'TS' => {
                                                                                                          'asserts' => '0',
                                                                                                          'success' => 'False',
                                                                                                          'time' => '2335.163',
                                                                                                          'name' => 'Tests',

                                                                                                                                                          'Client' => {
                                                                                                                                                                      'success' => 'True',
                                                                                                                                                                      'asserts' => '1',
                                                                                                                                                                      'time' => '2.620',
                                                                                                                                                                      'executed' => 'True',
                                                                                                                                                                      'type' => 'TestFixture',
                                                                                                                                                                      'RS' => {
                                                                                                                                                                              'TC' => {
                                                                                                                                                                                      'testName3' => {
                                                                                                                                                                                                     'success' => 'True',
                                                                                                                                                                                                     'asserts' => '0',
                                                                                                                                                                                                     'time' => '0.319',
                                                                                                                                                                                                     'executed' => 'True',
                                                                                                                                                                                                     'result' => 'Success'
                                                                                                                                                                                                   },
                                                                                                                                                                                      'testName4' => {
                                                                                                                                                                                                     'success' => 'True',
                                                                                                                                                                                                     'asserts' => '0',
                                                                                                                                                                                                     'time' => '0.000',
                                                                                                                                                                                                     'executed' => 'True',
                                                                                                                                                                                                     'result' => 'Success'
                                                                                                                                                                                                   }
                                                                                                                                                                                    }
                                                                                                                                                                            },
                                                                                                                                                                      'result' => 'Success'
                                                                                                                                                                    },
                                                                                                                                                          'Admin' => {
                                                                                                                                                                     'success' => 'True',
                                                                                                                                                                     'asserts' => '2',
                                                                                                                                                                     'time' => '3.306',
                                                                                                                                                                     'executed' => 'True',
                                                                                                                                                                     'type' => 'TestFixture',
                                                                                                                                                                     'RS' => {
                                                                                                                                                                             'TC' => {
                                                                                                                                                                                     'testName1' => {
                                                                                                                                                                                                    'success' => 'True',
                                                                                                                                                                                                    'asserts' => '0',
                                                                                                                                                                                                    'time' => '0.352',
                                                                                                                                                                                                    'executed' => 'True',
                                                                                                                                                                                                    'result' => 'Success'
                                                                                                                                                                                                  },
                                                                                                                                                                                     'testName2' => {
                                                                                                                                                                                                    'success' => 'True',
                                                                                                                                                                                                    'asserts' => '0',
                                                                                                                                                                                                    'time' => '0.005',
                                                                                                                                                                                                    'executed' => 'True',
                                                                                                                                                                                                    'result' => 'Success'
                                                                                                                                                                                                  }
                                                                                                                                                                                   }
                                                                                                                                                                           },
                                                                                                                                                                     'result' => 'Success'
                                                                                                                                                                   }
                                                                                                                                                        }
                                                                                                                                                },
                                                                                                                                          'result' => 'Success'
                                                                                                                                        }
                                                                                                                                },
                                                                                                                          'result' => 'Success'
                                                                                                                        }
                                                                                                                },
                                                                                                          'result' => 'Failure'
                                                                                                        }
                                                                                                },
                                                                                          'result' => 'Failure'
                                                                                        }
                                                                                },
                                                                          'result' => 'Failure'
                                                                        }
                                                                },
                                                          'result' => 'Failure'
                                                        }
                                                },
                                          'result' => 'Failure'
                                        }
                                },
                          'result' => 'Failure'
                        },
                  'culture-info' => {
                                    'current-culture' => 'en-US',
                                    'current-uiculture' => 'en-US'
                                  },
                  'errors' => '1',
                  'time' => '17:43:59',
                  'date' => '2015-01-12',
                  'not-run' => '4',
                  'name' => 'App.exe',
                  'ignored' => '4',
                  'total' => '573',
                  'skipped' => '0',
                  'environment' => {
                                   'user-domain' => 'domain',
                                   'nunit-version' => '2.6.3.13283',
                                   'os-version' => 'Microsoft Windows NT 6.2.9200.0',
                                   'cwd' => '',
                                   'user' => 'me',
                                   'platform' => 'Win32NT',
                                   'clr-version' => '4.0.30319.34014',
                                   'machine-name' => ''
                                 },
                  'inconclusive' => '2',
                  'invalid' => '0'
                }
        };

Per the comments, if you only want the TC nodes, you can parse the XML file and walk its nodes, extracting or printing the information you want whenever a node's tag is TC.
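A minimal sketch of that traversal, assuming XML::Twig is installed. The helper name `tc_attributes` and the inline sample are illustrative and not part of the question; the sample is a cut-down fragment of the XML above.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: parse an XML string and return one hash reference
# of attributes per TC element, in document order.
sub tc_attributes {
    my ($xml) = @_;
    my $twig = XML::Twig->new;
    $twig->parse($xml);
    # descendants('TC') walks the tree below the root and keeps only TC nodes.
    # Each attribute hash is copied so callers cannot modify the twig.
    return map { +{ %{ $_->atts || {} } } } $twig->root->descendants('TC');
}

my $sample = <<'XML';
<TS type="TestFixture" name="Admin">
  <RS>
    <TC name="testName1" result="Success" time="0.352" />
    <TC name="testName2" result="Success" time="0.005" />
  </RS>
</TS>
XML

# Print each TC element's attributes on one line, sorted by attribute name.
for my $tc ( tc_attributes($sample) ) {
    print join( ', ', map { "$_=$tc->{$_}" } sort keys %$tc ), "\n";
}
```

Because the nesting depth of the TS/RS wrappers does not matter to `descendants`, the same call should work on the full file.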

Alternatively, you can use a regular expression to capture the TC nodes as you read the file, and then extract the information you want.
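A rough sketch of the regular-expression route, using core Perl only. It assumes every TC element is a self-closing tag with double-quoted attribute values, as in the sample above; a regex is not a general XML parser and will break on comments, CDATA, or differently quoted attributes. The helper name `extract_tc_attributes` is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: scan raw XML text for self-closing <TC ... /> tags and
# return one hash reference of attributes per tag, in order of appearance.
sub extract_tc_attributes {
    my ($text) = @_;
    my @found;
    # Outer match: capture everything between "<TC" and the closing "/>".
    while ( $text =~ m{<TC\b([^>]*?)/>}g ) {
        my $attr_text = $1;
        my %attrs;
        # Inner match: pull out each name="value" pair.
        while ( $attr_text =~ /([\w.:-]+)\s*=\s*"([^"]*)"/g ) {
            $attrs{$1} = $2;
        }
        push @found, \%attrs;
    }
    return @found;
}

my $line = '<TC name="testName3" executed="True" result="Success" time="0.319" />';
for my $tc ( extract_tc_attributes($line) ) {
    print "$tc->{name}: $tc->{result} in $tc->{time}s\n";
}
```

To process a whole file, the text could be read line by line or slurped and passed to the same helper.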

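What the XML parser gives you is exactly what you dumped, which is what one would expect to get, so I am not sure what you are hoping for. A flat structure with no nesting?

Don't use XML::Simple. The name is a misnomer. It is not simple at all; it is for simple XML.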

The use of this module in new code is discouraged.

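Try XML::Twig.

Part of your problem is simple: you have a deeply nested XML structure, and there are only so many ways to 'display' it.

What almost every XML parser does is turn your XML into a Perl data structure, which is usually a hash. What it will usually also do is let you print that structure back out as 'proper' XML.

So for a simple reformatting task, XML::Twig lets you do this: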

#!/usr/bin/perl
use strict;
use warnings;

use XML::Twig;

# Called for each TC element; prints every attribute as "name = value".
sub handle_tc {
    my ( $twig, $tc ) = @_;
    foreach my $attr ( keys %{ $tc->atts } ) {
        print "$attr = " . $tc->att($attr) . "\n";
    }
    print "\n";
}

# Parse the XML source file (not the Dumper log written by the question's code).
my $twig_parser = XML::Twig->new(
    pretty_print  => 'indented',
    twig_handlers => { 'TC' => \&handle_tc },
)->parsefile('F:\Sample.xml');

print "\n\nWhole XML pretty_print\n\n";
$twig_parser->print;

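This prints every attribute of each TC element. Each time the parser encounters a TC element, it calls the handler with that subset of the XML.

For comparison, $twig_parser->print reformats the output according to the 'pretty_print' option. (Given your source XML, though, it probably won't change much.)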