PHP Parse XML response with many namespaces

Question

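Is there a way to parse an XML response in PHP that takes all the namespaced nodes into account and converts it to an object or array, without knowing all the node names in advance?

For example, convert this: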

<?xml version="1.0" encoding="ISO-8859-1"?>
<serv:message xmlns:serv="http://www.webex.com/schemas/2002/06/service"
    xmlns:com="http://www.webex.com/schemas/2002/06/common"
    xmlns:att="http://www.webex.com/schemas/2002/06/service/attendee">
    <serv:header>
        <serv:response>
            <serv:result>SUCCESS</serv:result>
            <serv:gsbStatus>PRIMARY</serv:gsbStatus>
        </serv:response>
    </serv:header>
    <serv:body>
        <serv:bodyContent xsi:type="att:lstMeetingAttendeeResponse"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
            <att:attendee>
                <att:person>
                    <com:name>James Kirk</com:name>
                    <com:firstName>James</com:firstName>
                    <com:lastName>Kirk</com:lastName>
                    <com:address>
                        <com:addressType>PERSONAL</com:addressType>
                    </com:address>
                    <com:phones />
                    <com:email>Jkirk@sz.webex.com</com:email>
                    <com:type>VISITOR</com:type>
                </att:person>
                <att:contactID>28410622</att:contactID>
                <att:joinStatus>INVITE</att:joinStatus>
                <att:meetingKey>803754412</att:meetingKey>
            </att:attendee>
        </serv:bodyContent>
    </serv:body>
</serv:message>

into something like this:

['message' => [
    'header' => [
        'response' => [
            'result' => 'SUCCESS',
            'gsbStatus' => 'PRIMARY'
        ]
    ],
    'body' => [
        'bodyContent' => [
            'attendee' => [
                'person' => [
                    'name' => 'James Kirk',
                    'firstName' => 'James',
                    ...
                ],
                'contactID' => 28410622,
                ...
            ]
        ]
    ]
]]

I know it's easy with non-namespaced nodes, but I don't know where to begin on something like this.

Answer 1

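Don't do a generic conversion to an array. Just load the XML and read it. With DOM + XPath this is not difficult.

A generic conversion means you lose information (the namespaces) and functionality (XPath).

First create a DOM document and load the XML: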

// $xml holds the XML response as a string
$dom = new DOMDocument();
$dom->loadXML($xml);

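Now create a DOMXPath instance for the DOM and register prefixes for the namespaces. These can be the prefixes used in the XML document or different ones; only the namespace URI has to match.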

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('serv', 'http://www.webex.com/schemas/2002/06/service');
$xpath->registerNamespace('com', 'http://www.webex.com/schemas/2002/06/common');
$xpath->registerNamespace('att', 'http://www.webex.com/schemas/2002/06/service/attendee');
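
Because a prefix is only an alias for a namespace URI, the names registered on the XPath object do not have to match the document. A minimal sketch, assuming a shortened version of the response; the alias s is chosen here purely for illustration:

```php
<?php
// Shortened sample response; only the parts needed for this demonstration.
$xml = <<<'XML'
<serv:message xmlns:serv="http://www.webex.com/schemas/2002/06/service">
    <serv:header>
        <serv:response>
            <serv:result>SUCCESS</serv:result>
        </serv:response>
    </serv:header>
</serv:message>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);
// 's' is an arbitrary alias (an assumption for this sketch);
// only the URI has to match the one used in the document.
$xpath->registerNamespace('s', 'http://www.webex.com/schemas/2002/06/service');

$result = $xpath->evaluate('string(/s:message/s:header/s:response/s:result)');
var_dump($result); // string(7) "SUCCESS"
```

This is why code written against the URIs keeps working even if the sender later changes the prefixes in its responses.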

Use the registered prefixes in XPath expressions to fetch values and nodes:

var_dump(
  $xpath->evaluate('string(/serv:message/serv:header/serv:response/serv:result)')
);

Output:

string(7) "SUCCESS"

Fetch all attendee elements and output their names:

foreach ($xpath->evaluate('/serv:message/serv:body/serv:bodyContent/att:attendee') as $attendee) {
  // The second argument makes the expression relative to the current attendee
  var_dump(
    $xpath->evaluate('string(att:person/com:name)', $attendee)
  );
}

Output:

string(10) "James Kirk"
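
If an array is still wanted at the end, the same approach can fill one with just the fields that are needed. The following is a rough sketch under that assumption, using a shortened sample response; the variable $attendees and the choice of fields are illustrative only:

```php
<?php
// Shortened sample response with a single attendee.
$xml = <<<'XML'
<serv:message xmlns:serv="http://www.webex.com/schemas/2002/06/service"
    xmlns:com="http://www.webex.com/schemas/2002/06/common"
    xmlns:att="http://www.webex.com/schemas/2002/06/service/attendee">
    <serv:body>
        <serv:bodyContent>
            <att:attendee>
                <att:person>
                    <com:name>James Kirk</com:name>
                    <com:email>Jkirk@sz.webex.com</com:email>
                </att:person>
                <att:contactID>28410622</att:contactID>
                <att:joinStatus>INVITE</att:joinStatus>
            </att:attendee>
        </serv:bodyContent>
    </serv:body>
</serv:message>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('serv', 'http://www.webex.com/schemas/2002/06/service');
$xpath->registerNamespace('com', 'http://www.webex.com/schemas/2002/06/common');
$xpath->registerNamespace('att', 'http://www.webex.com/schemas/2002/06/service/attendee');

$attendees = [];
foreach ($xpath->evaluate('/serv:message/serv:body/serv:bodyContent/att:attendee') as $attendee) {
    // Each expression is evaluated relative to the current attendee node;
    // string() yields '' when the node is missing, so no existence checks are needed.
    $attendees[] = [
        'name'       => $xpath->evaluate('string(att:person/com:name)', $attendee),
        'email'      => $xpath->evaluate('string(att:person/com:email)', $attendee),
        'contactID'  => (int) $xpath->evaluate('string(att:contactID)', $attendee),
        'joinStatus' => $xpath->evaluate('string(att:joinStatus)', $attendee),
    ];
}

var_dump($attendees);
```

Unlike a generic conversion, this states explicitly which namespace each value comes from, and the resulting array contains only the data the application actually uses.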

Answer 2

(Read @ThW's answer on why an array is not actually that important.)

I know it's easy with non-namespaced nodes, but I don't know where to begin on something like this.

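It is just as easy with namespaced nodes, because technically they are the same. Take a simple example: the following script loops over all elements in the document regardless of namespace ($xml here is a SimpleXMLElement holding the response, for example one returned by simplexml_load_string()):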

$result = $xml->xpath('//*');
foreach ($result as $element) {
    $depth = count($element->xpath('./ancestor::*'));
    $indent = str_repeat('  ', $depth);
    printf("%s %s\n", $indent, $element->getName());
}

The output is:

 message
   header
     response
       result
       gsbStatus
   body
     bodyContent
       attendee
         person
           name
           firstName
           lastName
           address
             addressType
           phones
           email
           type
         contactID
         joinStatus
         meetingKey

As you can see, you can traverse all the elements as if they had no namespace at all.
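
The local name returned by getName() hides which vocabulary an element belongs to, but that information is still available per element. A small sketch, assuming a simplified, made-up excerpt of the response, that prints the namespace URI next to each local name by reading it from the underlying DOM node:

```php
<?php
// Simplified excerpt (structure shortened for illustration).
$buffer = <<<'XML'
<serv:message xmlns:serv="http://www.webex.com/schemas/2002/06/service"
    xmlns:com="http://www.webex.com/schemas/2002/06/common">
    <serv:body>
        <com:name>James Kirk</com:name>
    </serv:body>
</serv:message>
XML;

$xml = simplexml_load_string($buffer);

$lines = [];
foreach ($xml->xpath('//*') as $element) {
    // getName() gives the local name only; the namespace URI is taken
    // from the DOM node that backs the SimpleXMLElement.
    $uri = dom_import_simplexml($element)->namespaceURI;
    $lines[] = sprintf('%s {%s}', $element->getName(), $uri);
}

echo implode("\n", $lines), "\n";
```

Two elements with the same local name but different URIs would be distinguishable in this listing, which is exactly the information a namespace-free array would drop.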

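But as mentioned, when you ignore the namespaces you also lose important information. For example, in the document you have, you are actually interested in the attendee and common elements; the service elements only deal with the transport: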

$uriAtt = 'http://www.webex.com/schemas/2002/06/service/attendee';
$xml->registerXPathNamespace('att', $uriAtt);

$uriCom = 'http://www.webex.com/schemas/2002/06/common';
$xml->registerXPathNamespace('com', $uriCom);

$result = $xml->xpath('//att:*|//com:*');
foreach ($result as $element) {
    $depth  = count($element->xpath("./ancestor::*[namespace-uri(.) = '$uriAtt' or namespace-uri(.) = '$uriCom']"));
    $indent = str_repeat('  ', $depth);
    printf("%s %s\n", $indent, $element->getName());
}

This time the output is:

 attendee
   person
     name
     firstName
     lastName
     address
       addressType
     phones
     email
     type
   contactID
   joinStatus
   meetingKey

So why strip all the namespaces? They help you get at the elements you are interested in, and you can also obtain them dynamically.
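
One possible reading of "dynamically" is to collect the namespace declarations from the document itself rather than hard-coding them. A hedged sketch using SimpleXMLElement::getDocNamespaces() on a shortened sample; note that it relies on the prefixes chosen by the sender, which are less stable than the URIs:

```php
<?php
// Shortened sample response.
$buffer = <<<'XML'
<serv:message xmlns:serv="http://www.webex.com/schemas/2002/06/service"
    xmlns:com="http://www.webex.com/schemas/2002/06/common"
    xmlns:att="http://www.webex.com/schemas/2002/06/service/attendee">
    <serv:body>
        <serv:bodyContent>
            <att:attendee>
                <att:person>
                    <com:name>James Kirk</com:name>
                </att:person>
                <att:contactID>28410622</att:contactID>
            </att:attendee>
        </serv:bodyContent>
    </serv:body>
</serv:message>
XML;

$xml = simplexml_load_string($buffer);

// Collect every prefix => URI pair declared anywhere in the document
// (true = also look at declarations on descendant elements).
$namespaces = $xml->getDocNamespaces(true);

foreach ($namespaces as $prefix => $uri) {
    // A default namespace has an empty prefix; XPath 1.0 needs a non-empty one.
    if ($prefix !== '') {
        $xml->registerXPathNamespace($prefix, $uri);
    }
}

$names = [];
foreach ($xml->xpath('//att:person/com:name') as $name) {
    $names[] = (string) $name;
}

var_dump($names);
```

If the sender ever renames a prefix, the XPath expressions here would stop matching, so registering fixed aliases against the known URIs remains the more robust choice when the URIs are known in advance.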
