Reading nasty XML with PHP

I'm learning PHP and I'm struggling with this XML file.
I'd like to know how to read it and get some data out of it.

Can anyone explain this to me:

| - /data/         <- all zip files are here
| - myscript.php   <- uses zip:// to read a zip without extracting it, but can't reach files in other folders
| - index.html     <- my form
| - styles.css     <- styling

myscript.php

if (file_exists($filename.'.zip')) {
    $xml = simplexml_load_file('zip://'.$filename.'.zip'.'#'.$filename.'.xml');
    if ($xml === false) {
        die('Error opening zip');
    }
    $register = $xml->getDocNamespaces(TRUE);
    foreach ($register as $i) {
        $i->registerXPathNamespace();
    }
    // The following lines do not work
    $name = $xml->{'cac:AccountingCustomerParty'}->{'cac:Party'}->{'cac:PartyIdentification'}->{'cbc:ID'};
    $name2 = $xml->{'cac:LegalMonetaryTotal'}->{'cbc:PayableAmount'};
    // And I don't know why
    echo $name, $name2;
} else {
    die('File not found');
}

filename.xml

<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:ccts="urn:un:unece:uncefact:documentation:2" xmlns:ds="http://www.w3.org/2000/09/xmldsig#" xmlns:ext="urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2" xmlns:qdt="urn:oasis:names:specification:ubl:schema:xsd:QualifiedDatatypes-2" xmlns:udt="urn:un:unece:uncefact:data:specification:UnqualifiedDataTypesSchemaModule:2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <ext:UBLExtensions>
        <ext:UBLExtension>
            <ext:ExtensionContent>
            <ds:Signature Id="SignatureSP"><SignedInfo xmlns="http://www.w3.org/2000/09/xmldsig#"><CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315" /><SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1" /><Reference URI=""><Transforms><Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature" /></Transforms><DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" /><DigestValue>code</DigestValue></Reference></SignedInfo><SignatureValue xmlns="http://www.w3.org/2000/09/xmldsig#">something</SignatureValue><KeyInfo xmlns="http://www.w3.org/2000/09/xmldsig#"><X509Data><X509SubjectName>something</X509SubjectName><X509Certificate>abc</X509Certificate></X509Data></KeyInfo></ds:Signature></ext:ExtensionContent>
        </ext:UBLExtension>
    </ext:UBLExtensions>
    <cbc:UBLVersionID>2.1</cbc:UBLVersionID>
    <cbc:CustomizationID schemeAgencyName="PE:SUNAT">2.0</cbc:CustomizationID>
    <cbc:ProfileID schemeName="Tipo de Operacion" schemeAgencyName="PE:SUNAT" schemeURI="urn:pe:gob:sunat:cpe:see:gem:catalogos:catalogo51">0101</cbc:ProfileID>
    <cbc:ID>F002-0006068</cbc:ID>
    <cbc:IssueDate>2020-08-31</cbc:IssueDate>
    <cbc:IssueTime>13:10:29</cbc:IssueTime>
    <cbc:DueDate>2020-08-31</cbc:DueDate>
    <cbc:InvoiceTypeCode listID="0101" listAgencyName="PE:SUNAT" listName="Tipo de Documento" listURI="urn:pe:gob:sunat:cpe:see:gem:catalogos:catalogo01">01</cbc:InvoiceTypeCode>
    <cbc:Note languageLocaleID="1000">UN MIL NOVECIENTOS OCHENTA Y OCHO CON 00/100 Dólares Americanos       </cbc:Note>
    <cbc:DocumentCurrencyCode listID="ISO 4217 Alpha" listName="Currency" listAgencyName="United Nations Economic Commission for Europe">USD</cbc:DocumentCurrencyCode>
    <cbc:LineCountNumeric>1</cbc:LineCountNumeric>
 <cac:Signature>
  <cbc:ID>SFF002-0006068</cbc:ID>
  <cac:SignatoryParty>
   <cac:PartyIdentification>
    <cbc:ID schemeID="6" schemeName="Documento de Identidad" schemeAgencyName="PE:SUNAT" schemeURI="urn:pe:gob:sunat:cpe:see:gem:catalogos:catalogo06">20228938824</cbc:ID>
   </cac:PartyIdentification>
   <cac:PartyName>
    <cbc:Name><![CDATA[GOMEZ R.L.]]></cbc:Name>
   </cac:PartyName>
  </cac:SignatoryParty>
  <cac:DigitalSignatureAttachment>
   <cac:ExternalReference>
   <cbc:URI>#SFF002-0006068</cbc:URI>
  </cac:ExternalReference>
  </cac:DigitalSignatureAttachment>
 </cac:Signature>
    <cac:AccountingSupplierParty>
        <cac:Party>
            <cac:PartyIdentification>
                <cbc:ID schemeID="6" schemeName="Documento de Identidad" schemeAgencyName="PE:SUNAT" schemeURI="urn:pe:gob:sunat:cpe:see:gem:catalogos:catalogo06">20228938824</cbc:ID>
            </cac:PartyIdentification>
            <cac:PartyName>
                <cbc:Name><![CDATA[GOMEZ R.L.]]></cbc:Name>
            </cac:PartyName>
            <cac:PartyTaxScheme>
                <cbc:RegistrationName><![CDATA[GOMEZ R.L.]]></cbc:RegistrationName>
                <cbc:CompanyID schemeID="6" schemeName="SUNAT:Identificador de Documento de Identidad" schemeAgencyName="PE:SUNAT" schemeURI="urn:pe:gob:sunat:cpe:see:gem:catalogos:catalogo06">20228931124</cbc:CompanyID>
                <cac:TaxScheme>
                    <cbc:ID schemeID="6" schemeName="SUNAT:Identificador de Documento de Identidad" schemeAgencyName="PE:SUNAT" schemeURI="urn:pe:gob:sunat:cpe:see:gem:catalogos:catalogo06">20228931124</cbc:ID>
                </cac:TaxScheme>
            </cac:PartyTaxScheme>
            <cac:PartyLegalEntity>
                <cbc:RegistrationName><![CDATA[GOMEZ R.L.]]></cbc:RegistrationName>
                <cac:RegistrationAddress>
                    <cbc:ID schemeName="Ubigeos" schemeAgencyName="PE:INEI">040104</cbc:ID>
                    <cbc:AddressTypeCode listAgencyName="PE:SUNAT" listName="Establecimientos anexos">0000</cbc:AddressTypeCode>
                    <cbc:CityName><![CDATA[Colorado]]></cbc:CityName>
                    <cbc:District><![CDATA[Colorado]]></cbc:District>
                    <cac:AddressLine>
                        <cbc:Line><![CDATA[VIA]]></cbc:Line>
                    </cac:AddressLine>
                    <cac:Country>
                        <cbc:IdentificationCode listID="ISO 3166-1" listAgencyName="United Nations Economic Commission for Europe" listName="Country">PE</cbc:IdentificationCode>
                    </cac:Country>
                </cac:RegistrationAddress>
            </cac:PartyLegalEntity>
        </cac:Party>
    </cac:AccountingSupplierParty>
    <cac:AccountingCustomerParty>
        <cac:Party>
            <cac:PartyIdentification>
                <cbc:ID schemeID="6" schemeName="Documento de Identidad" schemeAgencyName="PE:SUNAT" schemeURI="urn:pe:gob:sunat:cpe:see:gem:catalogos:catalogo06">20455870969</cbc:ID>
            </cac:PartyIdentification>
            <cac:PartyName>
                <cbc:Name><![CDATA[AGZ]]></cbc:Name>
            </cac:PartyName>
            <cac:PartyTaxScheme>
                <cbc:RegistrationName><![CDATA[AGZ]]></cbc:RegistrationName>
                <cbc:CompanyID schemeID="6" schemeName="SUNAT:Identificador de Documento de Identidad" schemeAgencyName="PE:SUNAT" schemeURI="urn:pe:gob:sunat:cpe:see:gem:catalogos:catalogo06">20455870969</cbc:CompanyID>
                <cac:TaxScheme>
                    <cbc:ID schemeID="6" schemeName="SUNAT:Identificador de Documento de Identidad" schemeAgencyName="PE:SUNAT" schemeURI="urn:pe:gob:sunat:cpe:see:gem:catalogos:catalogo06">20455870969</cbc:ID>
                </cac:TaxScheme>
            </cac:PartyTaxScheme>
            <cac:PartyLegalEntity>
                <cbc:RegistrationName><![CDATA[AGZ]]></cbc:RegistrationName>
                <cac:RegistrationAddress>
                    <cac:AddressLine>
                        <cbc:Line><![CDATA[VIA]]></cbc:Line>
                    </cac:AddressLine>
                    <cac:Country>
                        <cbc:IdentificationCode listID="ISO 3166-1" listAgencyName="United Nations Economic Commission for Europe" listName="Country">PE</cbc:IdentificationCode>
                    </cac:Country>
                </cac:RegistrationAddress>
            </cac:PartyLegalEntity>
        </cac:Party>
    </cac:AccountingCustomerParty>
    <cac:TaxTotal>
        <cbc:TaxAmount currencyID="USD">0.00</cbc:TaxAmount>
        <cac:TaxSubtotal>
            <cbc:TaxableAmount currencyID="USD">1988.00</cbc:TaxableAmount>
            <cbc:TaxAmount currencyID="USD">0.00</cbc:TaxAmount>
            <cac:TaxCategory>
                <cbc:ID schemeID="UN/ECE 5305" schemeName="Tax Category Identifier" schemeAgencyName="United Nations Economic Commission for Europe">O</cbc:ID>
                <cac:TaxScheme>
                    <cbc:ID schemeID="UN/ECE 5153" schemeAgencyID="6">9998</cbc:ID>
                    <cbc:Name>INA</cbc:Name>
                    <cbc:TaxTypeCode>FRE</cbc:TaxTypeCode>
                </cac:TaxScheme>
            </cac:TaxCategory>
        </cac:TaxSubtotal>
   </cac:TaxTotal>
   <cac:LegalMonetaryTotal>
        <cbc:LineExtensionAmount currencyID="USD">1988.00</cbc:LineExtensionAmount>
        <cbc:PayableAmount currencyID="USD">1988.00</cbc:PayableAmount>
   </cac:LegalMonetaryTotal>
 <cac:InvoiceLine>
        <cbc:ID>1</cbc:ID>
        <cbc:InvoicedQuantity unitCode="NIU" unitCodeListID="UN/ECE rec 20" unitCodeListAgencyName="United Nations Economic Commission for Europe">1.00</cbc:InvoicedQuantity>
        <cbc:LineExtensionAmount currencyID="USD">0.00</cbc:LineExtensionAmount>
        <cac:PricingReference>
            <cac:AlternativeConditionPrice>
                <cbc:PriceAmount currencyID="USD">1988.00</cbc:PriceAmount>
                <cbc:PriceTypeCode listName="Tipo de Precio" listAgencyName="PE:SUNAT" listURI="urn:pe:gob:sunat:cpe:see:gem:catalogos:catalogo16">01</cbc:PriceTypeCode>
            </cac:AlternativeConditionPrice>
        </cac:PricingReference>
        <cac:TaxTotal>
            <cbc:TaxAmount currencyID="USD">0.00</cbc:TaxAmount>
            <cac:TaxSubtotal>
                <cbc:TaxableAmount currencyID="USD">0.00</cbc:TaxableAmount>
                <cbc:TaxAmount currencyID="USD">0.00</cbc:TaxAmount>
                <cac:TaxCategory>
                    <cbc:ID schemeID="UN/ECE 5305" schemeName="Tax Category Identifier" schemeAgencyName="United Nations Economic Commission for Europe">O</cbc:ID>
                    <cbc:Percent>0.00</cbc:Percent>
                    <cbc:TaxExemptionReasonCode listAgencyName="PE:SUNAT" listName="Afectacion del IGV" listURI="urn:pe:gob:sunat:cpe:see:gem:catalogos:catalogo07">30</cbc:TaxExemptionReasonCode>
                    <cac:TaxScheme>
                        <cbc:ID schemeID="UN/ECE 5153" schemeAgencyID="6">9998</cbc:ID>
                        <cbc:Name>INA</cbc:Name>
                        <cbc:TaxTypeCode>FRE</cbc:TaxTypeCode>
                    </cac:TaxScheme>
                </cac:TaxCategory>
            </cac:TaxSubtotal>
        </cac:TaxTotal>
        <cac:Item>
            <cbc:Description><![CDATA[-]]></cbc:Description>
            <cac:SellersItemIdentification>
                <cbc:ID><![CDATA[000982]]></cbc:ID>
            </cac:SellersItemIdentification>
   <cac:CommodityClassification>
     <cbc:ItemClassificationCode listID="UNSPSC" listAgencyName="GS1 US" listName="Item Classification">78101806</cbc:ItemClassificationCode>
   </cac:CommodityClassification>
        </cac:Item>
        <cac:Price>
            <cbc:PriceAmount currencyID="USD">1988.00</cbc:PriceAmount>
        </cac:Price>
    </cac:InvoiceLine>
 </Invoice>

I'm pulling my hair out, even though this is probably simple. Any help is much appreciated.

All you need to do is add the path before the file name, like this:

$xml = simplexml_load_file('zip://data/'.$filename.'.zip'.'#'.$filename.'.xml');

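zip:// is the URI scheme (PHP's zip stream wrapper), and data/filename.zip is the path to the ZIP file. The part after # names the file inside the ZIP that you want to access. Note that the file_exists() check needs the same data/ prefix, otherwise it looks for the ZIP next to myscript.php.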
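To see the zip://archive#entry form in isolation, here is a minimal sketch that builds a throwaway archive and reads an XML entry straight out of it. It assumes the zip extension is enabled. The entry name demo.xml, the temp-directory location, and the variable names are made up for the example and are not part of your project.

```php
<?php
// Hypothetical file names, for illustration only.
$zipPath = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'demo_' . uniqid() . '.zip';

// Build a small archive containing a single XML entry.
$zip = new ZipArchive();
if ($zip->open($zipPath, ZipArchive::CREATE | ZipArchive::OVERWRITE) !== true) {
    die('Could not create ZIP');
}
$zip->addFromString('demo.xml', '<root><item>hello</item></root>');
$zip->close();

// zip://<path to the archive>#<entry inside the archive>
$uri = 'zip://' . $zipPath . '#demo.xml';
$xml = simplexml_load_file($uri);
if ($xml === false) {
    die('Error opening ZIP entry');
}

// Cast the element to a string to get its text content.
$item = (string) $xml->item;
echo $item, PHP_EOL;

// Remove the temporary archive again.
unlink($zipPath);
```

The path in front of # can be relative (as in data/filename.zip) or absolute, as here.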

To access the data, I suggest you use XPath:

$name = $xml->xpath(
    '//cac:AccountingCustomerParty/cac:Party/cac:PartyIdentification/cbc:ID'
);

The result is an array of elements, so you can print the data like this:

// !empty() also covers the case where xpath() does not return an array.
if (!empty($name)) {
    echo 'cbc:ID = ', $name[0], PHP_EOL;
} else {
    echo 'No cbc:ID found', PHP_EOL;
}
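Each entry returned by xpath() is itself a SimpleXMLElement, so you can cast it to a string to get its text and index it like an array to read an attribute. A small sketch on a cut-down fragment of the invoice (the inline XML is trimmed from the file in the question, and the variable names are only illustrative):

```php
<?php
// Cut-down fragment of the invoice, just enough to show the technique.
$source = <<<'XML'
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
         xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
         xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2">
    <cac:LegalMonetaryTotal>
        <cbc:PayableAmount currencyID="USD">1988.00</cbc:PayableAmount>
    </cac:LegalMonetaryTotal>
</Invoice>
XML;

$xml = simplexml_load_string($source);
if ($xml === false) {
    die('Could not parse XML');
}

$nodes = $xml->xpath('//cac:LegalMonetaryTotal/cbc:PayableAmount');

$amount = null;
$currency = null;
if (!empty($nodes)) {
    $amount   = (string) $nodes[0];               // text content of the element
    $currency = (string) $nodes[0]['currencyID']; // attribute without a namespace
    echo $amount, ' ', $currency, PHP_EOL;
}
```

Casting matters as soon as you do more than echo: without (string) you are still holding an object, which can surprise you in comparisons or when storing the value.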

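In this case there is no need to call registerXPathNamespace() before xpath(), because the cac and cbc prefixes are already declared on the root Invoice element.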
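As for why the original property chain printed nothing: as far as I can tell, plain -> access only matches unprefixed element names, so 'cac:AccountingCustomerParty' is looked up as a literal name and nothing matches. The loop over getDocNamespaces() cannot work either, because that method returns an array of URI strings keyed by prefix, not objects you can call methods on. If you prefer property access over XPath, one option is children() with the prefix. Here is a sketch on a trimmed fragment of the invoice (variable names are illustrative):

```php
<?php
// Trimmed fragment of the invoice from the question.
$source = <<<'XML'
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
         xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
         xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2">
    <cac:AccountingCustomerParty>
        <cac:Party>
            <cac:PartyIdentification>
                <cbc:ID schemeID="6">20455870969</cbc:ID>
            </cac:PartyIdentification>
        </cac:Party>
    </cac:AccountingCustomerParty>
</Invoice>
XML;

$xml = simplexml_load_string($source);
if ($xml === false) {
    die('Could not parse XML');
}

// Returns prefix => URI pairs (plain strings), e.g. 'cac' => 'urn:oasis:...'.
$namespaces = $xml->getDocNamespaces(true);

// children($prefix, true) selects the child elements that use that prefix;
// the following -> steps stay in the same namespace.
$identification = $xml->children('cac', true)
    ->AccountingCustomerParty
    ->Party
    ->PartyIdentification;

// cbc:ID lives in a different namespace, so switch again.
$customerId = (string) $identification->children('cbc', true)->ID;
echo $customerId, PHP_EOL;
```

XPath is usually shorter for deep paths like this one, which is why I suggested it above.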


Your complete code might look like this:

<?php
$file = 'data/archivo.zip'; // path to the ZIP, relative to this script
$code = 'archivo';          // name of the XML entry inside the ZIP, without extension
if (file_exists($file)) {
    // zip://<path to archive>#<entry inside the archive>
    $xml = simplexml_load_file('zip://'. $file .'#'. $code .'.xml');
    if ($xml === false) {
        die('Error opening ZIP');
    }
    $name = $xml->xpath(
        '//cac:AccountingCustomerParty/cac:Party/cac:PartyIdentification/cbc:ID'
    );
    $name2 = $xml->xpath(
        '//cac:LegalMonetaryTotal/cbc:PayableAmount'
    );
    // !empty() also covers the case where xpath() does not return an array.
    if (!empty($name)) {
        echo 'cbc:ID = ', $name[0], PHP_EOL;
    } else {
        echo 'No cbc:ID found', PHP_EOL;
    }
    if (!empty($name2)) {
        echo 'cbc:PayableAmount = ', $name2[0], PHP_EOL;
    } else {
        echo 'No cbc:PayableAmount found', PHP_EOL;
    }
} else {
    die('File not found');
}

You can see the code I used in the following GitHub repository: