Node Selection in XML Document
I am working with an unstructured XML document that I need to convert into a structured one. The unstructured document looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<CustomerInformation>
<CustomerPurchaseID>String</CustomerPurchaseID>
<MemberAddress>String</MemberAddress>
<MemberID>String</MemberID>
<MemberCity>String</MemberCity>
<MemberName>String</MemberName>
<MemberType>String</MemberType>
<MemberState>String</MemberState>
<MemberSince>String</MemberSince>
<PurchaseDate>String</PurchaseDate>
<CreditCardName></CreditCardName>
<CreditCardExpirration></CreditCardExpirration>
<Orders>
<LineItemCode>String</LineItemCode>
<LineItemID>String</LineItemID>
<LineItemDescription>String</LineItemDescription>
<DiscountCode>String</DiscountCode>
</Orders>
<Orders>
<LineItemCode>String</LineItemCode>
<LineItemID>String</LineItemID>
<LineItemDescription>String</LineItemDescription>
<DiscountCode>String</DiscountCode>
</Orders>
<ShipToAddress>String</ShipToAddress>
<ShipToCity>String</ShipToCity>
<ShipToFirstName>String</ShipToFirstName>
<ShipToLastName>String</ShipToLastName>
<ShipToState>String</ShipToState>
<ShipToZIPCode>String</ShipToZIPCode>
<CustomerAddressLine1>String</CustomerAddressLine1>
<CustomerAddressLine2>String</CustomerAddressLine2>
<CustomerID>String</CustomerID>
<CustomerCity>String</CustomerCity>
<CustomerEmail>String</CustomerEmail>
<CustomerFirstName>String</CustomerFirstName>
<CustomerLastName>String</CustomerLastName>
<CustomerHomePhone>String</CustomerHomePhone>
<CustomerState>String</CustomerState>
<CustomerZIP>String</CustomerZIP>
<Status>String</Status>
<OrderedFromName>String</OrderedFromName>
<CustomerIdentification></CustomerIdentification>
<PrimaryCustomerIndicator>String</PrimaryCustomerIndicator>
<OrderedFromAddressLine1Text>String</OrderedFromAddressLine1Text>
<OrderedFromAddressLine2Text>String</OrderedFromAddressLine2Text>
<OrderedFromCityName>String</OrderedFromCityName>
<OrderedFromStateCode>String</OrderedFromStateCode>
<OrderedFromZip5Code>String</OrderedFromZip5Code>
<OrderedFromZip4Code>String</OrderedFromZip4Code>
</CustomerInformation>
It needs to be converted into this:
<?xml version="1.0" encoding="UTF-8"?>
<xmlns:evt="http://www.metadata..com/Management/">
<Identifier>3442=000-MNNN</Identifier>
<TypeCode>Purchase History</TypeCode>
<TypeDescription>Order Summary</TypeDescription>
<PurposeCode>Invoice</PurposeCode>
<Member>
<Email>String</Email>
<MemberSince>03/23/2000</MemberSince>
<MemberType>
<MemberShipTypeCode>String</MemberShipTypeCode>
<TypeDescription>String</TypeDescription>
</MemberType>
<Address>
<AddressLine1Text>String</AddressLine1Text>
<AddressLine2Text>String</AddressLine2Text>
<CityName>String</CityName>
<StateCode>String</StateCode>
<Zip5Code>String</Zip5Code>
<Zip4Code>String</Zip4Code>
</Address>
<Telephone>
<AreaCode>String</AreaCode>
<TelephoneNumber>String</TelephoneNumber>
</Telephone>
</Member>
<Company>
<CompanyName>String</CompanyName>
<CustomerIdentification>0.0</CustomerIdentification>
<PrimaryCustomerIndicator>String</PrimaryCustomerIndicator>
<CompanyAddress>
<CompanyAddressLine1Text>String</CompanyAddressLine1Text>
<CompanyAddressLine2Text>String</CompanyAddressLine2Text>
<CompanyCityName>String</CompanyCityName>
<CompanyStateCode>String</CompanyStateCode>
<CompanyZip5Code>String</CompanyZip5Code>
<CompanyZip4Code>String</CompanyZip4Code>
</CompanyAddress>
</Company>
<Orders>
<CreditCard>
<CardName>String</CardName>
<CardExpirationDate>1967-08-13</CardExpirationDate>
</CreditCard>
<Order>
<Discount>String</Discount>
<ShippingVendorName>String</ShippingVendorName>
<ShipmentTrackingNumber>String</ShipmentTrackingNumber>
<ShipmentTrackingLinkText>String</ShipmentTrackingLinkText>
<CustomerName>String</CustomerName>
<CustomerEmailAddressText>String</CustomerEmailAddressText>
<Telephone>
<AreaCode>String</AreaCode>
<TelephoneNumber>String</TelephoneNumber>
</Telephone>
<ShippingAddress>
<ShippingAddressLine1Text>String</ShippingAddressLine1Text>
<ShippingAddressLine2Text>String</ShippingAddressLine2Text>
<ShippingCareOfText>String</ShippingCareOfText>
<ShippingCityName>String</ShippingCityName>
<ShippingStateCode>String</ShippingStateCode>
<ShippingZip5Code>String</ShippingZip5Code>
<ShippingZip4Code>String</ShippingZip4Code>
</ShippingAddress>
<LineItem>
<LineItemNumber>String</LineItemNumber>
<LineItemQuantityCount>0</LineItemQuantityCount>
<ItemOrderedIndicator>String</ItemOrderedIndicator>
<Discount>String</Discount>
</LineItem>
</Order>
</Orders>
I was able to generate the XML by creating the structured format and pulling in the relevant fields by node value, using XSLT like this:
<xsl:value-of select=.../>
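However, I feel there may be a better way. I would like to control how the structure is generated as I walk through the unstructured (flat) document. For example, is there a way to group the elements for all MemberAddress fields? If I can do that, I can create the member portion of the output, and I could do the same for the other elements. My concern with hardcoding the structured document is that it may change in the future. If possible, I would like to be able to control the output. All the member information in the source document should map to the Member element in the destination document. The elements that begin with OrderedFrom in the source document should map to the Company fields in the destination document. The ShipTo elements should in turn map to the shipping information in the order section of the destination document, and so on. Please help!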
My concern with hardcoding the structured document is that it may
change in the future.
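An XSLT stylesheet transforms data from one XML schema to another. It is unrealistic to expect that a change in either schema will not require rewriting the stylesheet.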
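That said, the amount of rewriting can be kept small. A minimal sketch, assuming XSLT 1.0: give each source element its own template rule, so that renaming a field on either side means touching one rule. The `Output` wrapper and the pairings of `MemberCity` with `CityName` and `MemberState` with `StateCode` are guesses for illustration. The question does not say which source field feeds which target field.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/CustomerInformation">
    <!-- "Output" is a placeholder root name, not taken from the question -->
    <Output>
      <Member>
        <!-- Selected nodes are processed in document order -->
        <xsl:apply-templates select="MemberCity | MemberState | MemberSince"/>
      </Member>
    </Output>
  </xsl:template>

  <!-- One rule per source element; the target names are assumed pairings -->
  <xsl:template match="MemberCity">
    <CityName><xsl:value-of select="."/></CityName>
  </xsl:template>

  <xsl:template match="MemberState">
    <StateCode><xsl:value-of select="."/></StateCode>
  </xsl:template>

  <xsl:template match="MemberSince">
    <MemberSince><xsl:value-of select="."/></MemberSince>
  </xsl:template>
</xsl:stylesheet>
```

The mapping is still hardcoded, but each name pair lives in exactly one place.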
Is there a way to group the elements for all MemberAddress fields for
example?
Yes, if you have a way to identify them. For example, you could do:
<Member>
<xsl:for-each select="*[starts-with(name(), 'Member')]">
<xsl:element name="{substring-after(name(), 'Member')}">
<xsl:value-of select="." />
</xsl:element>
</xsl:for-each>
</Member>
to get:
<Member>
<Address>String</Address>
<ID>String</ID>
<City>String</City>
<Name>String</Name>
<Type>String</Type>
<State>String</State>
<Since>String</Since>
</Member>
But this does not match your expected output. By the way, your output shows a lot of data that is not in the input, for example the member's e-mail.
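The `xsl:for-each` fragment above only works when the context node is `CustomerInformation`. Here is a sketch of a complete XSLT 1.0 stylesheet that supplies that context and applies the same prefix test to the `OrderedFrom` elements, renaming them to `Company...` as the question describes. The `Output` root element is a placeholder of mine, since the question's target document does not show a usable root. Placing the address fields inside `CompanyAddress` follows the posted target layout.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Inside this template the context node is the flat root element -->
  <xsl:template match="/CustomerInformation">
    <!-- "Output" is a placeholder root name, not taken from the question -->
    <Output>
      <Member>
        <!-- Every child whose name begins with 'Member', prefix stripped -->
        <xsl:for-each select="*[starts-with(name(), 'Member')]">
          <xsl:element name="{substring-after(name(), 'Member')}">
            <xsl:value-of select="."/>
          </xsl:element>
        </xsl:for-each>
      </Member>
      <Company>
        <CompanyName><xsl:value-of select="OrderedFromName"/></CompanyName>
        <!-- These two elements keep their names, so they are copied as-is -->
        <xsl:copy-of select="CustomerIdentification | PrimaryCustomerIndicator"/>
        <CompanyAddress>
          <!-- Remaining OrderedFrom* children: swap the prefix for 'Company' -->
          <xsl:for-each select="*[starts-with(name(), 'OrderedFrom')
                                  and not(self::OrderedFromName)]">
            <xsl:element name="Company{substring-after(name(), 'OrderedFrom')}">
              <xsl:value-of select="."/>
            </xsl:element>
          </xsl:for-each>
        </CompanyAddress>
      </Company>
    </Output>
  </xsl:template>
</xsl:stylesheet>
```

Prefix matching works for the `Company` section because the source and target names differ only in their prefix. It does not work for `Member`, where the target uses different names and extra nesting, so that section would still need explicit rules.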