Converting XML with namespaces to CSV using PowerShell
I have this XML file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns3:BOX xmlns="urn:loc.gov:item"
xmlns:ns2="urn:loc.gov:box"
xmlns:ns3="http://www.example.com/inverter"
xmlns:ns4="urn:loc.gov:xyz">
<ns3:Item>
<Description>ITEM1</Description>
<PackSizeNumeric>6</PackSizeNumeric>
<ns2:BuyersItemIdentification>
<ID>75847589</ID>
</ns2:BuyersItemIdentification>
<ns2:CommodityClassification>
<CommodityCode>856952</CommodityCode>
</ns2:CommodityClassification>
<ns2:AdditionalItemProperty>
<Name>Weight</Name>
<Value>0</Value>
</ns2:AdditionalItemProperty>
<ns2:AdditionalItemProperty>
<Name>Tare</Name>
<Value>0</Value>
</ns2:AdditionalItemProperty>
<ns2:ManufacturerParty>
<ns2:PartyIdentification>
<ID>847532</ID>
</ns2:PartyIdentification>
</ns2:ManufacturerParty>
</ns3:Item>
<ns3:Item>
<Description>ITEM2</Description>
<PackSizeNumeric>10</PackSizeNumeric>
<ns2:BuyersItemIdentification>
<ID>9568475</ID>
</ns2:BuyersItemIdentification>
<ns2:CommodityClassification>
<CommodityCode>348454</CommodityCode>
</ns2:CommodityClassification>
<ns2:AdditionalItemProperty>
<Name>Weight</Name>
<Value>0</Value>
</ns2:AdditionalItemProperty>
<ns2:AdditionalItemProperty>
<Name>Tare</Name>
<Value>0</Value>
</ns2:AdditionalItemProperty>
<ns2:ManufacturerParty>
<ns2:PartyIdentification>
<ID>7542125</ID>
</ns2:PartyIdentification>
</ns2:ManufacturerParty>
</ns3:Item>
</ns3:BOX>
I am trying to convert it to a CSV file.
First I read the content:
[xml]$inputFile = Get-Content test.xml
Then I export it to CSV:
$inputfile.BOX.childnodes | Export-Csv "Stsadm-EnumSites.csv" -NoTypeInformation -Delimiter:";" -Encoding:UTF8
I get the Description and PackSizeNumeric fields, but not the values of the remaining fields:
"Description";"PackSizeNumeric";"BuyersItemIdentification";"CommodityClassification";"AdditionalItemProperty";"ManufacturerParty"
"ITEM1";"6";"System.Xml.XmlElement";"System.Xml.XmlElement";"System.Object[]";"System.Xml.XmlElement"
"ITEM2";"10";"System.Xml.XmlElement";"System.Xml.XmlElement";"System.Object[]";"System.Xml.XmlElement"
What is the best way to get the fields contained in the other namespaces?
This is what I want:
"Description";"PackSizeNumeric";"BuyersItemIdentification";"CommodityClassification";"Weight";"Tare";PartyIdentification
"ITEM1";"6";"75847589";"856952";"0";"0";"847532"
"ITEM2";"10";"9568475";"348454";"0";"0";"7542125"
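Tomalak's answer is concise and looks like the best solution for the problem at hand.
I tried to build something generic, but the result is not even in the requested format (the list of additional properties is hard to convert in a generic way, and the field names are unwieldy). In any case, the solution below walks the XML tree and flattens the data. It does not depend on the element names, except for the initial select.
Having finished my generic answer, I now wonder whether writing and applying an XSLT transformation would have been better.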
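As an aside before the generic code, the XSLT route mentioned above might look roughly like the following. This is only a sketch under assumptions: it uses a trimmed, single-item copy of the question's XML, emits just three columns, binds the document's default namespace to a hypothetical prefix `i`, and runs an XSLT 1.0 stylesheet through .NET's `System.Xml.Xsl.XslCompiledTransform`.

```powershell
# Assumption: a trimmed, single-item copy of the question's XML.
$xml = [xml]@'
<ns3:BOX xmlns="urn:loc.gov:item"
         xmlns:ns2="urn:loc.gov:box"
         xmlns:ns3="http://www.example.com/inverter">
  <ns3:Item>
    <Description>ITEM1</Description>
    <PackSizeNumeric>6</PackSizeNumeric>
    <ns2:AdditionalItemProperty><Name>Weight</Name><Value>0</Value></ns2:AdditionalItemProperty>
  </ns3:Item>
</ns3:BOX>
'@

# Hypothetical stylesheet: the prefix "i" stands for the default namespace urn:loc.gov:item.
$xslt = @'
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:i="urn:loc.gov:item"
    xmlns:ns2="urn:loc.gov:box"
    xmlns:ns3="http://www.example.com/inverter">
  <xsl:output method="text"/>
  <xsl:template match="/ns3:BOX">
    <xsl:text>"Description";"PackSizeNumeric";"Weight"&#10;</xsl:text>
    <xsl:for-each select="ns3:Item">
      <xsl:value-of select="concat('&quot;', i:Description, '&quot;;&quot;', i:PackSizeNumeric, '&quot;;&quot;', ns2:AdditionalItemProperty[i:Name='Weight']/i:Value, '&quot;&#10;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
'@

# Compile the stylesheet from the string and apply it to the in-memory document.
$transform = New-Object System.Xml.Xsl.XslCompiledTransform
$reader = [System.Xml.XmlReader]::Create((New-Object System.IO.StringReader($xslt)))
$transform.Load($reader)

$writer = New-Object System.IO.StringWriter
$transform.Transform($xml, (New-Object System.Xml.Xsl.XsltArgumentList), $writer)
$csv = $writer.ToString()
$csv
```

The column list is hard-coded in the stylesheet, so this trades the generality of the flattening code that follows for control over the output format.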
#[xml]$xml = Get-Content test.xml
#xml to process
$xml = [xml]@"
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns3:BOX xmlns="urn:loc.gov:item"
xmlns:ns2="urn:loc.gov:box"
xmlns:ns3="http://www.example.com/inverter"
xmlns:ns4="urn:loc.gov:xyz">
<ns3:Item>
<Description>ITEM1</Description>
<PackSizeNumeric>6</PackSizeNumeric>
<ns2:BuyersItemIdentification>
<ID>75847589</ID>
</ns2:BuyersItemIdentification>
<ns2:CommodityClassification>
<CommodityCode>856952</CommodityCode>
</ns2:CommodityClassification>
<ns2:AdditionalItemProperty>
<Name>Weight</Name>
<Value>0</Value>
</ns2:AdditionalItemProperty>
<ns2:AdditionalItemProperty>
<Name>Tare</Name>
<Value>0</Value>
</ns2:AdditionalItemProperty>
<ns2:ManufacturerParty>
<ns2:PartyIdentification>
<ID>847532</ID>
</ns2:PartyIdentification>
</ns2:ManufacturerParty>
</ns3:Item>
<ns3:Item>
<Description>ITEM2</Description>
<PackSizeNumeric>10</PackSizeNumeric>
<ns2:BuyersItemIdentification>
<ID>9568475</ID>
</ns2:BuyersItemIdentification>
<ns2:CommodityClassification>
<CommodityCode>348454</CommodityCode>
</ns2:CommodityClassification>
<ns2:AdditionalItemProperty>
<Name>Weight</Name>
<Value>0</Value>
</ns2:AdditionalItemProperty>
<ns2:AdditionalItemProperty>
<Name>Tare</Name>
<Value>0</Value>
</ns2:AdditionalItemProperty>
<ns2:ManufacturerParty>
<ns2:PartyIdentification>
<ID>7542125</ID>
</ns2:PartyIdentification>
</ns2:ManufacturerParty>
</ns3:Item>
</ns3:BOX>
"@
$nsm = [Xml.XmlNamespaceManager]$xml.NameTable
$nsm.AddNamespace("ns1","urn:loc.gov:item")
$nsm.AddNamespace("ns2","urn:loc.gov:box")
$nsm.AddNamespace("ns3","http://www.example.com/inverter")
$nsm.AddNamespace("ns4","urn:loc.gov:xyz")
#function to recursively flatten xml subtree into a hashtable (passed in)
function flatten-xml {
param (
$Parent,
$Element,
$Fieldname,
$HashTable
)
if ($parent -eq "") {
$label = $fieldname
} else {
$label = $parent + "_" + $fieldname
}
#write-host "$label is $($element.GetType())"
if ($element.GetType() -eq [System.Xml.XmlElement]) {
#get property fields
$element | Get-Member | ? { $_.MemberType -eq "Property" } | % {
#write-host "moving from $label to $($_.Name)"
flatten-xml -Parent $label -Element $element.($_.Name) -FieldName $_.Name -HashTable $HashTable
}
}elseif($element.GetType() -eq [System.Object[]]) {
#write-host "$label is an array"
$i = 0
$element | % { flatten-xml -Parent $label -Element $_ -FieldName "item$i" -HashTable $HashTable; $i++ }
}else {
$HashTable[$label] = $element
}
}
#convert the nodecollection returned by xpath query into hashtables and write them out to CSV
$xml.SelectNodes("//ns3:BOX/ns3:Item",$nsm) | % {
$element = $_
$ht = @{}
$element | Get-Member | ? { $_.MemberType -eq "Property" } | % {
flatten-xml -Parent "" -Element $element.($_.Name) -FieldName $_.Name -HashTable $ht
}
[PSCustomObject]$ht
} | Export-Csv "test2.csv" -NoTypeInformation -Delimiter:";" -Encoding:UTF8
Result:
> gc .\test2.csv
"AdditionalItemProperty_item0_Name";"AdditionalItemProperty_item0_Value";"AdditionalItemProperty_item1_Name";"AdditionalItemProperty_item1_Value";"BuyersItemIdentification_ID";"CommodityClassification_CommodityCode";"Description";"ManufacturerParty_PartyIdentification_ID";"PackSizeNumeric"
"Weight" ;"0" ;"Tare" ;"0" ;"75847589" ;"856952" ;"ITEM1" ;"847532" ;"6"
"Weight" ;"0" ;"Tare" ;"0" ;"9568475" ;"348454" ;"ITEM2" ;"7542125" ;"10"
References:
- Powershell loop through xml to create a jagged array
- flatten xml structure
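One way to tame the unwieldy `AdditionalItemProperty_item0_Name` style columns is to treat each Name/Value pair as a column of its own. The following is a minimal sketch, not part of the solution above: it assumes a trimmed, single-item copy of the question's XML, reuses the `ns1`/`ns2`/`ns3` prefixes registered earlier, and assumes every `AdditionalItemProperty` has exactly one `Name` and one `Value` child. The variable names `$rows` and `$row` are illustrative.

```powershell
# Assumption: a trimmed, single-item copy of the question's XML.
$xml = [xml]@'
<ns3:BOX xmlns="urn:loc.gov:item"
         xmlns:ns2="urn:loc.gov:box"
         xmlns:ns3="http://www.example.com/inverter">
  <ns3:Item>
    <Description>ITEM1</Description>
    <ns2:AdditionalItemProperty><Name>Weight</Name><Value>0</Value></ns2:AdditionalItemProperty>
    <ns2:AdditionalItemProperty><Name>Tare</Name><Value>0</Value></ns2:AdditionalItemProperty>
  </ns3:Item>
</ns3:BOX>
'@

# Bind prefixes for XPath; "ns1" stands for the document's default namespace.
$nsm = New-Object System.Xml.XmlNamespaceManager($xml.NameTable)
$nsm.AddNamespace("ns1", "urn:loc.gov:item")
$nsm.AddNamespace("ns2", "urn:loc.gov:box")
$nsm.AddNamespace("ns3", "http://www.example.com/inverter")

$rows = foreach ($item in $xml.SelectNodes("/ns3:BOX/ns3:Item", $nsm)) {
    # Ordered dictionary so the columns keep their insertion order.
    $row = [ordered]@{
        Description = $item.SelectSingleNode("ns1:Description", $nsm).InnerText
    }
    # Each Name/Value pair becomes a column named after the Name element.
    foreach ($prop in $item.SelectNodes("ns2:AdditionalItemProperty", $nsm)) {
        $name  = $prop.SelectSingleNode("ns1:Name", $nsm).InnerText
        $value = $prop.SelectSingleNode("ns1:Value", $nsm).InnerText
        $row[$name] = $value
    }
    [PSCustomObject]$row
}

$rows | ConvertTo-Csv -NoTypeInformation -Delimiter ";"
```

Note that `Export-Csv` and `ConvertTo-Csv` take the column set from the first object in the pipeline, so this only works cleanly when all items carry the same property names.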
A combination of Select-Object and Select-Xml seems to work well:
$ns = @{
item="urn:loc.gov:item"
ns2="urn:loc.gov:box"
ns3="http://www.example.com/inverter"
ns4="urn:loc.gov:xyz"
}
$doc = New-Object xml
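# .NET resolves relative paths against the process working directory, not the PowerShell location
$doc.Load((Resolve-Path "test.xml").ProviderPath)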
$doc.BOX.ChildNodes | Select-Object -Property `
Description,`
PackSizeNumeric, `
@{Name="BuyersItemIdentification_ID"; Expression={$_.BuyersItemIdentification.ID}}, `
@{Name="CommodityClassification_CommodityCode"; Expression={$_.CommodityClassification.CommodityCode}}, `
@{Name="Weight"; Expression={Select-Xml -Namespace $ns -Xml $_ -XPath "./ns2:AdditionalItemProperty[item:Name = 'Weight']/item:Value"}}, `
@{Name="Tare"; Expression={Select-Xml -Namespace $ns -Xml $_ -XPath "./ns2:AdditionalItemProperty[item:Name = 'Tare']/item:Value"}}, `
@{Name="ManufacturerParty_ID"; Expression={$_.ManufacturerParty.PartyIdentification.ID}} `
| Export-Csv "Stsadm-EnumSites.csv" -NoTypeInformation -Delimiter:";" -Encoding:UTF8
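The `item` prefix in the XPath expressions is needed because XPath 1.0 treats an unprefixed name as belonging to no namespace, so the document's default namespace has to be bound to some prefix before its elements can be matched. The short sketch below illustrates this on a trimmed, single-item copy of the question's XML (an assumption for brevity); the variable names `$unprefixed` and `$prefixed` are illustrative.

```powershell
# Assumption: a trimmed, single-item copy of the question's XML.
$xml = [xml]@'
<ns3:BOX xmlns="urn:loc.gov:item"
         xmlns:ns3="http://www.example.com/inverter">
  <ns3:Item>
    <Description>ITEM1</Description>
  </ns3:Item>
</ns3:BOX>
'@

$ns = @{
    item = "urn:loc.gov:item"
    ns3  = "http://www.example.com/inverter"
}

# An unprefixed name only matches elements in no namespace, so nothing is found.
$unprefixed = @(Select-Xml -Xml $xml -Namespace $ns -XPath "//Description")

# With a prefix bound to the default namespace, the element is found.
$prefixed = @(Select-Xml -Xml $xml -Namespace $ns -XPath "//item:Description")

$unprefixed.Count            # 0
$prefixed.Count              # 1
$prefixed[0].Node.InnerText  # ITEM1
```

Select-Xml returns wrapper objects rather than strings; reading `.Node.InnerText` inside a calculated property makes the text extraction explicit.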
Result (Stsadm-EnumSites.csv):
"Description";"PackSizeNumeric";"BuyersItemIdentification_ID";"CommodityClassification_CommodityCode";"Weight";"Tare";"ManufacturerParty_ID"
"ITEM1";"6";"75847589";"856952";"0";"0";"847532"
"ITEM2";"10";"9568475";"348454";"0";"0";"7542125"