How do I correctly display an XML structure with pandas?
I'm looking for some insight into how to correctly display this XML:
<?xml version="1.0" encoding="UTF-8"?>
<HEADER>
<PRODUCT>
<SUPPLIER>015</SUPPLIER>
<PRODUCT_DETAILS>
<KEYWORD>Paper</KEYWORD>
<PRODUCT_TYPE>major</PRODUCT_TYPE>
</PRODUCT_DETAILS>
<PRODUCT_FEATURES>
<REFERENCE>Class01</REFERENCE>
<FEATURE>
<FNAME>Colour</FNAME>
<FVALUE>white</FVALUE>
</FEATURE>
</PRODUCT_FEATURES>
</PRODUCT>
</HEADER>
For a simpler structure, it looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<HEADER>
<PRODUCT_DETAILS>
<KEYWORD>Paper</KEYWORD>
<PRODUCT_TYPE>major</PRODUCT_TYPE>
</PRODUCT_DETAILS>
<PRODUCT_FEATURES>
<FEATURE>
<FNAME>Colour</FNAME>
<FVALUE>white</FVALUE>
</FEATURE>
</PRODUCT_FEATURES>
</HEADER>
I wrote a few lines, as follows:
import xml.etree.ElementTree as ET
import pandas as pd

tree = ET.parse('file.xml')
root = tree.getroot()
df = pd.DataFrame()
for i in range(0, len(root), 2):
    details = [(child.tag, child.text) for child in root[i + 0]]
    features = [(child[0].text, child[1].text) for child in root[i + 1]]
    temp_df = pd.DataFrame([[i[1] for i in details + features]], columns=[i[0] for i in details + features])
    df = pd.concat([df, temp_df])
df
# df.to_csv("file_export.csv", index=False)
... and it produces this output:
  KEYWORD PRODUCT_TYPE Colour
0   Paper        major  white
What edits do I need to make to get this output for the first XML:
  SUPPLIER KEYWORD PRODUCT_TYPE REFERENCE Colour
0      015   Paper        major   Class01  white
Thanks for your help!
Best,
~C
The following will do the job:
import xml.etree.ElementTree as ET
import pandas as pd
xml = '''<?xml version="1.0" encoding="UTF-8"?>
<HEADER>
<PRODUCT>
<SUPPLIER>015</SUPPLIER>
<PRODUCT_DETAILS>
<KEYWORD>Paper</KEYWORD>
<PRODUCT_TYPE>major</PRODUCT_TYPE>
</PRODUCT_DETAILS>
<PRODUCT_FEATURES>
<REFERENCE>Class01</REFERENCE>
<FEATURE>
<FNAME>Colour</FNAME>
<FVALUE>white</FVALUE>
</FEATURE>
</PRODUCT_FEATURES>
</PRODUCT>
</HEADER>'''
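# Tags to pull out of the document, wherever they are nested
elements = ['SUPPLIER', 'KEYWORD', 'PRODUCT_TYPE', 'REFERENCE', 'FNAME', 'FVALUE']
root = ET.fromstring(xml)
# './/TAG' matches the first descendant with that tag at any depth
data = {e: root.find(f'.//{e}').text for e in elements}
# Collapse the FNAME/FVALUE pair into one column named after the feature
data[data['FNAME']] = data['FVALUE']
del data['FVALUE']
del data['FNAME']
df = pd.DataFrame([data])
print(df)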
Output:
  SUPPLIER KEYWORD PRODUCT_TYPE REFERENCE Colour
0      015   Paper        major   Class01  white
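The snippet above reads only the first match for each tag, so it covers a file with a single PRODUCT and a single FEATURE. If the real file holds several of either, one possible extension is sketched below. It assumes every record sits in its own PRODUCT element, as in the question's first XML. The helper names `product_rows` and `rows_to_frame`, the second PRODUCT, and the `Size` feature are invented for illustration and do not come from the question.

```python
import xml.etree.ElementTree as ET

# Tags copied straight into columns (taken from the question's XML).
SCALAR_TAGS = ('SUPPLIER', 'KEYWORD', 'PRODUCT_TYPE', 'REFERENCE')


def product_rows(xml_text):
    """Return one dict per <PRODUCT>, with each FEATURE as its own key."""
    root = ET.fromstring(xml_text)
    rows = []
    for product in root.iter('PRODUCT'):
        # findtext returns None when a tag is absent, so gaps become missing values
        row = {tag: product.findtext(f'.//{tag}') for tag in SCALAR_TAGS}
        # Each FEATURE turns into a column named after its FNAME
        for feature in product.iter('FEATURE'):
            name = feature.findtext('FNAME')
            if name is not None:
                row[name] = feature.findtext('FVALUE')
        rows.append(row)
    return rows


def rows_to_frame(rows):
    """Build the DataFrame; pandas is imported here so parsing works without it."""
    import pandas as pd
    return pd.DataFrame(rows)


# Hypothetical sample: the second PRODUCT and the Size feature are made up.
sample = '''<HEADER>
<PRODUCT>
<SUPPLIER>015</SUPPLIER>
<PRODUCT_DETAILS>
<KEYWORD>Paper</KEYWORD>
<PRODUCT_TYPE>major</PRODUCT_TYPE>
</PRODUCT_DETAILS>
<PRODUCT_FEATURES>
<REFERENCE>Class01</REFERENCE>
<FEATURE><FNAME>Colour</FNAME><FVALUE>white</FVALUE></FEATURE>
<FEATURE><FNAME>Size</FNAME><FVALUE>A4</FVALUE></FEATURE>
</PRODUCT_FEATURES>
</PRODUCT>
<PRODUCT>
<SUPPLIER>016</SUPPLIER>
<PRODUCT_DETAILS>
<KEYWORD>Card</KEYWORD>
<PRODUCT_TYPE>minor</PRODUCT_TYPE>
</PRODUCT_DETAILS>
<PRODUCT_FEATURES>
<FEATURE><FNAME>Colour</FNAME><FVALUE>blue</FVALUE></FEATURE>
</PRODUCT_FEATURES>
</PRODUCT>
</HEADER>'''

rows = product_rows(sample)
print(rows)
```

Calling `rows_to_frame(product_rows(xml_text))` should then give one row per PRODUCT, with missing values wherever a product lacks a tag or feature. For the simpler structure without a PRODUCT wrapper, `root.iter('PRODUCT')` finds nothing and the result is empty, so that layout would still need the single-record approach shown earlier.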