How do I correctly display an XML structure with pandas?

I'm looking for some insight into how to correctly display this XML:

<?xml version="1.0" encoding="UTF-8"?>
<HEADER>
    <PRODUCT>
        <SUPPLIER>015</SUPPLIER>
        <PRODUCT_DETAILS>
            <KEYWORD>Paper</KEYWORD>
            <PRODUCT_TYPE>major</PRODUCT_TYPE>
        </PRODUCT_DETAILS>
        <PRODUCT_FEATURES>
            <REFERENCE>Class01</REFERENCE>
            <FEATURE>
                <FNAME>Colour</FNAME>
                <FVALUE>white</FVALUE>
            </FEATURE>
        </PRODUCT_FEATURES>
    </PRODUCT>
</HEADER>

For a simpler structure, it looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<HEADER>
    <PRODUCT_DETAILS>
        <KEYWORD>Paper</KEYWORD>
        <PRODUCT_TYPE>major</PRODUCT_TYPE>
    </PRODUCT_DETAILS>
    <PRODUCT_FEATURES>
        <FEATURE>
            <FNAME>Colour</FNAME>
            <FVALUE>white</FVALUE>
        </FEATURE>
    </PRODUCT_FEATURES>
</HEADER>

I wrote a few lines for the simpler structure, as follows:

import xml.etree.ElementTree as ET
import pandas as pd

tree = ET.parse('file.xml')
root = tree.getroot()

df = pd.DataFrame()

for i in range(0, len(root), 2):

    details = [(child.tag, child.text) for child in root[i + 0]]
    features = [(child[0].text, child[1].text) for child in root[i + 1]]

    temp_df = pd.DataFrame([[i[1] for i in details + features]], columns=[i[0] for i in details + features])

    df = pd.concat([df, temp_df])

df

# df.to_csv("file_export.csv", index=False)

... which produces this output:

    KEYWORD PRODUCT_TYPE    Colour
0   Paper   major           white

What changes do I need to make so that the first (nested) structure produces this output:

    SUPPLIER    KEYWORD PRODUCT_TYPE    REFERENCE   Colour
0   015         Paper   major           Class01     white

Thanks for your help!

Best, ~C

The following will do the job:

import xml.etree.ElementTree as ET
import pandas as pd

xml = '''<?xml version="1.0" encoding="UTF-8"?>
<HEADER>
    <PRODUCT>
        <SUPPLIER>015</SUPPLIER>
        <PRODUCT_DETAILS>
            <KEYWORD>Paper</KEYWORD>
            <PRODUCT_TYPE>major</PRODUCT_TYPE>
        </PRODUCT_DETAILS>
        <PRODUCT_FEATURES>
            <REFERENCE>Class01</REFERENCE>
            <FEATURE>
                <FNAME>Colour</FNAME>
                <FVALUE>white</FVALUE>
            </FEATURE>
        </PRODUCT_FEATURES>
    </PRODUCT>
</HEADER>'''

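# Tags to extract, listed in the desired column order
elements = ['SUPPLIER', 'KEYWORD', 'PRODUCT_TYPE', 'REFERENCE', 'FNAME', 'FVALUE']
root = ET.fromstring(xml)
# './/TAG' returns the first matching descendant at any depth
data = {e: root.find(f'.//{e}').text for e in elements}
# Turn the FNAME/FVALUE pair into one column named after the feature
data[data['FNAME']] = data['FVALUE']
del data['FVALUE']
del data['FNAME']

# One dict becomes one row
df = pd.DataFrame([data])
print(df)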

Output:

  SUPPLIER KEYWORD PRODUCT_TYPE REFERENCE Colour
0      015   Paper        major   Class01  white
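
Note that `root.find` only returns the first match, so the snippet above handles a single PRODUCT with a single FEATURE. If the file may contain several PRODUCT elements, or several FEATURE entries per product, the same idea can be extended. The following is only a minimal sketch under that assumption: the helper name `products_to_frame` and the sample data (the second product and the `Size` feature) are hypothetical and not part of the original question.

```python
import xml.etree.ElementTree as ET
import pandas as pd


def products_to_frame(xml_text):
    """Return one row per <PRODUCT>; each <FEATURE> becomes a column named by its <FNAME>."""
    root = ET.fromstring(xml_text)
    rows = []
    for product in root.iter('PRODUCT'):
        row = {}
        # iter() walks the product and all of its descendants in document order
        for elem in product.iter():
            if elem.tag == 'FEATURE':
                # Use the feature name as the column and the feature value as the cell
                row[elem.findtext('FNAME')] = elem.findtext('FVALUE')
            elif elem.tag not in ('FNAME', 'FVALUE') and len(elem) == 0:
                # Any other leaf element (no children) becomes a column of its own
                row[elem.tag] = elem.text
        rows.append(row)
    return pd.DataFrame(rows)


# Hypothetical sample: two products, the second with two features
sample = '''<HEADER>
    <PRODUCT>
        <SUPPLIER>015</SUPPLIER>
        <PRODUCT_DETAILS>
            <KEYWORD>Paper</KEYWORD>
            <PRODUCT_TYPE>major</PRODUCT_TYPE>
        </PRODUCT_DETAILS>
        <PRODUCT_FEATURES>
            <REFERENCE>Class01</REFERENCE>
            <FEATURE><FNAME>Colour</FNAME><FVALUE>white</FVALUE></FEATURE>
        </PRODUCT_FEATURES>
    </PRODUCT>
    <PRODUCT>
        <SUPPLIER>016</SUPPLIER>
        <PRODUCT_DETAILS>
            <KEYWORD>Card</KEYWORD>
            <PRODUCT_TYPE>minor</PRODUCT_TYPE>
        </PRODUCT_DETAILS>
        <PRODUCT_FEATURES>
            <REFERENCE>Class02</REFERENCE>
            <FEATURE><FNAME>Colour</FNAME><FVALUE>blue</FVALUE></FEATURE>
            <FEATURE><FNAME>Size</FNAME><FVALUE>A4</FVALUE></FEATURE>
        </PRODUCT_FEATURES>
    </PRODUCT>
</HEADER>'''

df = products_to_frame(sample)
print(df)
```

A product that lacks a feature found in another product gets NaN in that column. Because the tag names are not hard-coded, this also picks up leaf elements that are added later, which may or may not be what you want.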