Python lxml: get the name of a node

This is my XML file:

<FuzzyComparison>
    <Modules>
        <Module>
            <name>AutosoukModelMakeFuzzyComparisonModule</name>
            <configurationLoader>DefaultLoader</configurationLoader>
            <configurationFile>MakesModels.conf</configurationFile>
            <settings></settings>
        </Module>
        <Module>
            <name>DefaultFuzzyComparisonModule</name>
            <configurationLoader>DefaultLoader</configurationLoader>
            <configurationFile>Buildings.conf</configurationFile>
            <settings>
                <attribute>building</attribute>
            </settings>
        </Module>
    </Modules>
</FuzzyComparison>

This is the code I have been using to parse it:

from lxml import etree
class AttributesXMLParser():
    def __init__(self):
        self.doc=etree.parse('Items.xml')

    def getValueOfTag(self, tagName):  # Returns the text of a specific tag; for example, tagName could be "FirstDate"
        return self.doc.find(tagName).text

    def loadFuzzySettings(self):
        modulesDict = list()
        modules = self.doc.findall('FuzzyComparison/Modules/Module')
        for module in modules:
            moduleDict = dict()
            moduleName = module.find('name').text
            moduleDict['name'] = moduleName
            moduleConfigurationLoader = module.find('configurationLoader').text
            moduleDict['configurationLoader'] = moduleConfigurationLoader
            moduleConfigurationFile = module.find('configurationFile').text
            moduleDict['moduleConfigurationFile'] = moduleConfigurationFile
            settings = module.findall('settings')
            settingsDict = dict()
            for oneSetting in settings:
                settingsDict[oneSetting] = oneSetting.text
            moduleDict['settings'] = settingsDict
            modulesDict.append(moduleDict)
        return modulesDict

This is the result:

[{'moduleConfigurationFile': 'MakesModels.conf', 'configurationLoader': 'Default
Loader', 'name': 'AutosoukModelMakeFuzzyComparisonModule', 'settings': {<Element
 settings at 0x25257c8>: None}}, {'moduleConfigurationFile': 'Buildings.conf', '
configurationLoader': 'DefaultLoader', 'name': 'DefaultFuzzyComparisonModule', '
settings': {<Element settings at 0x2525e48>: '\n\t\t\t\t'}}]

My question

I don't know how to get the name and value of the nodes inside settings. As you can see, everything except settings works fine. I need this:

"attribute": building

But my code gives me:

{<Element settings at 0x2525e48>: '\n\t\t\t\t'}}]

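Could you help me solve this?

Because findall() returns a list of matching elements, you want to iterate over the children of each element in that list, not over the list itself. You also want to use each child's tag as the dictionary key, rather than the element object.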

settingsDict = {}
for settingsNode in module.findall('settings'):
    for setting in settingsNode:
        settingsDict[setting.tag] = setting.text
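
To try the loop above in isolation, here is a minimal self-contained sketch. It parses a trimmed copy of one Module from the question with etree.fromstring instead of reading Items.xml; the MODULE_XML constant and the snake_case names are illustrative assumptions, and the fallback import to the standard library's ElementTree is only there so the snippet runs without lxml installed (the find/findall/tag/text calls used here behave the same in both).

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree offers the same calls used below.
    import xml.etree.ElementTree as etree

# Trimmed copy of the second <Module> from the question (illustrative).
MODULE_XML = """
<Module>
    <name>DefaultFuzzyComparisonModule</name>
    <settings>
        <attribute>building</attribute>
    </settings>
</Module>
"""

module = etree.fromstring(MODULE_XML)

settings_dict = {}
for settings_node in module.findall('settings'):
    # Iterating over an element yields its child elements.
    for setting in settings_node:
        # Use the child's tag name as the key and its text as the value.
        settings_dict[setting.tag] = setting.text

print(settings_dict)  # {'attribute': 'building'}
```

The whitespace-only text that showed up in the question's output ('\n\t\t\t\t') belongs to the settings element itself, which is why reading the children instead gives the expected value.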

Or, if you only have one settings tag:

settingsDict = {}
for setting in module.find('settings'):
    settingsDict[setting.tag] = setting.text

which can be simplified to:

settingsDict = {setting.tag: setting.text
                for setting in module.find('settings')}
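
One caveat with the single-settings variants: find() returns None when a Module has no settings element at all, and iterating over None raises a TypeError. The sketch below guards against that. The helper name settings_of and the sample XML are hypothetical, not part of the original code, and the fallback import is only there so the snippet runs without lxml installed.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree offers the same calls used below.
    import xml.etree.ElementTree as etree


def settings_of(module):
    """Return {tag: text} for the children of the module's <settings>.

    Hypothetical helper; returns an empty dict when <settings> is
    missing or has no children.
    """
    settings_node = module.find('settings')
    # Compare against None explicitly: an element without children
    # is falsy, so a plain truthiness test would be misleading.
    if settings_node is None:
        return {}
    return {child.tag: child.text for child in settings_node}


# Illustrative input: one populated, one empty, one missing <settings>.
SAMPLE_XML = """
<Modules>
    <Module><settings><attribute>building</attribute></settings></Module>
    <Module><settings></settings></Module>
    <Module><name>NoSettingsHere</name></Module>
</Modules>
"""

root = etree.fromstring(SAMPLE_XML)
results = [settings_of(m) for m in root.findall('Module')]
print(results)  # [{'attribute': 'building'}, {}, {}]
```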