Pull Values From XML Using Python ElementTree
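I need to extract some values from a report in XML format. The XML structure is fairly complex, and I don't know how to reference the different levels within it.
I have imported ElementTree and loaded the source file, and I can successfully pull values from the upper levels of the tree. Unfortunately, all the ElementTree tutorials use very simple examples, and I can't figure out the syntax needed to dig deeper into the structure, especially when tag names and values repeat.
For example, here I retrieve the "Total Calls" value: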
import xml.etree.ElementTree as ET

tree = ET.parse('Z:/VSCode/Python/CallStats/calls.xml')
root = tree.getroot()
# Get Total Calls (matches every TotalCalls element at any depth)
for totcalls in root.iter('TotalCalls'):
    print(totcalls.text)
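But this returns the value twice, because the TotalCalls tag appears twice in the report, at two different levels, <Overview> and <KeyFacts>, and I only want the one from the <KeyFacts> level.
What is the syntax for retrieving the TotalCalls value from the <KeyFacts> level only?
Here is the source file: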
<?xml version="1.0" encoding="UTF-8"?><Report version="1">
<ReportConfig>
<Name>Test01</Name>
<BeginDate>Fri Jan 29 05:00:00 EST 2021</BeginDate>
<EndDate>Fri Jan 29 05:14:59 EST 2021</EndDate>
<GenerationTime>Fri Jan 29 05:15:01 EST 2021</GenerationTime>
<DataSource>Live Traffic</DataSource>
<AggregationType>Summary</AggregationType>
<ReportPeriod>15 Minutes</ReportPeriod>
<Class name="Service">
<ResGroup description="" type="">All (Individual)</ResGroup>
<Service description="VOIP Service">VoIP</Service>
</Class>
</ReportConfig>
<DataAvailable>true</DataAvailable>
<Aggregation name="Summary">
<Resource name="TestInstance">
<Overview>
<AverageServiceHealth>4.1</AverageServiceHealth>
<ReportOutcomeHistogram>
<Fail>16</Fail>
<Busy>0</Busy>
<NoAnswer>0</NoAnswer>
<Answered>2323</Answered>
</ReportOutcomeHistogram>
<AverageLossPct>0.1</AverageLossPct>
<AverageJitterMS>3.3458</AverageJitterMS>
<AverageDelayMS>91.6</AverageDelayMS>
<AverageDurationSecs>312.8512</AverageDurationSecs>
<TotalCalls>565</TotalCalls>
<TotalCallMinutes>12112.5552</TotalCallMinutes>
<AnswerSeizureRatio>99.3</AnswerSeizureRatio>
<NetworkEffectivenessRatio>99.3</NetworkEffectivenessRatio>
</Overview>
<ServiceQuality xaxis="MOS-CQ" title="VoIP Service Quality">
<AverageServiceHealth>4.1</AverageServiceHealth>
<MOSHistogram title="MOS-CQ">
<Bin lower="1.0" higher="3.099">7</Bin>
<Bin lower="3.1" higher="3.199">15</Bin>
<Bin lower="3.2" higher="3.299">3</Bin>
<Bin lower="3.3" higher="3.399">0</Bin>
<Bin lower="3.4" higher="3.499">5</Bin>
<Bin lower="3.5" higher="3.599">7</Bin>
<Bin lower="3.6" higher="3.699">9</Bin>
<Bin lower="3.7" higher="3.799">15</Bin>
<Bin lower="3.8" higher="3.899">27</Bin>
<Bin lower="3.9" higher="3.999">332</Bin>
<Bin lower="4.0" higher="4.099">472</Bin>
<Bin lower="4.1" higher="4.199">55</Bin>
<Bin lower="4.2" higher="4.299">1378</Bin>
<Bin lower="4.3" higher="4.399">0</Bin>
<Bin lower="4.4" higher="5.000">0</Bin>
</MOSHistogram>
</ServiceQuality>
<KeyFacts>
<NumberOfCallAttempts>565</NumberOfCallAttempts>
<TotalCalls>565</TotalCalls>
<TotalRecords>2339</TotalRecords>
<TotalCallMinutes>12112.5552</TotalCallMinutes>
</KeyFacts>
<CallStatistics>
<AnswerSeizureRatio>99.3</AnswerSeizureRatio>
<NetworkEffectivenessRatio>99.3</NetworkEffectivenessRatio>
<AverageCallDurationSec>312.9</AverageCallDurationSec>
<NumberOfCallAttempts>565</NumberOfCallAttempts>
<ReportedOutcomes>
<Answered>2323</Answered>
<Unanswered>0</Unanswered>
<Busy>0</Busy>
<ReportedFailures>16</ReportedFailures>
<Dropped>12</Dropped>
<OneWayMedia>4</OneWayMedia>
<NoMedia>0</NoMedia>
<NotFound>0</NotFound>
<Unauthorized>0</Unauthorized>
<ServerBusy>0</ServerBusy>
<ServerError>0</ServerError>
<BadSignaling>0</BadSignaling>
</ReportedOutcomes>
<NumberOfRegistrationFailures>0</NumberOfRegistrationFailures>
</CallStatistics>
<SLA>
<SLAFailCount>0</SLAFailCount>
<RecordCount>2339</RecordCount>
<ObservedPassRate>100.000</ObservedPassRate>
</SLA>
<AlertsGenerated>
<CriticalCount>0</CriticalCount>
<WarnCount>0</WarnCount>
</AlertsGenerated>
</Resource>
</Aggregation>
</Report>
Iterate over the KeyFacts elements and get the text value of each one's TotalCalls child element.
import xml.etree.ElementTree as ET

tree = ET.parse('Z:/VSCode/Python/CallStats/calls.xml')
# Visit each KeyFacts element, then read only its direct TotalCalls child
for kf in tree.iter('KeyFacts'):
    print(kf.findtext("TotalCalls"))
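If you would rather not loop, the same lookup can probably be written as a single path expression, since ElementTree's find/findall/findtext accept a limited XPath subset. Here is a minimal sketch; it assumes a trimmed-down inline copy of the report (the SAMPLE string is a stand-in for illustration, not the real file) so that it runs on its own:

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the report: only the elements needed here.
SAMPLE = """<Report version="1">
  <Aggregation name="Summary">
    <Resource name="TestInstance">
      <Overview>
        <TotalCalls>565</TotalCalls>
      </Overview>
      <KeyFacts>
        <TotalCalls>565</TotalCalls>
        <TotalRecords>2339</TotalRecords>
      </KeyFacts>
    </Resource>
  </Aggregation>
</Report>"""

root = ET.fromstring(SAMPLE)

# ".//KeyFacts/TotalCalls": any KeyFacts descendant of root,
# then its direct TotalCalls child. Overview/TotalCalls is not matched.
total_calls = root.findtext(".//KeyFacts/TotalCalls")

# The same element addressed by its full path, relative to the root element.
same = root.findtext("Aggregation/Resource/KeyFacts/TotalCalls")

# findall returns a list, which makes it easy to check for duplicates.
matches = root.findall(".//KeyFacts/TotalCalls")

print(total_calls, same, len(matches))
```

With the real file you would get root from ET.parse(...).getroot() instead of ET.fromstring. Note that findtext returns None when nothing matches, so it is worth checking the result before converting it with int().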