Getting a specific child in an XML file in Python
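How do I get the contents of the "Code" and "ModificationDate" attributes? The picture shows the contents of the "IntraModelReport" tag. I know my code prints everything inside the IntraModelReport tag, but I only want those two attributes from it. I should also mention that I get an error message at the end that says "find() missing 1 required positional argument: 'self'".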
from bs4 import BeautifulSoup
with open(r'INTERACTION CDM.cdm') as f:
    data = f.read()
#passing the stored data inside the beautiful soup parser
soup = BeautifulSoup(data, 'xml')
unquieID = soup.find('ObjectID')
print(unquieID)
#Finding all instances of a tag.
intraModelReportTag = soup.find("IntraModelReport")
print(intraModelReportTag)
tag = soup.find(attrs={"IntraModelReport" : "Code"})
output = tag['Code']
print(tag)
print(output)
<Model xmlns:a="attribute" xmlns:c="collection" xmlns:o="object">
<o:RootObject Id="o1">
<a:SessionID>00000000-0000-0000-0000-000000000000</a:SessionID>
<c:Children>
<o:Model Id="o2">
<a:ObjectID>875D4C90-849D-43C2-827A-0BE7CA7265A4</a:ObjectID>
<a:Name>INTERACTION CDM</a:Name>
<a:Code>INTERACTION_CDM</a:Code>
<a:CreationDate>1578996736</a:CreationDate>
<a:Creator>b0000001</a:Creator>
<a:ModificationDate>1582198848</a:ModificationDate>
<a:Modifier>b0000001</a:Modifier>
<a:PackageOptionsText>[FolderOptions]
[FolderOptions\Conceptual Data Objects]
GenerationCheckModel=Yes
GenerationPath=
GenerationOptions=
GenerationTasks=
GenerationTargets=
GenerationSelections=</a:PackageOptionsText>
<a:ModelOptionsText>[ModelOptions]
.....
<c:Reports>
<o:IntraModelReport Id="o76">
<a:ObjectID>72517613-3F32-4E3D-8E4A-CDD186B0CBA3</a:ObjectID>
<a:Name>INTERACTION CDM</a:Name>
<a:Code>INTERACTION_CDM</a:Code>
<a:CreationDate>1578997381</a:CreationDate>
<a:ModificationDate>1578997500</a:ModificationDate>
<a:Modifier>b0000001</a:Modifier>
<a:ReportFirstPageTitle>INTERACTION CDM</a:ReportFirstPageTitle>
<a:ReportFirstPageAuthor>b0000001</a:ReportFirstPageAuthor>
<a:ReportFirstPageDate>%DATE%</a:ReportFirstPageDate>
<a:HtmlStylesheetFile>PWI_Theme.css</a:HtmlStylesheetFile>
<a:HtmlHeaderFile>Header_PWI.htm</a:HtmlHeaderFile>
<a:HtmlFooterFile>Footer_PWI.htm</a:HtmlFooterFile>
<a:HtmlHeaderSize>54</a:HtmlHeaderSize>
<a:HtmlFooterSize>18</a:HtmlFooterSize>
<a:HtmlTOCLevel>4</a:HtmlTOCLevel>
<a:HtmlHomePageFile>Home_PWI.html</a:HtmlHomePageFile>
<a:HtmlTemplate>PWI</a:HtmlTemplate>
<a:RtfTemplate>Professional</a:RtfTemplate>
<a:RtfUseSectionHeadFoot>1</a:RtfUseSectionHeadFoot>
<c:Paragraphs>
To get "Code" and "ModificationDate", look the tags up by their prefixed names, like this:
...
intra_model_report_tag = soup.find("o:IntraModelReport")
print(intra_model_report_tag.find("a:Code").text)
print(intra_model_report_tag.find("a:ModificationDate").text)
Output:
INTERACTION_CDM
1578997500
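For comparison, the same lookup can be sketched with the standard library's xml.etree.ElementTree instead of BeautifulSoup. This is only an illustration under stated assumptions: SAMPLE is a trimmed, hand-closed stand-in for the .cdm file (the real file is much longer), and the helper name report_fields is made up for this example. Note that "Code" and "ModificationDate" are child elements of IntraModelReport, not XML attributes, so they are found by tag name and read through their text.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for the .cdm file; only the parts
# needed for the lookup are kept, and the closing tags are added by hand.
SAMPLE = """<Model xmlns:a="attribute" xmlns:c="collection" xmlns:o="object">
  <o:RootObject Id="o1">
    <c:Children>
      <o:Model Id="o2">
        <a:Code>INTERACTION_CDM</a:Code>
        <a:ModificationDate>1582198848</a:ModificationDate>
        <c:Reports>
          <o:IntraModelReport Id="o76">
            <a:Code>INTERACTION_CDM</a:Code>
            <a:ModificationDate>1578997500</a:ModificationDate>
          </o:IntraModelReport>
        </c:Reports>
      </o:Model>
    </c:Children>
  </o:RootObject>
</Model>"""

# Prefix-to-namespace mapping, copied from the xmlns declarations on <Model>.
NS = {"a": "attribute", "c": "collection", "o": "object"}


def report_fields(xml_text):
    """Return (Code, ModificationDate) of the first IntraModelReport, or None."""
    root = ET.fromstring(xml_text)
    # ".//" searches all descendants, so the nesting depth does not matter.
    report = root.find(".//o:IntraModelReport", NS)
    if report is None:
        return None
    # Search only inside the report, so o:Model's own Code and
    # ModificationDate are not picked up by mistake.
    code = report.findtext("a:Code", namespaces=NS)
    modified = report.findtext("a:ModificationDate", namespaces=NS)
    return code, modified


if __name__ == "__main__":
    print(report_fields(SAMPLE))
```

Checking for None before reading the children avoids a confusing follow-up error when the tag is not found, which is also worth doing with the result of soup.find().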