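How do I find a specific tag in an XML file in Python?
I have an XML file in which I am trying to find a specific tag, but the tags are not always nested in the same hierarchy. I want to find the "MotionVector" tags and then compute the average motion vector value for a specific frame type (P, B, or I frames). Part of the XML file is shown below: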
<Picture id="1" poc="1">
<GOPNr>0</GOPNr>
<SubPicture structure="0">
<Slice num="0">
<Type>0</Type>
<TypeString>SLICE_TYPE_P</TypeString>
<NAL>
<Num>5</Num>
<Type>1</Type>
<TypeString>NALU_TYPE_SLICE</TypeString>
<Length>47048</Length>
</NAL>
<MacroBlock num="0">
<MotionVector list="0">
<RefIdx>0</RefIdx>
<Difference>
<X>184</X>
<Y>149</Y>
</Difference>
<Absolute>
<X>184</X>
<Y>149</Y>
</Absolute>
</MotionVector>
<MotionVector list="0">
<RefIdx>0</RefIdx>
<Difference>
<X>10</X>
<Y>0</Y>
</Difference>
<Absolute>
<X>194</X>
<Y>149</Y>
</Absolute>
</MotionVector>
<Position>
<X>0</X>
<Y>0</Y>
</Position>
<QP_Y>21</QP_Y>
<Type>1</Type>
<TypeString>P_L0_L0_16x8</TypeString>
<PredModeString>BLOCK_TYPE_P</PredModeString>
<SkipFlag>0</SkipFlag>
</MacroBlock>
<MacroBlock num="1">
<SubMacroBlock num="0">
<Type>0</Type>
<TypeString>P_L0_8x8</TypeString>
<MotionVector list="0">
<RefIdx>0</RefIdx>
<Difference>
<X>8</X>
<Y>-1</Y>
</Difference>
<Absolute>
<X>192</X>
<Y>148</Y>
</Absolute>
</MotionVector>
</SubMacroBlock>
</MacroBlock>
</Slice>
</SubPicture>
</Picture>
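As you can see, the tag path for reaching the X and Y values is Picture/SubPicture/Slice/MacroBlock/MotionVector/Absolute/X, but sometimes it is Picture/SubPicture/Slice/MacroBlock/SubMacroBlock/MotionVector/Absolute/X. So when I use this code
abs_x_tag=list(qpy_node.text for qpy_node in root.findall('Picture/SubPicture/Slice/MacroBlock/SubMacroBlock/MotionVector/Absolute/X'))
to extract all the X values, it does not extract all of them. I also have to compute the motion vectors separately for each frame type, based on this tag:
<TypeString>SLICE_TYPE_P</TypeString>
Given these constraints, I do not know how to extract the X and Y values for each frame type separately. I can extract all X and Y values with code like the above, but I do not know how to find them by frame type. Can you help me solve this problem? Thanks.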
Here is an example of how you can parse this XML with BeautifulSoup.
Install BeautifulSoup and lxml:
pip install BeautifulSoup4 lxml
Code:
from bs4 import BeautifulSoup
XML = """
<Picture id="1" poc="1">
<GOPNr>0</GOPNr>
<SubPicture structure="0">
<Slice num="0">
<Type>0</Type>
<TypeString>SLICE_TYPE_P</TypeString>
<NAL>
<Num>5</Num>
<Type>1</Type>
<TypeString>NALU_TYPE_SLICE</TypeString>
<Length>47048</Length>
</NAL>
<MacroBlock num="0">
<MotionVector list="0">
<RefIdx>0</RefIdx>
<Difference>
<X>184</X>
<Y>149</Y>
</Difference>
<Absolute>
<X>184</X>
<Y>149</Y>
</Absolute>
</MotionVector>
</MacroBlock>
</Slice>
</SubPicture>
</Picture>
"""
soup = BeautifulSoup(XML, 'xml')
slices = soup.find_all('Slice')
for slice_node in slices:
    # The first TypeString inside a Slice is the slice (frame) type.
    slice_type = slice_node.find('TypeString').text
    print(f"Type: {slice_type}")
    # find_all searches all descendants, so vectors nested in SubMacroBlock are found too.
    vectors = slice_node.find_all('MotionVector')
    for vector in vectors:
        print("Vector:")
        difference = vector.find('Difference')
        difference_x = difference.find('X').text
        difference_y = difference.find('Y').text
        absolute = vector.find('Absolute')
        absolute_x = absolute.find('X').text
        absolute_y = absolute.find('Y').text
        # At this point the slice type and the x, y values are all known.
        print(f"Difference: {difference_x}, {difference_y}")
        print(f"Absolute: {absolute_x}, {absolute_y}")
Output:
Type: SLICE_TYPE_P
Vector:
Difference: 184, 149
Absolute: 184, 149
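The question also asks for an average per frame type. A minimal sketch of that step is below, using the standard-library ElementTree rather than BeautifulSoup so that it runs without extra packages. The function name mean_vectors_by_slice_type and the trimmed SAMPLE string are made up for this illustration; the values come from the XML in the question. It assumes every MotionVector has an Absolute element with integer X and Y.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from statistics import mean

# Trimmed copy of the XML from the question (only the tags used here).
SAMPLE = """
<Picture id="1" poc="1">
  <SubPicture structure="0">
    <Slice num="0">
      <TypeString>SLICE_TYPE_P</TypeString>
      <MacroBlock num="0">
        <MotionVector list="0">
          <Absolute><X>184</X><Y>149</Y></Absolute>
        </MotionVector>
        <MotionVector list="0">
          <Absolute><X>194</X><Y>149</Y></Absolute>
        </MotionVector>
      </MacroBlock>
      <MacroBlock num="1">
        <SubMacroBlock num="0">
          <MotionVector list="0">
            <Absolute><X>192</X><Y>148</Y></Absolute>
          </MotionVector>
        </SubMacroBlock>
      </MacroBlock>
    </Slice>
  </SubPicture>
</Picture>
"""


def mean_vectors_by_slice_type(root):
    """Return {slice type: (mean X, mean Y)} over the Absolute motion vectors."""
    xs = defaultdict(list)
    ys = defaultdict(list)
    for slice_node in root.iter("Slice"):
        # findtext() with a bare tag only looks at direct children,
        # so the TypeString inside NAL or MacroBlock is not picked up.
        slice_type = slice_node.findtext("TypeString")
        # iter() walks every descendant, so vectors under SubMacroBlock count too.
        for vector in slice_node.iter("MotionVector"):
            xs[slice_type].append(int(vector.findtext("Absolute/X")))
            ys[slice_type].append(int(vector.findtext("Absolute/Y")))
    return {t: (mean(xs[t]), mean(ys[t])) for t in xs}


root = ET.fromstring(SAMPLE)
averages = mean_vectors_by_slice_type(root)
print(averages)
```

Because the vectors are collected with a descendant search, it does not matter whether a MotionVector sits directly under MacroBlock or one level deeper under SubMacroBlock.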
We can do this in a simpler way; see the code and output below:
import xml.etree.ElementTree as ET
SampleXML = """
<Picture id="1" poc="1">
<GOPNr>0</GOPNr>
<SubPicture structure="0">
<Slice num="0">
<Type>0</Type>
<TypeString>SLICE_TYPE_P</TypeString>
<NAL>
<Num>5</Num>
<Type>1</Type>
<TypeString>NALU_TYPE_SLICE</TypeString>
<Length>47048</Length>
</NAL>
<MacroBlock num="0">
<MotionVector list="0">
<RefIdx>0</RefIdx>
<Difference>
<X>184</X>
<Y>149</Y>
</Difference>
<Absolute>
<X>184</X>
<Y>149</Y>
</Absolute>
</MotionVector>
</MacroBlock>
</Slice>
</SubPicture>
</Picture>
"""
# If you are reading from an XML file, use the commented lines below and replace <InputXML> with the absolute path of the XML file.
# tree = ET.parse(r"<InputXML>")
# root = tree.getroot()
root = ET.fromstring(SampleXML)
TypeString = root.findall("./SubPicture/Slice/TypeString")
print("TypeString: ", TypeString[0].text)
# Concatenate both result lists; joining them with `or` would return only the first non-empty list.
abs_x_tag = root.findall("./SubPicture/Slice/MacroBlock/MotionVector/Absolute/X") + root.findall("./SubPicture/Slice/MacroBlock/SubMacroBlock/MotionVector/Absolute/X")
print("abs_x_tag: ", abs_x_tag[0].text)
Output:
TypeString:  SLICE_TYPE_P
abs_x_tag:  184
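To pick out the values for one frame type only, ElementTree's limited XPath support may be enough: a `[TypeString='...']` predicate selects the matching Slice elements, and a `.//` descendant search inside each slice reaches MotionVector at either nesting depth. The sketch below is one way to do it, not the only one. The helper name absolute_x_values is hypothetical, and the second slice (SLICE_TYPE_B with X=7, Y=3) is invented purely to show the filtering.

```python
import xml.etree.ElementTree as ET

# First slice uses values from the question; the B slice is invented for illustration.
SAMPLE = """
<Picture id="1" poc="1">
  <SubPicture structure="0">
    <Slice num="0">
      <TypeString>SLICE_TYPE_P</TypeString>
      <MacroBlock num="0">
        <MotionVector list="0"><Absolute><X>184</X><Y>149</Y></Absolute></MotionVector>
      </MacroBlock>
      <MacroBlock num="1">
        <SubMacroBlock num="0">
          <MotionVector list="0"><Absolute><X>192</X><Y>148</Y></Absolute></MotionVector>
        </SubMacroBlock>
      </MacroBlock>
    </Slice>
    <Slice num="1">
      <TypeString>SLICE_TYPE_B</TypeString>
      <MacroBlock num="0">
        <MotionVector list="0"><Absolute><X>7</X><Y>3</Y></Absolute></MotionVector>
      </MacroBlock>
    </Slice>
  </SubPicture>
</Picture>
"""


def absolute_x_values(root, slice_type):
    """Collect the Absolute/X values of every motion vector in slices of one type."""
    values = []
    # The predicate keeps only Slice elements whose direct TypeString child matches.
    for slice_node in root.findall(f".//Slice[TypeString='{slice_type}']"):
        # './/' searches all levels below the slice, so both
        # MacroBlock/MotionVector and MacroBlock/SubMacroBlock/MotionVector match.
        for x_node in slice_node.findall(".//MotionVector/Absolute/X"):
            values.append(int(x_node.text))
    return values


root = ET.fromstring(SAMPLE)
p_values = absolute_x_values(root, "SLICE_TYPE_P")
b_values = absolute_x_values(root, "SLICE_TYPE_B")
print(p_values, b_values)
```

The same function with "Absolute/Y" in place of "Absolute/X" would give the Y values, and the resulting lists can be averaged per frame type.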