Parsing an .xml file using Python: search and copy related data

I want to copy some data from an .xml file based on a search value. In the XML file below, I want to search for 0xCCB7B836 and copy the data that belongs to it:

   4e564d2d52656648
   6173685374617274 
   1782af065966579e 
   899885d440d3ad67 
   d04b41b15e2b13c2

Another example: searching for 0xECFBBA1A should return 0000

Searching for 0xA54E2B5A should return 30d4

 <MEM_DATA>
  <MEM_SECTOR>
     <MEM_SECTOR_NUMBER>0</MEM_SECTOR_NUMBER>
     <MEM_SECTOR_STATUS>ACTIVE</MEM_SECTOR_STATUS>
     <MEM_SECTOR_STARTADR>0x800000</MEM_SECTOR_STARTADR>
     <MEM_SECTOR_ENDADR>0x0</MEM_SECTOR_ENDADR>
     <MEM_SECTOR_COUNTER>0x1</MEM_SECTOR_COUNTER>
     <MEM_ERASED_MARKER>SET</MEM_ERASED_MARKER>
     <MEM_USED_MARKER>SET</MEM_USED_MARKER>
     <MEM_FULL_MARKER>NOT_SET</MEM_FULL_MARKER>
     <MEM_ERASE_MARKER>NOT_SET</MEM_ERASE_MARKER>
     <MEM_START_MARKER>SET</MEM_START_MARKER>
     <MEM_START_OFFSET>0x1</MEM_START_OFFSET>
     <MEM_CLONE_MARKER>NOT_SET</MEM_CLONE_MARKER>
       <MEM_BLOCK>
         <MEM_BLOCK_ID>0x101</MEM_BLOCK_ID>
         <MEM_BLOCK_NAME>UNKNOWN</MEM_BLOCK_NAME>
         <MEM_BLOCK_STATUS>VALID</MEM_BLOCK_STATUS>
         <MEM_BLOCK_FLAGS>0x0</MEM_BLOCK_FLAGS>
         <MEM_BLOCK_STORAGE>Emulation</MEM_BLOCK_STORAGE>
         <MEM_BLOCK_LEN>0x28</MEM_BLOCK_LEN>
         <MEM_BLOCK_VERSION>0x0</MEM_BLOCK_VERSION>
         <MEM_BLOCK_HEADER_CRC>0xE527</MEM_BLOCK_HEADER_CRC>
         <MEM_BLOCK_CRC>0xCCB7B836</MEM_BLOCK_CRC>
         <MEM_BLOCK_CRC2>None</MEM_BLOCK_CRC2>
         <MEM_BLOCK_DATA> 
           <MEM_PAGE_DATA>4e564d2d52656648</MEM_PAGE_DATA> 
           <MEM_PAGE_DATA>6173685374617274</MEM_PAGE_DATA> 
           <MEM_PAGE_DATA>1782af065966579e</MEM_PAGE_DATA> 
           <MEM_PAGE_DATA>899885d440d3ad67</MEM_PAGE_DATA> 
           <MEM_PAGE_DATA>d04b41b15e2b13c2</MEM_PAGE_DATA> 
         </MEM_BLOCK_DATA>
       </MEM_BLOCK>
       <MEM_BLOCK>
         <MEM_BLOCK_ID>0x20F</MEM_BLOCK_ID>
         <MEM_BLOCK_NAME>UNKNOWN</MEM_BLOCK_NAME>
         <MEM_BLOCK_STATUS>VALID</MEM_BLOCK_STATUS>
         <MEM_BLOCK_FLAGS>0x0</MEM_BLOCK_FLAGS>
         <MEM_BLOCK_STORAGE>Emulation</MEM_BLOCK_STORAGE>
         <MEM_BLOCK_LEN>0x2</MEM_BLOCK_LEN>
         <MEM_BLOCK_VERSION>0x0</MEM_BLOCK_VERSION>
         <MEM_BLOCK_HEADER_CRC>0xE0D2</MEM_BLOCK_HEADER_CRC>
         <MEM_BLOCK_CRC>0xECFBBA1A</MEM_BLOCK_CRC>
         <MEM_BLOCK_CRC2>None</MEM_BLOCK_CRC2>
         <MEM_BLOCK_DATA> 
           <MEM_PAGE_DATA>0000</MEM_PAGE_DATA> 
         </MEM_BLOCK_DATA>
       </MEM_BLOCK>
       <MEM_BLOCK>
         <MEM_BLOCK_ID>0x1F8</MEM_BLOCK_ID>
         <MEM_BLOCK_NAME>UNKNOWN</MEM_BLOCK_NAME>
         <MEM_BLOCK_STATUS>VALID</MEM_BLOCK_STATUS>
         <MEM_BLOCK_FLAGS>0x0</MEM_BLOCK_FLAGS>
         <MEM_BLOCK_STORAGE>Emulation</MEM_BLOCK_STORAGE>
         <MEM_BLOCK_LEN>0x2</MEM_BLOCK_LEN>
         <MEM_BLOCK_VERSION>0x0</MEM_BLOCK_VERSION>
         <MEM_BLOCK_HEADER_CRC>0x1DCC</MEM_BLOCK_HEADER_CRC>
         <MEM_BLOCK_CRC>0xA54E2B5A</MEM_BLOCK_CRC>
         <MEM_BLOCK_CRC2>None</MEM_BLOCK_CRC2>
         <MEM_BLOCK_DATA> 
           <MEM_PAGE_DATA>30d4</MEM_PAGE_DATA> 
         </MEM_BLOCK_DATA>
       </MEM_BLOCK>
  </MEM_SECTOR>
</MEM_DATA>

Assuming this XML data is stored in a file named test.xml, you can do the following:

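import xml.etree.ElementTree as ET

tree = ET.parse('test.xml')
root = tree.getroot()

def search_and_copy(query):
    # Walk every MEM_BLOCK under each MEM_SECTOR
    for child in root.findall("MEM_SECTOR/MEM_BLOCK"):
        # findtext() returns None instead of raising if MEM_BLOCK_CRC is missing
        if child.findtext("MEM_BLOCK_CRC") == query:
            # Collect the text of every element inside MEM_BLOCK_DATA
            return [item.text for item in child.findall("MEM_BLOCK_DATA/*")]
    return None  # no block has this CRC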

Let's try the search_and_copy() function:

>>> search_and_copy("0xCCB7B836")
['4e564d2d52656648', '6173685374617274', '1782af065966579e', '899885d440d3ad67', 'd04b41b15e2b13c2']

>>> search_and_copy("0xA54E2B5A")
['30d4']
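
If many CRC values need to be looked up, scanning the tree on every call can be replaced by a dictionary built once. The following is only a sketch under a few assumptions: `build_crc_index` and `SAMPLE` are hypothetical names that do not come from the question, the sample is a trimmed-down copy of the XML with the same structure, and when two blocks share a CRC only the first one is kept.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample with the same structure as test.xml (two blocks only).
SAMPLE = """
<MEM_DATA>
  <MEM_SECTOR>
    <MEM_BLOCK>
      <MEM_BLOCK_CRC>0xECFBBA1A</MEM_BLOCK_CRC>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>0000</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
    <MEM_BLOCK>
      <MEM_BLOCK_CRC>0xA54E2B5A</MEM_BLOCK_CRC>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>30d4</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
  </MEM_SECTOR>
</MEM_DATA>
"""

def build_crc_index(root):
    """Map each MEM_BLOCK_CRC value to the list of its MEM_PAGE_DATA texts."""
    index = {}
    for block in root.findall("MEM_SECTOR/MEM_BLOCK"):
        crc = block.findtext("MEM_BLOCK_CRC")
        if crc is None:
            continue  # skip blocks that have no CRC element
        pages = [page.text for page in block.findall("MEM_BLOCK_DATA/MEM_PAGE_DATA")]
        # setdefault keeps the first block if two blocks share a CRC (assumption)
        index.setdefault(crc.strip(), pages)
    return index

sample_root = ET.fromstring(SAMPLE)
crc_index = build_crc_index(sample_root)

print(crc_index.get("0xA54E2B5A", []))  # ['30d4']
print(crc_index.get("0xDEADBEEF", []))  # [] because no block has this CRC
```

Using `dict.get` with a default of `[]` gives an empty list for an unknown CRC, which may be easier to handle downstream than `None`.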

We can use XPath, with Python's xml.etree and the elementpath package, to write a function that retrieves the data:

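Breakdown of the code below (inside elementpath.Selector):
1. The first line finds the element whose text equals our search string.
2. The second line, .., steps back to get the parent element.
3. Starting from the parent element, this line searches for MEM_PAGE_DATA inside the parent. That element contains the data we are really interested in.
4. The rest of the code just pulls the text from the matches.
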
import xml.etree.ElementTree as ET
import elementpath

#wrapped the shared data into a test.xml file
root = ET.parse('test.xml').getroot()

def find_data(search_string):          
    selector = elementpath.Selector(f""".//*[text()='{search_string}'] 
                                        //..
                                        //MEM_PAGE_DATA""")
    #pull text from the match
    result = [entry.text for entry in selector.select(root)]
    return result

Testing with the provided strings:

find_data("0xCCB7B836")

['4e564d2d52656648',
 '6173685374617274',
 '1782af065966579e',
 '899885d440d3ad67',
 'd04b41b15e2b13c2']


find_data("0xECFBBA1A")

['0000']

find_data("0xA54E2B5A")

['30d4']
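
If installing elementpath is not an option, the limited XPath subset built into xml.etree.ElementTree may be enough for this particular search. The sketch below assumes the block layout shown in the question; `find_data_stdlib` and `SAMPLE` are hypothetical names, and the predicate `[MEM_BLOCK_CRC='...']` selects MEM_BLOCK elements that have a MEM_BLOCK_CRC child with exactly that text. Unlike the `text()` search above, it only looks at MEM_BLOCK_CRC, not at every element.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the question's data: one five-page block, one single-page block.
SAMPLE = """
<MEM_DATA>
  <MEM_SECTOR>
    <MEM_BLOCK>
      <MEM_BLOCK_CRC>0xCCB7B836</MEM_BLOCK_CRC>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>4e564d2d52656648</MEM_PAGE_DATA>
        <MEM_PAGE_DATA>6173685374617274</MEM_PAGE_DATA>
        <MEM_PAGE_DATA>1782af065966579e</MEM_PAGE_DATA>
        <MEM_PAGE_DATA>899885d440d3ad67</MEM_PAGE_DATA>
        <MEM_PAGE_DATA>d04b41b15e2b13c2</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
    <MEM_BLOCK>
      <MEM_BLOCK_CRC>0xECFBBA1A</MEM_BLOCK_CRC>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>0000</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
  </MEM_SECTOR>
</MEM_DATA>
"""

def find_data_stdlib(root, crc):
    """Return the MEM_PAGE_DATA texts of the block whose MEM_BLOCK_CRC equals crc."""
    # Note: crc is pasted into the path, so it must not contain a single quote.
    path = f".//MEM_BLOCK[MEM_BLOCK_CRC='{crc}']/MEM_BLOCK_DATA/MEM_PAGE_DATA"
    return [page.text for page in root.findall(path)]

sample_root = ET.fromstring(SAMPLE)

print(find_data_stdlib(sample_root, "0xECFBBA1A"))       # ['0000']
print(len(find_data_stdlib(sample_root, "0xCCB7B836")))  # 5
print(find_data_stdlib(sample_root, "0x12345678"))       # []
```

The ElementTree predicate compares the complete text of the child element, so a CRC value padded with whitespace in the file would not match without extra handling.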