How to add a space before and after CDATA in an XML file

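I want to write a function that modifies the content of an XML file without changing its formatting. I managed to change the text, but not without altering the formatting of the XML. So what I want to do now is add a space before and after the CDATA section in the XML file.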

Original XML file:

<?xml version="1.0" encoding="utf-8"?>
<Maps xmlns="http://www.semi.org">
  <Map>
    <Device>
      <ReferenceDevice/>
      <Bin>
        <Bin Bin="001"/>
      </Bin>
      <Data>
        <Row> <![CDATA[001 001 001]]> </Row>
      </Data>
    </Device>
  </Map>
</Maps>

This is the result I get:

<?xml version="1.0" encoding="utf-8"?>
<Maps xmlns="http://www.semi.org">
  <Map>
    <Device>
      <ReferenceDevice/>
      <Bin>
        <Bin Bin="001"/>
      </Bin>
      <Data>
        <Row><![CDATA[001 001 099]]></Row>
      </Data>
    </Device>
  </Map>
</Maps>

However, I want the new XML to look like this:

<?xml version="1.0" encoding="utf-8"?>
<Maps xmlns="http://www.semi.org">
  <Map>
    <Device>
      <ReferenceDevice/>
      <Bin>
        <Bin Bin="001"/>
      </Bin>
      <Data>
        <Row> <![CDATA[001 001 099]]> </Row>
      </Data>
    </Device>
  </Map>
</Maps>

Here is my code:

from lxml import etree as ET

def xml_new(f,fpath,newtext,xmlrow):
    xmlrow = 19  # hard-coded row index; this overrides the xmlrow argument
    parser = ET.XMLParser(strip_cdata=False)  # keep CDATA sections when parsing
    tree = ET.parse(f, parser)
    root = tree.getroot()
    for child in root:
        # child is <Map>, child[0] is <Device>, child[0][2] is <Data>
        value = child[0][2][xmlrow].text

    text = ET.CDATA("001 001 099")
    # swap in a fresh <Row> element whose text is the new CDATA section
    child[0][2][xmlrow] = ET.Element('Row')
    child[0][2][xmlrow].text = text
    child[0][2][xmlrow].tail = "\n"
    ET.register_namespace('A', "http://www.semi.org")
    tree.write(fpath,encoding='utf-8',xml_declaration=True)
    return value

Can anyone help me? Thanks in advance!

I'm not quite sure what you are trying to do. Here is an example; I don't know whether it meets your needs.

from simplified_scrapy import SimplifiedDoc
html ='''<?xml version="1.0" encoding="utf-8"?>
<Maps xmlns="http://www.semi.org">
  <Map>
    <Device>
      <ReferenceDevice/>
      <Bin>
        <Bin Bin="001"/>
      </Bin>
      <Data>
        <Row> <![CDATA[001 001 001]]> </Row>
      </Data>
    </Device>
  </Map>
</Maps>'''
doc = SimplifiedDoc(html)
row = doc.Data.Row # Get the node you want to modify.
row.setContent(" "+row.html+" ") # Modify the node content.
print (doc.html)

Result:

<?xml version="1.0" encoding="utf-8"?>
<Maps xmlns="http://www.semi.org">
  <Map>
    <Device>
      <ReferenceDevice />
      <Bin>
        <Bin Bin="001" />
      </Bin>
      <Data>
        <Row>  <![CDATA[001 001 001]]>  </Row>
      </Data>
    </Device>
  </Map>
</Maps>
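
Note that the result above ends up with two spaces on each side, because the input already had one. As an alternative, here is a minimal standard-library sketch that works on the serialized XML text instead of the parsed tree and leaves exactly one space on each side of every CDATA section. The function name pad_cdata is made up for this illustration, and the sketch assumes the CDATA section sits on the same line as its surrounding tags, as in the sample file.

```python
import re

# Matches a CDATA section together with any spaces or tabs directly around it.
# Newlines are deliberately not consumed, so line breaks and indentation stay intact.
_CDATA_WITH_PADDING = re.compile(r'[ \t]*(<!\[CDATA\[.*?\]\]>)[ \t]*', re.DOTALL)

def pad_cdata(xml_text):
    """Return xml_text with exactly one space before and after each CDATA section."""
    return _CDATA_WITH_PADDING.sub(lambda m: ' ' + m.group(1) + ' ', xml_text)

if __name__ == '__main__':
    print(pad_cdata('<Row><![CDATA[001 001 099]]></Row>'))
```

One possible way to use this is to let lxml write the file as in the question, then read the file back as text, pass it through pad_cdata, and write it out again. That keeps the tree-based editing and treats the spacing as a final text-level fix.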

Thanks for your help. I found another way to get the result I wanted.

Here is the code:

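# `file` and `newfile` are the input and output paths, defined elsewhere

# what you want to change
replaceby = '020]]> </Row>\n'
# which CDATA line you want to change (1 = first line containing CDATA)
row = 1
# index of the token to change after splitting the line on single spaces
# (note that leading indentation produces empty tokens)
col = 3

# read the whole file as a list of lines
with open(file, 'r') as src:
    lines = src.readlines()

i = 0
editedXML = []
for l in lines:
    if 'cdata' in l.lower():
        i = i + 1
        if i == row:
            # swap the selected token and rebuild the line
            tokens = l.split(' ')
            if 0 <= col < len(tokens):
                tokens[col] = replaceby
            editedXML.append(' '.join(tokens))
        else:
            editedXML.append(l)
    else:
        editedXML.append(l)

# write the edited lines to the new file
with open(newfile, 'w') as dst:
    dst.write(''.join(editedXML))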
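
The line-based approach above depends on how many spaces each line happens to contain. As a variation on the same idea, here is a rough sketch that edits only the payload of the selected CDATA section and leaves everything around it, including the spaces before and after the CDATA, untouched. The function name replace_cdata_value and its parameters are invented for this example; row is assumed to be 1-based (as in the code above) and col is assumed to be a 0-based index into the whitespace-separated values inside the CDATA section.

```python
import re

# Captures the payload of a CDATA section.
_CDATA = re.compile(r'<!\[CDATA\[(.*?)\]\]>', re.DOTALL)

def replace_cdata_value(xml_text, row, col, new_value):
    """Replace value `col` (0-based) inside CDATA section number `row` (1-based)."""
    seen = 0

    def _swap(match):
        nonlocal seen
        seen += 1
        if seen != row:
            return match.group(0)  # leave other CDATA sections untouched
        values = match.group(1).split()
        values[col] = new_value  # raises IndexError if col is out of range
        # values in the edited section are re-joined with single spaces
        return '<![CDATA[' + ' '.join(values) + ']]>'

    return _CDATA.sub(_swap, xml_text)

if __name__ == '__main__':
    sample = '<Row> <![CDATA[001 001 001]]> </Row>\n'
    print(replace_cdata_value(sample, row=1, col=2, new_value='020'), end='')
```

Because only the text between <![CDATA[ and ]]> is rewritten, the XML declaration, indentation, and self-closing tags come through exactly as they were in the input.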