Cannot parse XML file

Question

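I ran this file last week and it worked perfectly, but today it suddenly stopped working, even though I have not changed anything in it. Can you help me?

Here is my Python code: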

from lxml import etree
import numpy as np

#Parsing the xml file and creating lists
tree = etree.parse("AnyConv.com__CCSOPM Section 1_Master (1).xml")
root = tree.getroot()
Lista = []
tags = []

#Get the unique tags values
for element in root.iter():
    Lista.append(element.tag)
tags = np.unique(Lista)

#Show the unique tag[attributes] pairs
for tag in tags:
    print(tag,root.xpath(f'//{tag}')[0].attrib.keys())
    
#Changes the tag name to the comply365's tag's name
for p in tree.findall(".//sect1"):
    p.tag = ("section")
for p in tree.findall(".//sect1"):
    p.tag = ("section")
for p in tree.findall(".//informaltable"):
    p.tag = ("table")    
    
#Modify the tag's attributes to its desired form
for cy in root.xpath('//section'):
    cy.attrib['xmlns']='http://www.w3.org/2001/XMLSchema-instance'
    cy.attrib['id']='123'
    cy.attrib['type']='policy'
    cy.attrib['xsi']='urn:fontoxml:cpa.xsd:1.0'


for t in root.xpath('//title'):
    t.attrib['id']='123456789'
    
for p in root.xpath('//para'):
    p.attrib['id']='987654321'
    
for p in root.xpath('//table'):
    p.attrib['id']='11111'
    
for ct in root.xpath('//concept'):
    ct.attrib.pop("id", None)

#Print the new xml to make sure it worked:
#print(etree.tostring(root).decode())

    
tree.write("Resultado de tags XML-COMPLY365.xml")

This now results in:

OSError: Error reading file 'AnyConv.com__CCSOPM Section 1_Master (1).xml': failed to load external entity "AnyConv.com__CCSOPM Section 1_Master (1).xml"

If you have any solution, please feel free to post it below.

Answer 1

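The (1) part of the file name suggests that you already had a file named AnyConv.com__CCSOPM Section 1_Master .xml on your computer and then copied a file with the same name to your computer again.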

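Also note that your file name contains a space before (1), which is odd. This in turn suggests that the name of your "first" file (the one without (1)) ends with a space, which is very odd.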

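Verify that your file actually exists, or rename it to AnyConv.com__CCSOPM Section 1_Master.xml, i.e.:

without the trailing space,
without (1) at the end.

Then change the file name in your code accordingly.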
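The existence check can also be done from Python. Below is a minimal sketch, assuming the XML file is expected in the directory the script runs from; the helper name find_similar_files and the search fragment "CCSOPM" are illustrative choices, not part of the original code. Printing each name with repr() makes stray spaces and doubled underscores visible.

```python
from pathlib import Path


def find_similar_files(directory, fragment):
    """Return sorted names of files in `directory` whose name contains
    `fragment` (case-insensitive). Hypothetical helper for illustration."""
    fragment = fragment.lower()
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and fragment in p.name.lower()
    )


if __name__ == "__main__":
    # The name used in the question's code.
    expected = "AnyConv.com__CCSOPM Section 1_Master (1).xml"
    print("exists:", Path(expected).is_file())

    # repr() shows the exact characters, including spaces and underscores.
    for name in find_similar_files(".", "CCSOPM"):
        print(repr(name))
```

If the expected name is reported as missing but a similar name is listed, copying the listed name into the code is likely the quickest fix.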

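Also note that your file name contains a double underscore, which is also an unusual practice. Verify that the underscore in your file name is actually doubled.

Keeping many files with the same name on your computer (distinguished only by a "numeric suffix") is also bad practice. Referring to such "duplicate" files in your code is even worse. Rename the file so that it has no such numeric suffix (in this case (1)) and set the file name in your code to match.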

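One more hint, regardless of what your file name is:

In the File Explorer window, select the file,
press F2 to start editing the file name; at this point the selected text does not include the file extension (in this case ".xml"),
press Ctrl-A to extend the selection to the whole name (including the extension),
press Ctrl-C to copy the file name to the clipboard,
press Esc to close file name editing,
open your code editor, place the cursor where the file name is, and select the whole file name,
press Ctrl-V to replace it with the clipboard content.

Now your code contains the actual file name.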

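Also check that the directory containing this file is the one Python looks in when it opens the file. A relative file name such as the one in your code is resolved against the current working directory of the Python process, so if the script is now started from a different directory than last week, the file will not be found even though nothing in the code has changed.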
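To see where a relative name actually points, something like the following sketch may help. It uses only pathlib from the standard library; the function name describe_path and the returned dictionary keys are assumptions made for illustration.

```python
from pathlib import Path


def describe_path(file_name, base_dir=None):
    """Report where a relative file name points and whether it exists.

    base_dir defaults to the current working directory, which is what
    relative names are resolved against. Hypothetical helper.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    full = (base / file_name).resolve()
    return {
        "base": str(base),
        "full_path": str(full),
        "exists": full.is_file(),
    }


if __name__ == "__main__":
    info = describe_path("AnyConv.com__CCSOPM Section 1_Master (1).xml")
    for key, value in info.items():
        print(f"{key}: {value}")
```

If "exists" is False while the file is visible in File Explorer, comparing "base" with the folder shown in File Explorer should reveal the mismatch. Passing an absolute path to etree.parse is one way to make the script independent of the working directory.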

python

xml

lxml

xml-parsing