Conditional extraction of XML attributes with xmlstarlet
I have some XML (for example, in a file minimal.xml) that contains error and warning messages in the following format:
<?xml version="1.0" encoding="UTF-8"?>
<messages>
<message subMessage="RSC-004">RSC-004, ERROR, [File 'OEBPS/Text/pdfMigration.html' could not be decrypted.], epub20_encryption_binary_content.epub</message>
<message subMessage="RSC-012">RSC-012, ERROR, [Fragment identifier is not defined.], OEBPS/toc.ncx (24-67)</message>
<message subMessage="RSC-012">RSC-012, ERROR, [Fragment identifier is not defined.], OEBPS/toc.ncx (30-82)</message>
<message subMessage="RSC-012">RSC-012, ERROR, [Fragment identifier is not defined.], OEBPS/toc.ncx (36-81)</message>
<message subMessage="RSC-012">RSC-012, ERROR, [Fragment identifier is not defined.], OEBPS/toc.ncx (42-75)</message>
<message subMessage="RSC-012">RSC-012, ERROR, [Fragment identifier is not defined.], OEBPS/toc.ncx (48-61)</message>
<message subMessage="HTM-023">HTM-023, WARN, [An invalid XHTML Named Entity was found: '&amp;0;'.], OEBPS/Text/pdfMigration.html (18-199)</message>
<message subMessage="HTM-023">HTM-023, WARN, [An invalid XHTML Named Entity was found: '&amp;l0xb'.], OEBPS/Text/pdfMigration.html (291-6)</message>
</messages>
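I'm looking for a way to extract the subMessage attribute values of all message elements that are errors (identified by the presence of ERROR in the text value of the message element). I'm using xmlstarlet for this. After some searching I found an example that looked relevant, which I adapted as follows: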
xmlstarlet sel -t -v '/messages[contains(message,"ERROR")]/message/@subMessage' minimal.xml
Result:
RSC-004
RSC-012
RSC-012
RSC-012
RSC-012
RSC-012
HTM-023
HTM-023
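This is not what I expected, because these are the subMessage values of all message elements! As a further test, I changed the query to extract only the warnings: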
xmlstarlet sel -t -v '/messages[contains(message,"WARN")]/message/@subMessage' minimal.xml
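In this case the result is empty! I'm new to xmlstarlet, and I suspect I'm overlooking something obvious here. Any help is much appreciated!
By the way, some information on the xmlstarlet version I'm using: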
compiled against libxml2 2.9.2, linked with 20903
compiled against libxslt 1.1.28, linked with 10128
Try this:
xmlstarlet sel -t -v '/messages/message[contains(.,"ERROR")]/@subMessage' minimal.xml
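With /messages[contains(message,"WARN")] the predicate is attached to the messages element, so you are mistakenly testing the messages element as a whole rather than the content of each message element.
You need to move the predicate to message, like this: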
xmlstarlet sel -t -v "/messages/message[contains(.,'WARN')]/@subMessage" minimal.xml
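To see why the original query returned everything for ERROR and nothing for WARN: in XPath 1.0, when a node-set such as message is passed to contains(), it is converted to the string value of its first node only. So the predicate on messages effectively asks "does the first message contain the keyword?" and then either keeps or drops the whole messages element. The following is a rough Python sketch that imitates both behaviours with the standard library (it does not call xmlstarlet). The helper names select_wrong and select_fixed and the shortened three-message sample are my own, purely for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical version of minimal.xml (three messages only).
SAMPLE = """<messages>
<message subMessage="RSC-004">RSC-004, ERROR, [File 'OEBPS/Text/pdfMigration.html' could not be decrypted.], epub20_encryption_binary_content.epub</message>
<message subMessage="RSC-012">RSC-012, ERROR, [Fragment identifier is not defined.], OEBPS/toc.ncx (24-67)</message>
<message subMessage="HTM-023">HTM-023, WARN, [An invalid XHTML Named Entity was found: '&amp;0;'.], OEBPS/Text/pdfMigration.html (18-199)</message>
</messages>"""


def text_of(element):
    """String value of an element: all of its descendant text joined together."""
    return "".join(element.itertext())


def select_wrong(root, keyword):
    """Imitates /messages[contains(message, KEYWORD)]/message/@subMessage.

    The node-set `message` is reduced to the string value of its FIRST node,
    so the test is all-or-nothing for the whole <messages> element.
    """
    messages = root.findall("message")
    first_text = text_of(messages[0]) if messages else ""
    if keyword not in first_text:
        return []
    return [m.get("subMessage") for m in messages if m.get("subMessage") is not None]


def select_fixed(root, keyword):
    """Imitates /messages/message[contains(., KEYWORD)]/@subMessage.

    The test is applied to each <message> element individually.
    """
    return [
        m.get("subMessage")
        for m in root.findall("message")
        if keyword in text_of(m) and m.get("subMessage") is not None
    ]


root = ET.fromstring(SAMPLE)

print(select_wrong(root, "ERROR"))  # ['RSC-004', 'RSC-012', 'HTM-023']
print(select_wrong(root, "WARN"))   # []
print(select_fixed(root, "ERROR"))  # ['RSC-004', 'RSC-012']
print(select_fixed(root, "WARN"))   # ['HTM-023']
```

The first message in the sample is an ERROR, which is why the misplaced predicate matched for "ERROR" (returning every subMessage) and failed for "WARN" (returning nothing), just like in the question.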