Capture an XML tag and the related tag that follows it, if one exists
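I want to extract the programme title and sub-title from the (clipped) XML file below. I was using xmllint and sed to extract each of them separately and then combine them into one file, but I later found that occasionally an entry has a title and no sub-title. In that case I would like the sub-title to be left blank. Could someone suggest a way to account for this discrepancy?
XML file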
<programme start="20171013170000 +0100" stop="20171013180000 +0100" channel="b492458d826d592ec7c528545a16c757">
<title lang="eng">Accessories Gift Hall</title>
<sub-title lang="eng">Find the perfect gift with fashion accessories by some of our most sought-after brands. From chic purses and wallets to cosy PJs and slippers, there's something for everyone.</sub-title>
</programme>
<programme start="20171013180000 +0100" stop="20171014130000 +0100" channel="b492458d826d592ec7c528545a16c757">
<title lang="eng">..programmes start again at 1pm</title>
</programme>
<programme start="20171014130000 +0100" stop="20171014140000 +0100" channel="b492458d826d592ec7c528545a16c757">
<title lang="eng">Ruth Langsford's Fashion Edit</title>
<sub-title lang="eng">TV personality and QVC fashion ambassador, Ruth Langsford, shares her favourite looks and must-have pieces that will transform your wardrobe and have you looking fabulously stylish.</sub-title>
</programme>
Bash commands v1
xmllint --xpath "//programme/title" xmltv | sed -r 's/\n//g' | sed 's/<\/title>/\n/g' | sed 's/<title lang="eng">//g' > 1.txt
xmllint --xpath "//programme/sub-title" xmltv | sed -r 's/\n//g' | sed 's/<\/sub-title>/\n/g' | sed 's/<sub-title lang="eng">//g' > 2.txt
paste <(cat 1.txt) <(cat 2.txt) > 3.txt
Thanks!
What I would do:
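#!/bin/bash
file=/tmp/file.xml
# Number of <programme> elements in the file
count=$(xmllint --xpath "count(//programme)" "$file")
for ((i=1; i<=count; i++)); do
    # string() returns an empty string (and no "XPath set is empty" error)
    # when the programme has no sub-title
    title=$(xmllint --xpath "string(//programme[$i]/title)" "$file")
    subtitle=$(xmllint --xpath "string(//programme[$i]/sub-title)" "$file")
    printf '%s|%s\n' "$title" "$subtitle"
done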
In a single pass with sed:
sed '/<title/!d;N;/<sub-title/!s/\n.*//' XML File
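The one-liner above leaves the tags in place and expects the sub-title, when present, on the line directly after the title. As a rough sketch of the same single-pass idea that also strips the tags, the following uses awk instead of sed. It assumes, as in the sample, that each title and sub-title element sits on a single line of its own; the function name extract_titles and the '|' separator are illustrative choices, not part of the original answer.

```bash
#!/bin/bash
# Hypothetical helper: print "title|sub-title" for every <programme>.
# Assumes each <title> and <sub-title> element fits on one line.
extract_titles() {
    awk '
        # Remove all tags, then trim leading/trailing blanks
        function strip(s) {
            gsub(/<[^>]*>/, "", s)
            gsub(/^[ \t]+|[ \t]+$/, "", s)
            return s
        }
        /<programme[ >]/ { title = ""; sub_title = "" }   # new record
        /<title[ >]/     { title = strip($0) }
        /<sub-title[ >]/ { sub_title = strip($0) }
        /<\/programme>/  { print title "|" sub_title }    # end of record
    ' "$1"
}
```

Called as extract_titles file.xml, this prints one line per programme; an entry without a sub-title ends with the bare separator, for example "..programmes start again at 1pm|". Because it is line-based rather than a real XML parser, it will not cope with elements that span several lines or with XML entities.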
Here is an example using the sel command of the xmlstarlet command-line tool:
$ xmlstarlet sel -T -t -m '//programme' -v 'concat(normalize-space(title)," ",normalize-space(sub-title))' -n input.xml
Accessories Gift Hall Find the perfect gift with fashion accessories by some of our most sought-after brands. From chic purses and wallets to cosy PJs and slippers, there's something for everyone.
..programmes start again at 1pm
Ruth Langsford's Fashion Edit TV personality and QVC fashion ambassador, Ruth Langsford, shares her favourite looks and must-have pieces that will transform your wardrobe and have you looking fabulously stylish.
I separated the title and sub-title with a single space, but that can be changed.
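To make the separator change concrete, here is a small sketch that wraps the same xmlstarlet call and swaps the " " inside concat() for a chosen separator. The function name titles_with_sep and the default '|' are assumptions made for illustration; the XPath expression is otherwise the one shown above.

```bash
#!/bin/bash
# Hypothetical wrapper: titles_with_sep FILE [SEP]
# SEP must not contain a single quote, since it is placed inside an
# XPath string literal delimited by single quotes.
titles_with_sep() {
    local file=$1
    local sep=${2:-|}
    xmlstarlet sel -T -t -m '//programme' \
        -v "concat(normalize-space(title),'${sep}',normalize-space(sub-title))" \
        -n "$file"
}
```

For example, titles_with_sep input.xml $'\t' would give tab-separated output. A programme without a sub-title still produces the separator followed by nothing, so the columns stay aligned.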