Associate an attribute value of a node with its child nodes
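I want to associate an attribute value of a node with its child nodes, across multiple XML files.
For example, my XML files contain this structure: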
<desc id="butwba10.1.wc.01" dbi="BUTWBA10.1.1.WC">
<objtitle>
<title type="transcribed">Title-Page Design for <hi rend="i">The Grave</hi>
</title>
<title type="alt">The Skeleton Re-Animated</title>
<title type="alt">A Characteristic Frontispiece</title>, <objid>
<objnumber code="A1">object 1 </objnumber>
</objid>
</objtitle>
<physdesc desclevel="brief">
<objsize>33.2 x 26.6 cm.</objsize>
<objnote>
adfadfa
</objnote>
<windowsize width="600" height="700"/>
</physdesc>
<related objectid="bb435.1.comdes.02"/>
<related objectid="but614r.1.penc.01"/>
<related objectid="but611.1.wc.01"/>
<related objectid="but612.1.wd.01"/>
<related objectid="bb515.1.comb.12"/>
</desc>
I want to extract the desc id (butwba10.1.wc.01), associate it with its set of related objectids (bb435.1.comdes.02, but614r.1.penc.01, ...), and write the association to a file, like this:
butwba10.1.wc.01 related="bb435.1.comdes.02, but614r.1.penc.01, ..."
I'm using bash. I can do something like
xpath BUTWBA10.1.xml //bad/objdesc/desc//related > test
but then I don't have the desc id associated with each group of related elements.
----UPDATE----
@sputnick, I have this in a bash file:
`
#!/bin/bash
for f in *.xml
do
id=$(xml sel -t -v '//bad/objdesc/desc/@id' $f)
arr=( $(xml sel -t -v '//bad/objdesc/desc/related/@objectid' $f) )
cat<<EOF >> output.txt
$id related="$(printf '%s\n' "${arr[@]}" | paste -sd ', ')"
EOF
done
`
But the output shows:
butwba10.1.wc.01
butwba10.1.wc.02
butwba10.1.wc.03
butwba10.1.wc.04
butwba10.1.wc.05
butwba10.1.wc.06
butwba10.1.wc.07
butwba10.1.wc.08
butwba10.1.wc.09
butwba10.1.wc.10
butwba10.1.wc.11
butwba10.1.wc.12
butwba10.1.wc.13
butwba10.1.wc.14
butwba10.1.wc.15
butwba10.1.wc.16
butwba10.1.wc.17
butwba10.1.wc.18
butwba10.1.wc.19
butwba10.1.wc.20 related=""
------UPDATE--------
@sputnick, this is the output I get. All of the related elements are lumped together with the last desc id of the XML file:
bb421.1.spb.01
bb421.1.spb.02
bb421.1.spb.03
bb421.1.spb.04
bb421.1.spb.05
bb421.1.spb.06
bb421.1.spb.07
bb421.1.spb.08
bb421.1.spb.09
bb421.1.spb.10
bb421.1.spb.11
bb421.1.spb.12
bb421.1.spb.13
bb421.1.spb.14
bb421.1.spb.15
bb421.1.spb.16
bb421.1.spb.17
bb421.1.spb.18
bb421.1.spb.19
bb421.1.spb.20
bb421.1.spb.21
bb421.1.spb.22
bb421.1.spb.23 related="but550.1.wc.01,but551.1.wc.01,but557.1.penc.05,but557.1.penc.06,but557.1.penc.07,but557.1.penc.30,but557.1.penc.31,but550.1.wc.02,but551.1.wc.02,but557.1.penc.08,but550.1.wc.03,but551.1.wc.03,but557.1.penc.09,but550.1.wc.04,but551.1.wc.04,but557.1.penc.10,but550.1.wc.05,but551.1.wc.05,but557.1.penc.11,but550.1.wc.06,but551.1.wc.06,but557.1.penc.04,but557.1.penc.12,but550.1.wc.07,but551.1.wc.07,but557.1.penc.13,but550.1.wc.08,but551.1.wc.08,but557.1.penc.04,but557.1.penc.14,but550.1.wc.09,but551.1.wc.09,but557.1.penc.15,but550.1.wc.10,but551.1.wc.10,but557.1.penc.16,but550.1.wc.11,but551.1.wc.11,but557.1.penc.17,but550.1.wc.12,but551.1.wc.12,but557.1.penc.18,but461.1.wc.01,but550.1.wc.13,but551.1.wc.13,but557.1.penc.19,but550.1.wc.14,but551.1.wc.14,but557.1.penc.20,but550.1.wc.15,but551.1.wc.15,but557.1.penc.21,but550.1.wc.16,but551.1.wc.16,but557.1.penc.22,but550.1.wc.17,but551.1.wc.17,but557.1.penc.23,but550.1.wc.18,but551.1.wc.18,but557.1.penc.02,but557.1.penc.24,but550.1.wc.19,but551.1.wc.19,but557.1.penc.26,but557.1.penc.28,but550.1.wc.20,but551.1.wc.20,but557.1.penc.25,but557.1.penc.29,but394.1.pt.01,but551.1.wc.21,but550.1.wc.21,but557.1.penc.27,but557.1.penc.30,but557.1.penc.31"
A complete implementation for the sample output:
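#!/bin/bash
# Process every XML file given on the command line.
for x; do
    # id attribute of the root <desc> element
    id=$(xmlstarlet sel -t -v '/desc[1]/@id' "$x")
    # objectid of each <related> child (the sample ids contain no whitespace, so word splitting is safe here)
    arr=( $(xmlstarlet sel -t -v '/desc[1]/related/@objectid' "$x") )
    # Append one line to new_file; the here-document body and its EOF terminator must start at column 0.
    cat<<EOF >> new_file
$id related="$(perl -e 'print join ",", @ARGV' "${arr[@]}")"
EOF
done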
Usage:
chmod +x script.sh
./script.sh file1.xml file2.xml file3.xml ...
cat new_file