Swap two blocks of text programmatically
I have an XML file made up of many very similar blocks. Here are two of them:
<Grid Name="EMFieldMany" GridType="Uniform">
<Topology TopologyType="3DRectMesh" Dimensions="40 40 40 "/>
<Attribute AttributeType="Scalar" Name="Er" Center="Node">
<DataItem Dimensions="40 40 40 " NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/field/Er-0
</DataItem>
</Attribute>
<Geometry GeometryType="VXVYVZ">
<DataItem Name="r" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/r
</DataItem>
<DataItem Name="theta" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/theta
</DataItem>
<DataItem Name="z" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/z
</DataItem>
</Geometry>
</Grid>
<Grid Name="EMFieldMany" GridType="Uniform">
<Topology TopologyType="3DRectMesh" Dimensions="40 40 40 "/>
<Attribute AttributeType="Scalar" Name="Er" Center="Node">
<DataItem Dimensions="40 40 40 " NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/field/Er-1
</DataItem>
</Attribute>
<Geometry GeometryType="VXVYVZ">
<DataItem Name="r" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/r
</DataItem>
<DataItem Name="theta" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/theta
</DataItem>
<DataItem Name="z" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/z
</DataItem>
</Geometry>
</Grid>
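I typically have hundreds of similar <Grid> objects in one file. Now I want to programmatically swap the positions of the <DataItem Name="r"> and <DataItem Name="z"> blocks within each <Grid> object, so that the <DataItem> order is z, theta, r. In addition, for every Dimensions attribute that holds three values, Dimensions=" x y z ", I want to rewrite the attribute as Dimensions=" z y x ".
I don't really mind which programming language is used for this. I'm on a Linux workstation with bash, python, perl... all the standard stuff.
Edit: this answer uses sed to match blocks of text, but I'm not sure how to manipulate the selected block afterwards. This other answer swaps single lines up and down, but I'm not sure how to generalize that to swapping whole blocks of text.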
If your XML is held in a string, you can do this with replace:
# Swapping (in '\"' the backslash escapes the " so it becomes a literal character in the string)
XML_str = XML_str.replace("\"r","SomethingThatNeverAppearInTheXML")
XML_str = XML_str.replace("\"z","\"r")
XML_str = XML_str.replace("SomethingThatNeverAppearInTheXML","\"z")
# Replacing (this matches the literal text "x y z"; substitute the actual values)
XML_str = XML_str.replace("x y z","z y x")
Input
XML_str = """
<Grid Name="EMFieldMany" GridType="Uniform">
<Topology TopologyType="3DRectMesh" Dimensions="40 40 40 "/>
<Attribute AttributeType="Scalar" Name="Er" Center="Node">
<DataItem Dimensions="40 40 40 " NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/field/Er-0
</DataItem>
</Attribute>
<Geometry GeometryType="VXVYVZ">
<DataItem Name="r" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/r
</DataItem>
<DataItem Name="theta" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/theta
</DataItem>
<DataItem Name="z" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/z
</DataItem>
</Geometry>
</Grid>
<Grid Name="EMFieldMany" GridType="Uniform">
<Topology TopologyType="3DRectMesh" Dimensions="40 40 40 "/>
<Attribute AttributeType="Scalar" Name="Er" Center="Node">
<DataItem Dimensions="40 40 40 " NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/field/Er-1
</DataItem>
</Attribute>
<Geometry GeometryType="VXVYVZ">
<DataItem Name="r" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/r
</DataItem>
<DataItem Name="theta" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/theta
</DataItem>
<DataItem Name="z" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/z
</DataItem>
</Geometry>
</Grid>
"""
Output
<Grid Name="EMFieldMany" GridType="Uniform">
<Topology TopologyType="3DRectMesh" Dimensions="40 40 40 "/>
<Attribute AttributeType="Scalar" Name="Er" Center="Node">
<DataItem Dimensions="40 40 40 " NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/field/Er-0
</DataItem>
</Attribute>
<Geometry GeometryType="VXVYVZ">
<DataItem Name="z" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/r
</DataItem>
<DataItem Name="theta" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/theta
</DataItem>
<DataItem Name="r" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/z
</DataItem>
</Geometry>
</Grid>
<Grid Name="EMFieldMany" GridType="Uniform">
<Topology TopologyType="3DRectMesh" Dimensions="40 40 40 "/>
<Attribute AttributeType="Scalar" Name="Er" Center="Node">
<DataItem Dimensions="40 40 40 " NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/field/Er-1
</DataItem>
</Attribute>
<Geometry GeometryType="VXVYVZ">
<DataItem Name="z" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/r
</DataItem>
<DataItem Name="theta" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/theta
</DataItem>
<DataItem Name="r" Dimensions="40" NumberType="Float" Precision="8" Format="HDF5" Endian="Big">
Field_reflected_time_3D.hdf5:/coordinates/z
</DataItem>
</Geometry>
</Grid>
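Note that in the output above only the Name attributes change places; the file paths inside each block stay where they were. A minimal sketch that moves the whole blocks instead, and reverses every three-valued Dimensions attribute, could use regular expressions. It assumes the three blocks always appear in the order r, theta, z with nothing but whitespace between them; the names `swap_blocks`, `SWAP_RE` and `DIMS_RE` are made up for this illustration.

```python
import re


def _block(name):
    # Pattern for one complete <DataItem Name="..."> ... </DataItem> block.
    return r'<DataItem Name="%s".*?</DataItem>' % name


# Group 1: the r block, group 2: the theta block with surrounding whitespace,
# group 3: the z block.
SWAP_RE = re.compile(
    r'(%s)(\s*%s\s*)(%s)' % (_block("r"), _block("theta"), _block("z")),
    re.DOTALL,
)

# Exactly three integers inside a Dimensions attribute; the separating
# whitespace is captured so it can be written back unchanged.
DIMS_RE = re.compile(r'(Dimensions="\s*)(\d+)(\s+)(\d+)(\s+)(\d+)(?=\s*")')


def swap_blocks(xml_text):
    # Reorder r, theta, z into z, theta, r.
    xml_text = SWAP_RE.sub(r'\3\2\1', xml_text)
    # Reverse the three values: x y z becomes z y x.
    return DIMS_RE.sub(r'\1\6\3\4\5\2', xml_text)
```

Because this is plain text matching, it is only as reliable as the regularity of the file; if the blocks can appear in another order or contain nested DataItem elements, a real XML parser is the safer choice.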
Since you mentioned sed, I suggest this perl solution, although it would be better to parse the XML with an XML parser.
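#!/usr/bin/perl
# change the input record separator so each read returns one <Grid> block
$/="</Grid>";
while ( $_=<> ) {
# capture the r, theta and z blocks and emit them in reverse order
s@(\s*<DataItem Name="r".*?</DataItem>)(\s*<DataItem Name="theta".*?</DataItem>)(\s*<DataItem Name="z".*?</DataItem>)@$3$2$1@s;
# reverse the three values of the DataItem Dimensions attribute
s@<DataItem Dimensions="\K(\d+) (\d+) (\d+) @$3 $2 $1 @;
print;
}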
Or the equivalent one-liner:
perl -pe 'BEGIN{$/="</Grid>"}s@(\s*<DataItem Name="r".*?</DataItem>)(\s*<DataItem Name="theta".*?</DataItem>)(\s*<DataItem Name="z".*?</DataItem>)@$3$2$1@s;s@<DataItem Dimensions="\K(\d+) (\d+) (\d+) @$3 $2 $1 @;' <input.txt
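For the XML-parser route mentioned above, a minimal sketch with Python's standard `xml.etree.ElementTree` might look like the following. It assumes the file is a sequence of <Grid> fragments without a single root element, so it wraps them in a hypothetical <Root> element before parsing; `reorder_grid_xml` is an illustrative name, not something from the question.

```python
import xml.etree.ElementTree as ET


def reorder_grid_xml(xml_text):
    # Wrap the fragments in a temporary <Root> element (an assumption made
    # here so that the text parses as a single document).
    root = ET.fromstring("<Root>" + xml_text + "</Root>")

    # Swap the r and z DataItem elements inside every <Geometry>.
    for geometry in root.iter("Geometry"):
        names = [child.get("Name") for child in geometry]
        if "r" in names and "z" in names:
            i, j = names.index("r"), names.index("z")
            geometry[i], geometry[j] = geometry[j], geometry[i]

    # Reverse every Dimensions attribute that holds exactly three values.
    # Whitespace inside the attribute is normalized to single spaces.
    for element in root.iter():
        dims = element.get("Dimensions")
        if dims is not None and len(dims.split()) == 3:
            element.set("Dimensions", " ".join(reversed(dims.split())))

    # Serialize the children only, dropping the temporary wrapper.
    return "".join(ET.tostring(child, encoding="unicode") for child in root)
```

Unlike the regex approach, this does not depend on the blocks being in a fixed order or on the exact spacing of the attributes, at the cost of the serializer possibly reformatting details such as the spacing before `/>`.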
This can be done easily with ed:
g/<DataItem Name="r"/-ka\
/<DataItem Name="z"/\
-kb\
.,/\/DataItem/m'a\
+1,/\/DataItem/m'b