How to read XML files with missing initial tags in R
I have several XML files that are missing their initial tags. For example, this is a correctly formatted file:
<?xml version="1.0"?>
<UDI>
<Test_Equipment_Number>3300061-01</Test_Equipment_Number>
<Test_SW_Number>3300062</Test_SW_Number>
<Test_SW_Version>2.1</Test_SW_Version>
<GTIN>(01)00884838088597</GTIN>
<LOT></LOT>
<Date_of_Mfg>(11)20190322</Date_of_Mfg>
<Device_SN>(21)1160001242</Device_SN>
<Material_Number>(96)300001287651</Material_Number>
<PCBA_WO_and_SN>00190311-0001242</PCBA_WO_and_SN>
<FW_Version>06</FW_Version>
<Model>324PHB</Model>
</UDI>
And this is a file that is missing the initial tags:
<Test_Equipment_Number>3300011-01</Test_Equipment_Number>
<Test_SW_Number>3300012</Test_SW_Number>
<Test_SW_Version>5.1</Test_SW_Version>
<GTIN>(01)00884838085497</GTIN>
<LOT></LOT>
<Date_of_Mfg>(11)20190411</Date_of_Mfg>
<Device_SN>(21)1120104548</Device_SN>
<Material_Number>(96)300000267981</Material_Number>
<PCBA_WO_and_SN>000143-00000793</PCBA_WO_and_SN>
<FW_Version>V01.0001</FW_Version>
<Model>7000PHW</Model>
How can I read the files that are missing the initial tags in R?
One option is to parse the text as an XML fragment and specify the top-level node to wrap it in:
# install.packages('XML')
library(XML)
fragment <-
'<Test_Equipment_Number>3300011-01</Test_Equipment_Number>
<Test_SW_Number>3300012</Test_SW_Number>
<Test_SW_Version>5.1</Test_SW_Version>
<GTIN>(01)00884838085497</GTIN>
<LOT></LOT>
<Date_of_Mfg>(11)20190411</Date_of_Mfg>
<Device_SN>(21)1120104548</Device_SN>
<Material_Number>(96)300000267981</Material_Number>
<PCBA_WO_and_SN>000143-00000793</PCBA_WO_and_SN>
<FW_Version>V01.0001</FW_Version>
<Model>7000PHW</Model>'
XML::parseXMLAndAdd(fragment, top = 'content')
#> <content>
#> <Test_Equipment_Number>3300011-01</Test_Equipment_Number>
#> <Test_SW_Number>3300012</Test_SW_Number>
#> <Test_SW_Version>5.1</Test_SW_Version>
#> <GTIN>(01)00884838085497</GTIN>
#> <LOT/>
#> <Date_of_Mfg>(11)20190411</Date_of_Mfg>
#> <Device_SN>(21)1120104548</Device_SN>
#> <Material_Number>(96)300000267981</Material_Number>
#> <PCBA_WO_and_SN>000143-00000793</PCBA_WO_and_SN>
#> <FW_Version>V01.0001</FW_Version>
#> <Model>7000PHW</Model>
#> </content>
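Since the question is about files rather than string literals, the same idea can be applied by reading each file as text and wrapping it in a root element before parsing. The following is only a minimal sketch: `read_udi` is a hypothetical helper name (it is not part of the XML package), the root name `UDI` is assumed from the well-formed example, and the temporary file stands in for one of the real files.

```r
library(XML)

# Hypothetical helper (not part of the XML package): read one file and
# wrap its contents in a root element when the file does not have one.
read_udi <- function(path, top = "UDI") {
  txt <- trimws(paste(readLines(path, warn = FALSE), collapse = "\n"))
  # Drop a leading XML declaration so the text can be wrapped safely.
  txt <- sub("^<\\?xml[^>]*\\?>\\s*", "", txt, perl = TRUE)
  # Add the root element only if the text does not already start with it.
  if (!startsWith(txt, paste0("<", top, ">"))) {
    txt <- paste0("<", top, ">", txt, "</", top, ">")
  }
  xmlParse(txt, asText = TRUE)
}

# Demo on a temporary file holding a shortened fragment from the question.
path <- tempfile(fileext = ".xml")
writeLines(c(
  "<Test_SW_Version>5.1</Test_SW_Version>",
  "<FW_Version>V01.0001</FW_Version>",
  "<Model>7000PHW</Model>"
), path)

doc <- read_udi(path)
# Pull a single field out of the parsed document with XPath.
model <- xpathSApply(doc, "/UDI/Model", xmlValue)
print(model)
```

Because the helper checks for the root element before wrapping, it should be usable on both kinds of file, for example with `lapply(list.files(pattern = "\\.xml$"), read_udi)`.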