R segfaulting when reading XML

I have the following XML file:

<conf>
<Constraints>
<BETA>0</BETA>
</Constraints>
</conf>

When I try to load this XML file, R segfaults:

R> library(XML)
R> xmlParse('test.xml')

 *** caught segfault ***
address 0x3a00000000, cause 'memory not mapped'

Traceback:
 1: .Call("RS_XML_ParseTree", as.character(file), handlers, as.logical(ignoreBlanks),     as.logical(replaceEntities), as.logical(asText), as.logical(trim),     as.logical(validate), as.logical(getDTD), as.logical(isURL),     as.logical(addAttributeNamespaces), as.logical(useInternalNodes),     as.logical(isHTML), as.logical(isSchema), as.logical(fullNamespaceInfo),     as.character(encoding), as.logical(useDotNames), xinclude,     error, addFinalizer, as.integer(options), as.logical(parentFirst),     PACKAGE = "XML")
 2: xmlParse("test.xml")

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 

Is there anything I need to check before loading an XML file in R?

The syntax of the XML file appears to be correct (according to an online XML validator).

I created a new file and pasted those few lines into it, but it still crashes, so the file format does not seem to be the problem...

R> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS release 6.6 (Final)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] XML_3.98-1.4

I also tried xml2, with the same result:

R> library(xml2)
R> read_xml(
x=         encoding=  ...=       as_html=   options=   n=         verbose=   base_url=  
R> read_xml(x = 'test.xml')

 *** caught segfault ***
address 0x3a00000000, cause 'memory not mapped'

Traceback:
 1: .Call("xml2_doc_parse_file", PACKAGE = "xml2", path, encoding,     as_html, options)
 2: doc_parse_file(con, encoding = encoding, as_html = as_html, options = options)
 3: read_xml.character(x = "test.xml")
 4: read_xml(x = "test.xml")

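My guess is that the libxml2 library these packages use must be the problem here... though I don't know how to test this. For reference, this is how the package gets compiled against it: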
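One way to test libxml2 outside of R, as a rough sketch: run the same file through xmllint, the command-line parser that ships with libxml2. This assumes xmllint is on your PATH and is built against the same system libxml2 that the R packages link to, which may not hold on every setup. The temp file path is arbitrary.

```shell
# Recreate the sample file from the question (the path is arbitrary).
xml_file="${TMPDIR:-/tmp}/test_libxml2_check.xml"
cat > "$xml_file" <<'EOF'
<conf>
<Constraints>
<BETA>0</BETA>
</Constraints>
</conf>
EOF

# Parse it with libxml2 directly, bypassing R entirely.
if command -v xmllint >/dev/null 2>&1; then
    # xmllint reports the libxml2 version it uses on stderr.
    xmllint --version 2>&1 | head -n 1
    # --noout parses the file and prints nothing on success.
    if xmllint --noout "$xml_file"; then
        result="parsed"
    else
        result="failed"
    fi
else
    result="xmllint-missing"
fi
echo "libxml2 check: $result"
```

If this prints "parsed" while R still crashes on the same file, the system library handles the file on its own and the problem is more likely in how the R package was built or linked against it.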

g++ -m64 -I/usr/include/R -DNDEBUG -I/usr/include/libxml2 -I/usr/local/include -I"/pxfs1/home/user/R/x86_64-redhat-linux-gnu-library/3.3/Rcpp/include" -I"/pxfs1/home/user/R/x86_64-redhat-linux-gnu-library/3.3/BH/include"   -fpic  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic  -c RcppExports.cpp -o RcppExports.o

My libxml2 library was out of date:

$ rpm -qa |grep -i libxml2
libxml2-2.7.6-17.el6_6.1.i686
libxml2-python-2.7.6-17.el6_6.1.x86_64
libxml2-devel-2.7.6-17.el6_6.1.i686
libxml2-2.7.6-17.el6_6.1.x86_64
libxml2-devel-2.7.6-17.el6_6.1.x86_64

Updating libxml2 fixed it:

libxml2-python-2.7.6-21.el6_8.1.x86_64        Fri 04 Nov 2016 10:10:17 AM EDT
libxml2-devel-2.7.6-21.el6_8.1.x86_64         Fri 04 Nov 2016 10:10:17 AM EDT
libxml2-devel-2.7.6-21.el6_8.1.i686           Fri 04 Nov 2016 10:10:16 AM EDT
libxml2-2.7.6-21.el6_8.1.x86_64               Fri 04 Nov 2016 10:10:16 AM EDT
libxml2-2.7.6-21.el6_8.1.i686                 Fri 04 Nov 2016 10:10:16 AM EDT
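To see which libxml2 the compiled R package actually resolves to at load time, something like the following may help. It is only a sketch: it assumes Rscript and ldd are available, that the XML package is installed, and that its shared object sits at libs/XML.so inside the package directory (the usual layout on Linux). The helper name linked_libxml2 is made up for this example.

```shell
# Print the libxml2 shared library that a given shared object resolves to.
# (linked_libxml2 is a hypothetical helper, not part of any package.)
linked_libxml2() {
    ldd "$1" 2>/dev/null | grep -i 'libxml2' || echo "no libxml2 dependency found for $1"
}

if command -v Rscript >/dev/null 2>&1; then
    # Ask R where the compiled part of the XML package lives;
    # system.file() returns an empty string if it cannot find it.
    so_path=$(Rscript -e 'cat(system.file("libs", "XML.so", package = "XML"))' 2>/dev/null)
    if [ -n "$so_path" ]; then
        linked_libxml2 "$so_path"
    else
        echo "XML package library not found"
    fi
else
    echo "Rscript not on PATH"
fi
```

The path printed next to libxml2.so.2 can then be matched against the package list with rpm -qf to confirm that it belongs to the updated x86_64 package rather than an older or 32-bit copy.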

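Apparently there is a bug in the R XML package. We found that the package introduced a new way of managing node references, and the bug shows up on Linux and OS X but not on Windows.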

The workaround is to install the package with node GC disabled:

R -e 'install.packages("XML", configure.args=c("--enable-nodegc=no"))'