How to load structured XML data into R?

I have a non-flat (nested) XML database that I want to convert into an R data frame. I parse it as follows:

library(XML)

page <- xmlParse("<dataset>
  <language>
    <name>Old_Irish</name>
    <definite>
    <definite_source>Demonstrative</definite_source>
    <definite_article>1</definite_article>
    </definite>
    <n_cases>5</n_cases>
  </language>
  <language>
    <name>Irish</name>
    <definite>
    <definite_source>Demonstrative</definite_source>
    <definite_article>1</definite_article>
    </definite>
    <n_cases>4</n_cases>
  </language>
</dataset>")

Then I convert it to a data frame as follows:

library(dplyr)

xmlToDataFrame(page, stringsAsFactors = FALSE) %>%
  mutate_all(~ type.convert(., as.is = TRUE))

This is the result:

       name       definite n_cases
1 Old_Irish Demonstrative1       5
2     Irish Demonstrative1       4

But what I want is:

       name definite_source definite_article n_cases
1 Old_Irish   Demonstrative                1       5
2     Irish   Demonstrative                1       4

How can I create separate columns for the elements nested inside <definite>...</definite>?

Consider binding the nodes from the two levels column-wise, assuming each language node contains no more than one definite node:

# One data frame per nesting level, bound side by side (rows are paired by position)
df <- cbind.data.frame(
  xmlToDataFrame(nodes = getNodeSet(page, "//language")),
  xmlToDataFrame(nodes = getNodeSet(page, "//definite"))
)
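
Binding the two levels this way still leaves the fused definite column from the language level in the result, and the values come back as text. Here is a minimal sketch of one way to finish the job, assuming the sample document from the question and exactly one definite per language. The names lang_nodes and def_nodes, the explicit asText = TRUE, and the length check are additions for illustration, not part of the answer above.

```r
library(XML)

# Sample document from the question, repeated so the sketch runs on its own.
page <- xmlParse("<dataset>
  <language>
    <name>Old_Irish</name>
    <definite>
      <definite_source>Demonstrative</definite_source>
      <definite_article>1</definite_article>
    </definite>
    <n_cases>5</n_cases>
  </language>
  <language>
    <name>Irish</name>
    <definite>
      <definite_source>Demonstrative</definite_source>
      <definite_article>1</definite_article>
    </definite>
    <n_cases>4</n_cases>
  </language>
</dataset>", asText = TRUE)

# Hypothetical helper names for the two node sets.
lang_nodes <- getNodeSet(page, "//language")
def_nodes  <- getNodeSet(page, "//language/definite")

# cbind pairs rows by position, so the two node sets must line up one-to-one.
# This guards the "one definite per language" assumption.
stopifnot(length(lang_nodes) == length(def_nodes))

df <- cbind.data.frame(
  xmlToDataFrame(nodes = lang_nodes),
  xmlToDataFrame(nodes = def_nodes)
)

# Keep only the wanted columns, in the requested order; this drops the
# fused "definite" column produced at the language level.
df <- df[, c("name", "definite_source", "definite_article", "n_cases")]

# Convert text columns to their natural types; as.character() guards
# against columns that were returned as factors.
df[] <- lapply(df, function(col) type.convert(as.character(col), as.is = TRUE))

df
```

If some language had no definite element, the length check would stop the script rather than silently pairing rows with the wrong language. In that case, extracting the values per language node would be the safer route.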