How to load structured XML data into R?
I have a non-flat XML database that I want to convert to an R data frame. I parse it as follows:
page <- xmlParse("<dataset>
<language>
<name>Old_Irish</name>
<definite>
<definite_source>Demonstrative</definite_source>
<definite_article>1</definite_article>
</definite>
<n_cases>5</n_cases>
</language>
<language>
<name>Irish</name>
<definite>
<definite_source>Demonstrative</definite_source>
<definite_article>1</definite_article>
</definite>
<n_cases>4</n_cases>
</language>
</dataset>")
Then I convert it to a data frame like this:
xmlToDataFrame(page, stringsAsFactors = FALSE) %>%
  mutate_all(~type.convert(., as.is = TRUE))
This is the result:
       name       definite n_cases
1 Old_Irish Demonstrative1       5
2     Irish Demonstrative1       4
But what I want is:
       name definite_source definite_article n_cases
1 Old_Irish   Demonstrative                1       5
2     Irish   Demonstrative                1       4
How can I create separate columns for the elements nested inside <definite>...</definite>?
Consider binding the data frames built from the two node levels, assuming each language node contains no more than one definite node:
df <- cbind.data.frame(
xmlToDataFrame(nodes=getNodeSet(page,"//language")),
xmlToDataFrame(nodes=getNodeSet(page,"//definite"))
)
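Note that the binding above keeps the concatenated definite column from the language level next to the two new columns. A minimal sketch of one way to get the exact layout asked for in the question is shown below. It assumes every language node has exactly one definite child, since cbind matches rows purely by position, and the helper names xml_text, lang_df, and def_df are made up for illustration.

```r
library(XML)

# Same data as in the question, passed explicitly as text
xml_text <- "<dataset>
  <language>
    <name>Old_Irish</name>
    <definite>
      <definite_source>Demonstrative</definite_source>
      <definite_article>1</definite_article>
    </definite>
    <n_cases>5</n_cases>
  </language>
  <language>
    <name>Irish</name>
    <definite>
      <definite_source>Demonstrative</definite_source>
      <definite_article>1</definite_article>
    </definite>
    <n_cases>4</n_cases>
  </language>
</dataset>"
page <- xmlParse(xml_text, asText = TRUE)

# One data frame per node level
lang_df <- xmlToDataFrame(nodes = getNodeSet(page, "//language"),
                          stringsAsFactors = FALSE)
def_df  <- xmlToDataFrame(nodes = getNodeSet(page, "//definite"),
                          stringsAsFactors = FALSE)

# Drop the concatenated "definite" column, then bind by row position
# (assumption: exactly one <definite> per <language>)
df <- cbind.data.frame(lang_df[setdiff(names(lang_df), "definite")], def_df)

# Reorder to the layout requested in the question and convert column types
df <- df[c("name", "definite_source", "definite_article", "n_cases")]
df[] <- lapply(df, type.convert, as.is = TRUE)
print(df)
```

If some language nodes might lack a definite child, the positional binding no longer lines up, and extracting the values per language node would be the safer route.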