xml-tei in R: length(nodes) with one different value

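Sorry, but documentation on parsing xml-tei in R is still scarce, especially for R beginners.

I am counting several sets of nodes with the function 'getNodeSet'; the XPath expressions differ by only one value inside 'contains'. The goal is to count the '@type=verb' words for each specific 'contains' value, all of which share 'contains(@ana,'#action')'. Example: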

# different value: "@ana, '#displacement'"
nodes <- getNodeSet(doc, "//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, '#displacement') and contains(@ana, '#ANT')]", ns)
VARIABLE_NAME01 <- length(nodes)
VARIABLE_NAME01
# result in the console: 2

# different value: "@ana, '#put_together'"
nodes <- getNodeSet(doc, "//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, '#put_together') and contains(@ana, '#ANT')]", ns)
VARIABLE_NAME02 <- length(nodes)
VARIABLE_NAME02
# result in the console: 0

# different value: "@ana, '#destruction'"
nodes <- getNodeSet(doc, "//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, '#destruction') and contains(@ana, '#ANT')]", ns)
VARIABLE_NAME03 <- length(nodes)
VARIABLE_NAME03
# result in the console: 7

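But writing essentially the same thing every time is tedious, and not very elegant either.

Is something like the following possible? (Sorry, this is not valid code, just an illustration of what I need.)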

#a condition
For node=getNodeSet(doc,"//ns:w[contains(@type,'verb') and contains(@ana,'#action') and not(contains(@ana, '#ANT'))]" 
#add in contains
(    
(@ana, 'DIFFERENT VALUE01') FOR VARIABLE01
(@ana, 'DIFFERENT VALUE02') FOR VARIABLE02
(@ana, 'DIFFERENT VALUE03') FOR VARIABLE03
)
#etc.

Do you have any ideas?

Afterwards, I need to be able to add up the results:

add_result <- sum(VARIABLE_NAME01, VARIABLE_NAME02, VARIABLE_NAME03)
add_result

But then I thought of something like this:

nodes=sum(
 (getNodeSet(doc,"//ns:w[contains(@ana,'#action') and contains(@type,'verb')]", ns)),
 (getNodeSet(doc,"//ns:w[contains(@type,'verb') and contains(@type,'verb')]", ns))
 )
add_result <- length(nodes) 
add_result

I would then do the same for other nodes with a different value. Unfortunately, it does not work.
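A likely reason the attempt above fails: 'getNodeSet' returns a list of nodes, while 'sum' expects numbers. Taking 'length' of each node set first and summing those counts avoids the problem. The following is only a minimal sketch; the small TEI fragment and the names 'n_displacement' and 'n_destruction' are invented for illustration and are not the original corpus.

```r
library(XML)

# Toy TEI fragment, invented for illustration (not the original data)
tei <- '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><p>
  <w type="verb" ana="#action #displacement #ANT">go</w>
  <w type="verb" ana="#action #displacement #ANT">run</w>
  <w type="verb" ana="#action #destruction #ANT">break</w>
  <w type="noun" ana="#action #destruction #ANT">ruin</w>
  <w type="verb" ana="#state #ANT">be</w>
</p></body></text></TEI>'
doc <- xmlParse(tei, asText = TRUE)
ns <- c(ns = "http://www.tei-c.org/ns/1.0")

# Count each node set first ...
n_displacement <- length(getNodeSet(doc, "//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana,'#displacement') and contains(@ana,'#ANT')]", ns))
n_destruction <- length(getNodeSet(doc, "//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana,'#destruction') and contains(@ana,'#ANT')]", ns))

# ... then add the counts, which are plain numbers
add_result <- sum(n_displacement, n_destruction)
add_result
```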

Thanks in advance for your advice.

What I have done so far:

nodes <- getNodeSet(doc, "//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, '#displacement') and contains(@ana, '#ANT')]", ns)
VARIABLE_NAME01 <- length(nodes)
nodes <- getNodeSet(doc, "//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, '#put_together') and contains(@ana, '#ANT')]", ns)
VARIABLE_NAME02 <- length(nodes)

I wonder whether there is a simpler way.

A colleague, an R expert, helped me and suggested:

typeAction <- c("'#displacement'", "'#put_together'", "'#agression'", "'#confrontation'", "'#movement'", "'#otherAction'")

# Total count over all action types
total_action_ANT <- 0
for (i in seq_along(typeAction)) {
  xpath <- paste0("//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, ", typeAction[i], ") and contains(@ana, '#ANT')]")
  total_action_ANT <- total_action_ANT + length(getNodeSet(doc, xpath, ns))
}
total_action_ANT

# One node set per action type, then a table of counts
nodelist <- list()
for (i in seq_along(typeAction)) {
  xpath <- paste0("//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, ", typeAction[i], ") and contains(@ana, '#ANT')]")
  nodelist[[i]] <- getNodeSet(doc, xpath, ns)
}
str(nodelist)
resultats <- cbind(action = typeAction, occurences = sapply(nodelist, length))
resultats

It works well! I hope this helps.
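For anyone who prefers to avoid explicit loops, the same idea can probably be written with a small helper and 'vapply'. This is only a sketch under assumptions: the helper 'count_action_verbs', the vector 'type_action' and the toy TEI fragment are hypothetical names and data made up for illustration, not part of my colleague's solution.

```r
library(XML)

# Toy TEI fragment, invented for illustration (not the original data)
tei <- '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><p>
  <w type="verb" ana="#action #displacement #ANT">go</w>
  <w type="verb" ana="#action #displacement #ANT">run</w>
  <w type="verb" ana="#action #destruction #ANT">break</w>
  <w type="noun" ana="#action #destruction #ANT">ruin</w>
  <w type="verb" ana="#state #ANT">be</w>
</p></body></text></TEI>'
doc <- xmlParse(tei, asText = TRUE)
ns <- c(ns = "http://www.tei-c.org/ns/1.0")

# Hypothetical helper: number of verbs tagged #action and #ANT
# whose @ana attribute also contains `value`
count_action_verbs <- function(doc, value, ns) {
  xpath <- sprintf(
    "//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana,'%s') and contains(@ana,'#ANT')]",
    value
  )
  length(getNodeSet(doc, xpath, ns))
}

# The only part that varies between queries
type_action <- c("#displacement", "#put_together", "#destruction")

# Named integer vector: one count per action type
counts <- vapply(type_action, function(v) count_action_verbs(doc, v, ns), integer(1))
counts

# Sum over all action types
total <- sum(counts)
total
```

Because 'vapply' keeps the values of 'type_action' as names, a single count can be read with 'counts["#destruction"]'. Note that XPath 'contains' matches substrings, so a value such as '#ANT' would also match a longer tag that merely starts with the same characters.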