Select DBpedia resources with at least N occurrences of a selected word in the abstract?

Question

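I have this query, which returns some DBpedia resources together with their abstracts. How can I filter the results so that I only get the resources whose abstract contains at least a given number of occurrences of a particular word?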

PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> 
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

select distinct ?resource ?url ?resume where {
   ?resource rdfs:label ?Nom.
   ?resource foaf:isPrimaryTopicOf ?url.
   ?resource dbo:abstract ?resume.
   FILTER langMatches( lang(?Nom), "EN" )
   FILTER langMatches( lang(?resume), "EN" )
   ?Nom <bif:contains> "apple".             
}

Here is my new query, which does not use BIND:

select (strlen(replace(replace(Lcase(?resume), 'Jobs', '_'),'[^_]', '')) as ?nbr) ?resource ?url
where {
   ?resource rdfs:label ?Nom.
   ?resource foaf:isPrimaryTopicOf ?url.
   ?resource dbo:abstract ?resume.
   FILTER langMatches( lang(?Nom), "EN" )
   FILTER langMatches( lang(?resume), "EN" )
   ?Nom <bif:contains> "Apple".
}
GROUP BY ?Nom
Having(?nbr >= 1)

Answer 1

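This won't be absolutely perfect, but it should work reasonably well for what you're trying to achieve. You can use replace to replace every instance of the word you're counting with some single character (e.g., '_'). Then you can use replace again to replace everything except that character with the empty string. You are left with a string like "______" whose length is the number of times the word appeared in the original string. For instance, here is a query that counts 'the' in the abstracts and keeps only those resources where 'the' appears at least five times.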

select ?x ?nThe {
  values ?x { dbr:Horse dbr:Cat dbr:Dog }
  ?x dbo:abstract ?abs 
  filter langMatches(lang(?abs),'en')
  bind(strlen(replace(replace(?abs, '\sthe\s', '_'),'[^_]', '')) as ?nThe)
  filter (?nThe >= 5)
}
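
To see the counting trick in isolation, here is a minimal sketch that runs it on a hard-coded string instead of a DBpedia abstract. The sample sentence and the variable names ?text and ?n are made up for illustration, and the pattern is the bare word without the surrounding whitespace check used above.

```sparql
SELECT ?text ?n WHERE {
  # Hypothetical sample text, not taken from DBpedia
  VALUES ?text { "the cat and the dog saw the bird" }
  # Step 1: every "the" becomes "_"          ->  "_ cat and _ dog saw _ bird"
  # Step 2: every character other than "_" is removed  ->  "___"
  # Step 3: the length of what is left is the number of matches
  BIND(strlen(replace(replace(?text, "the", "_"), "[^_]", "")) AS ?n)
}
```

This should bind ?n to 3. The count is only reliable when the text does not already contain the marker character, since an underscore in the original text would be counted as a match, and a bare pattern like "the" also matches inside longer words such as "other". Those are two of the reasons the approach is not absolutely perfect.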


Answer 2

Never mind, I found another form of the query that does what I need:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo:  <http://dbpedia.org/ontology/>
select distinct ?Nom ?resource ?url
where {
   ?resource rdfs:label ?Nom.
   ?resource foaf:isPrimaryTopicOf ?url.
   ?resource dbo:abstract ?resume.
   FILTER langMatches( lang(?Nom), "EN" )
   FILTER langMatches( lang(?resume), "EN" )
   ?Nom <bif:contains> "Apple".
   FILTER regex(?resume, "Jobs")
}
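
Note that regex only checks that "Jobs" appears at least once. If a minimum of N occurrences is needed and BIND is not an option, the counting expression from Answer 1 can probably go straight into a FILTER, with no GROUP BY or HAVING. A sketch under those assumptions, with the threshold 2 picked arbitrarily for illustration (<bif:contains> is specific to Virtuoso, which the public DBpedia endpoint runs on):

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo:  <http://dbpedia.org/ontology/>

SELECT DISTINCT ?Nom ?resource ?url
WHERE {
  ?resource rdfs:label ?Nom .
  ?resource foaf:isPrimaryTopicOf ?url .
  ?resource dbo:abstract ?resume .
  FILTER langMatches(lang(?Nom), "EN")
  FILTER langMatches(lang(?resume), "EN")
  ?Nom <bif:contains> "Apple" .
  # Mark each "Jobs" with "_", drop every other character, and compare
  # the remaining length with the assumed threshold of 2
  FILTER(strlen(replace(replace(?resume, "Jobs", "_"), "[^_]", "")) >= 2)
}
```

The match is case-sensitive, so "jobs" in lower case would not be counted.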

Thanks to everyone who helped me.

rdf

sparql

dbpedia