Simplify SPARQL query

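I'm trying to make a fairly complex call to DBpedia using a SPARQL query. I'd like to get some information about a city (district, federal state/»Bundesland«, postal codes, coordinates, and geographically related cities).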


SELECT * WHERE {
  #input
  ?x rdfs:label "Bentzin"@de.

  #district
  OPTIONAL {
    ?x dbpedia-owl:district ?district
    # ?x dbpprop:landkreis ?district
    { SELECT * WHERE {
       ?district rdfs:label ?districtName
       FILTER(lang(?districtName) = "de")

       ?district dbpprop:capital ?districtCapital
       { SELECT * WHERE {
         ?districtCapital rdfs:label ?districtCapitalName
         FILTER(lang(?districtCapitalName) = "de")
       }}
    }}
  }

  #federal state
  OPTIONAL {
    # ?x dbpprop:bundesland ?land
    ?x dbpedia-owl:federalState ?land
    { SELECT * WHERE {
        ?land rdfs:label ?landName
        FILTER(lang(?landName) = "de")
    }}
  }

  #postal codes
  ?x dbpedia-owl:postalCode ?zip.

  #coordinates
  ?x geo:lat ?lat.
  ?x geo:long ?long

  #cities in the south
  OPTIONAL {
    ?x dbpprop:south ?south
    {SELECT * WHERE {
      ?south rdfs:label ?southName
      FILTER(lang(?southName) = "de")
    }}
  }

  #cities in the north
  OPTIONAL {
    ?x dbpprop:north ?north
    { SELECT * WHERE {
       ?north rdfs:label ?northName
       FILTER(lang(?northName) = "de")
    }}
  }

  #cities in the west
  ...

}

This works in some cases, but there are a few major issues.

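  1. There are several different properties that may contain the value for the federal state or district. Sometimes it's dbpprop:landkreis (the German word for district); in other cases it's dbpedia-owl:district. Is it possible to combine those two in cases where only one of them is set?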

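  2. Further, I'd like to read out the names of the cities to the north, northwest, …. Sometimes these cities are referenced in dbpprop:north etc. The basic query for each direction is the same: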

    OPTIONAL {
      ?x dbpprop:north ?north
      { SELECT * WHERE {
        ?north rdfs:label ?northName
        FILTER(lang(?northName) = "de")
      }}
    }
    

    I really don't want to repeat this eight times, once per direction. Is there any way to simplify it?

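  3. Sometimes multiple other cities are referenced (example). In those cases, multiple datasets are returned. Is it possible to get a list of the names of those cities in a single dataset instead?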

    +---+---+---------------------------------------------------------------+
    | x | … |                            southName                          |
    +---+---+---------------------------------------------------------------+
    | … | … | "Darmstadt"@de, "Stuttgart"@de, "Karlsruhe"@de, "Mannheim"@de |
    +---+---+---------------------------------------------------------------+
    

Thank you very much for your feedback and your ideas!

Till

There are several different properties that may contain the value for the federal state or district. Sometimes it's dbpprop:landkreis (the German word for district); in other cases it's dbpedia-owl:district. Is it possible to combine those two in cases where only one of them is set?

SPARQL property paths are a good fit for this. You can simply write:

?subject dbpprop:landkreis|dbpedia-owl:district ?district

If there are more properties, you might prefer a version with values:

values ?districtProperty { dbpprop:landkreis dbpedia-owl:district }
?subject ?districtProperty ?district

Further, I’d like to read out the names of the cities in the north, northwest, …. Sometimes, these cities are referenced in dbpprop:north etc. The basic query for each direction is the same:

OPTIONAL {
  ?x dbpprop:north ?north
  { SELECT * WHERE {
    ?north rdfs:label ?northName
    FILTER(lang(?northName) = "de")
  }}
}

Again, values can help here. Also, don't filter languages with lang(…) = …; use langMatches instead:

optional {
  values ?directionProp { dbpprop:north
                          #-- ...
                          dbpprop:south }
  ?subject ?directionProp ?direction 
  optional { 
    ?direction rdfs:label ?directionLabel
    filter langMatches(lang(?directionLabel),"de")
  }
}

Sometimes, there are multiple other cities referenced (example). In those cases, there are multiple datasets returned. Is there any possibility to get a list of the names of those cities in a single dataset instead?

+---+---+---------------------------------------------------------------+
| x | … |                            southName                          |
+---+---+---------------------------------------------------------------+
| … | … | "Darmstadt"@de, "Stuttgart"@de, "Karlsruhe"@de, "Mannheim"@de |
+---+---+---------------------------------------------------------------+

This is what group by and group_concat are for. See Aggregating results from SPARQL query. I didn't actually see results like these for the query you provided, so I don't have good data to test with.
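As a rough illustration of that aggregation, here is a minimal sketch. It assumes the same DBpedia properties used in the question and declares the prefixes explicitly in case the endpoint does not predefine them. The variable name ?southNames is only illustrative, and the query is untested against live data for the reason given above.

```sparql
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpprop: <http://dbpedia.org/property/>

# One row per city; all German labels of the places to the south
# are collapsed into a single comma-separated string.
select ?x (group_concat(distinct ?southName; separator=", ") as ?southNames)
where {
  ?x rdfs:label "Bentzin"@de .
  optional {
    ?x dbpprop:south/rdfs:label ?southName
    filter langMatches(lang(?southName), "de")
  }
}
group by ?x
```

Note that group_concat produces a plain string, so the @de language tags shown in the desired table would not appear in the concatenated value. Every non-aggregated variable in the select clause (here only ?x) has to be listed in group by.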

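You also seem to be using a lot of unnecessary subselects. You can simply add more triples to the graph pattern; you don't need nested queries to get additional information.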

With these considerations in mind, your query becomes:

select * where {
  ?x rdfs:label "Bentzin"@de ;
     dbpedia-owl:postalCode ?zip ;
     geo:lat ?lat ;
     geo:long ?long

  #-- district
  optional {
    ?x dbpedia-owl:district|dbpprop:landkreis ?district .
    ?district rdfs:label ?districtName
    filter langMatches(lang(?districtName),"de")
    optional {
      ?district dbpprop:capital ?districtCapital .
      ?districtCapital rdfs:label ?districtCapitalName
      filter langMatches(lang(?districtCapitalName),"de")
    }
  }

  #federal state
  optional  {
    ?x dbpprop:bundesland|dbpedia-owl:federalState ?land .
    ?land rdfs:label ?landName
    filter langMatches(lang(?landName),"de")
  }

  values ?directionProp { dbpprop:south dbpprop:north }
  optional {
    ?x ?directionProp ?directionPlace .
    ?directionPlace rdfs:label ?directionName 
    filter langMatches(lang(?directionName),"de")
  }
}


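Now, if you're only looking for the names of these things, without the associated URIs, you can use property paths to shorten much of the label retrieval. For example: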

select * where {
  ?x rdfs:label "Bentzin"@de ;
     dbpedia-owl:postalCode ?zip ;
     geo:lat ?lat ;
     geo:long ?long

  #-- district
  optional {
    ?x (dbpedia-owl:district|dbpprop:landkreis)/rdfs:label ?districtName
    filter langMatches(lang(?districtName),"de")
    optional {
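      #-- ?district is no longer bound once the path above replaces it, so start from ?x again
      ?x (dbpedia-owl:district|dbpprop:landkreis)/dbpprop:capital/rdfs:label ?districtCapitalName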
      filter langMatches(lang(?districtCapitalName),"de")
    }
  }

  #-- federal state
  optional  {
    ?x (dbpprop:bundesland|dbpedia-owl:federalState)/rdfs:label ?landName
    filter langMatches(lang(?landName),"de")
  }

  optional {
    values ?directionProp { dbpprop:south dbpprop:north }
    ?x ?directionProp ?directionPlace .
    ?directionPlace rdfs:label ?directionName
    filter langMatches(lang(?directionName),"de")
  }
}
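The values block and the aggregation can also be combined so that each direction yields one row with a list of names. The following is a sketch under the same assumptions as the queries above (DBpedia property names as used in the question, only two of the directions listed); ?directionNames is an illustrative variable name, and the query has not been run against live data.

```sparql
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpprop: <http://dbpedia.org/property/>

# One row per (city, direction property), with the German labels
# of all places in that direction joined into a single string.
select ?x ?directionProp
       (group_concat(distinct ?directionName; separator=", ") as ?directionNames)
where {
  ?x rdfs:label "Bentzin"@de .
  values ?directionProp { dbpprop:south dbpprop:north }
  ?x ?directionProp ?directionPlace .
  ?directionPlace rdfs:label ?directionName
  filter langMatches(lang(?directionName), "de")
}
group by ?x ?directionProp
```

Because the direction pattern is not wrapped in optional here, a city without any neighbouring places returns no rows at all; wrap the three direction lines in optional if an empty result row is preferred.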
