Where to find a dataset with literal data already annotated with DBpedia property concepts (having their range in float or int)?

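I am working on a project that tries to map DBpedia concepts to columns of tabular data. Specifically, I want to map literals (numeric values: floats, integers, ...). I therefore need a sufficient amount of data to build a background knowledge base. I extracted some data from the T2D golden dataset in the format given at the end of this note. I am actually supposed to use that data as a test benchmark, and it contains fewer than 20 such columns across all of its tables. Can anyone help me find a dataset of literal values with DBpedia annotations?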

DBpedia ranges of the literal values:

"http://www.w3.org/2001/XMLSchema#float"
"http://www.w3.org/2001/XMLSchema#integer"
"http://www.w3.org/2001/XMLSchema#positiveInteger"
"http://www.w3.org/2001/XMLSchema#integer"

Some properties that have these ranges:

"http://dbpedia.org/ontology/speaker",
"http://dbpedia.org/ontology/ranking",
"http://dbpedia.org/ontology/humanDevelopmentIndex",
"http://dbpedia.org/ontology/numberOfPlatformLevels",
"http://dbpedia.org/ontology/enginePower",
"http://dbpedia.org/ontology/graySubject",
"http://dbpedia.org/ontology/shareOfAudience",
"http://dbpedia.org/ontology/percentageLiteracyWomen",.........

What I need to find, or somehow generate, is an array of values for each of the concepts given above. For example:

 "http://dbpedia.org/ontology/enginePower" : ["220", "125", "1300",....],
 "http://dbpedia.org/ontology/humanDevelopmentIndex" : ["0.34", "0.78", "0.98", ...]

I do not need that exact format. It would be great if I could find a sufficient number of data tables for DBpedia, like those in the T2D golden dataset.

This query starts you down the road: it gets you up to 100 distinct typed literal values for <http://dbpedia.org/ontology/populationTotal>, all typed as <http://www.w3.org/2001/XMLSchema#nonNegativeInteger>.

PREFIX  dbo:  <http://dbpedia.org/ontology/>

SELECT DISTINCT ?value
WHERE 
  { ?subject dbo:populationTotal ?value } 
LIMIT 100
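
If you fetch the results of a query like the one above in the SPARQL 1.1 JSON results format, each binding carries its lexical value and its datatype IRI. The following is only a minimal sketch of turning such a binding into a Python number. The helper name `to_number` and the choice of which XSD types count as integer or float are my assumptions, not anything DBpedia prescribes.

```python
# Sketch: convert one binding from SPARQL JSON results into int/float.
# The sets of datatypes below are an assumption; extend them as needed.
XSD = "http://www.w3.org/2001/XMLSchema#"
INTEGER_TYPES = {XSD + name for name in
                 ("integer", "positiveInteger", "nonNegativeInteger")}
FLOAT_TYPES = {XSD + name for name in ("float", "double", "decimal")}


def to_number(binding):
    """Return an int or float for a numeric typed literal, else None.

    `binding` is one value object from the JSON results, for example
    {"type": "literal", "value": "1300", "datatype": "...#integer"}.
    """
    datatype = binding.get("datatype")
    text = binding["value"]
    try:
        if datatype in INTEGER_TYPES:
            return int(text)
        if datatype in FLOAT_TYPES:
            # Note: xsd:decimal loses exactness when stored as a float.
            return float(text)
    except ValueError:
        # The lexical form did not match its declared datatype.
        return None
    return None
```

Returning `None` for untyped or malformed literals lets you filter out dirty values before they reach the knowledge base.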

This rather more complex (and expensive) query gets you something like the end result I think you want, but you will need to run it a number of times against the public endpoint, for a few predicates at a time, to get everything you're asking for. If needed, you could spin up your own DBpedia mirror instance in the AWS cloud and adjust Virtuoso's timeouts and other limits, so that you can build and run a single query that delivers one giant result set.

PREFIX  xsd:  <http://www.w3.org/2001/XMLSchema#>
PREFIX  dbo:  <http://dbpedia.org/ontology/>

SELECT ?predicate ?value_type ( GROUP_CONCAT ( DISTINCT ?value_str ; separator=", " ) AS ?values )
# Ungrouped alternative (also drop the GROUP BY line):
# SELECT DISTINCT ?predicate ?value ?value_type ?value_str
WHERE 
  { ?subject  ?predicate  ?value 
    VALUES ( ?predicate ) { ( dbo:numberOfPlatformLevels )
                            ( dbo:shareOfAudience )
                            ( dbo:populationTotal ) 
                          }
    BIND ( DATATYPE ( ?value ) AS ?value_type )
    BIND (      STR ( ?value ) AS ?value_str )
  } 
GROUP BY ?predicate ?value_type
ORDER BY ?predicate ?value_type
LIMIT 1000
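
Since the grouped query has to be run for a few predicates at a time, here is a rough sketch of how the batching and merging could be scripted. It only builds the query strings and merges results that are already in the SPARQL JSON results format; sending each query to an endpoint is left out. The function names, the batch size of 3, and the sample values are all made up for illustration, and full IRIs are used in the VALUES block so no prefixes are needed.

```python
# Sketch: batch predicates into queries, then merge the JSON results
# into {predicate IRI: [value strings]}. Names here are hypothetical.
QUERY_TEMPLATE = """\
SELECT ?predicate ?value_type ( GROUP_CONCAT ( DISTINCT ?value_str ; separator=", " ) AS ?values )
WHERE
  {{ ?subject  ?predicate  ?value
    VALUES ( ?predicate ) {{ {predicates} }}
    BIND ( DATATYPE ( ?value ) AS ?value_type )
    BIND (      STR ( ?value ) AS ?value_str )
  }}
GROUP BY ?predicate ?value_type
ORDER BY ?predicate ?value_type
LIMIT 1000
"""


def build_queries(predicate_iris, batch_size=3):
    """Yield one query string per batch of predicate IRIs."""
    for start in range(0, len(predicate_iris), batch_size):
        batch = predicate_iris[start:start + batch_size]
        values = " ".join(f"( <{iri}> )" for iri in batch)
        yield QUERY_TEMPLATE.format(predicates=values)


def collect_values(results, arrays=None):
    """Merge one JSON result document into the running dict of arrays."""
    arrays = {} if arrays is None else arrays
    for row in results["results"]["bindings"]:
        bucket = arrays.setdefault(row["predicate"]["value"], [])
        # ?values is the GROUP_CONCAT string, joined with ", " by the query.
        for value in row["values"]["value"].split(", "):
            if value and value not in bucket:
                bucket.append(value)
    return arrays


if __name__ == "__main__":
    # Hand-written sample result, NOT real DBpedia data.
    sample = {"results": {"bindings": [
        {"predicate": {"type": "uri",
                       "value": "http://dbpedia.org/ontology/enginePower"},
         "values": {"type": "literal", "value": "220, 125, 1300"}},
    ]}}
    print(collect_values(sample))
```

Because the query groups by both predicate and datatype, one predicate can come back in several rows; `collect_values` folds them into a single array, and passing the same `arrays` dict across calls accumulates the batches.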