Get relYear or released in DBpedia singles data without duplicate rows
Here is the browse output of the following query, which tries to get the Billboard Hot 100 number-one singles:
PREFIX prop: <http://dbpedia.org/property/>
PREFIX ont: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?page, ?artist, ?relYear, ?released, ?runTime WHERE {
?page dct:subject dbc:Billboard_Hot_100_number-one_singles .
OPTIONAL {?page prop:artist ?artist}.
OPTIONAL {?page prop:relyear ?relYear}.
OPTIONAL {?page prop:released ?released}.
OPTIONAL {?page ont:runtime ?runTime}
}
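I included both relYear and released because some singles have one, some have the other, some have both, and some have neither.
If you look at the output, rows are duplicated: one row carries relYear and a second carries released. I would like something like SQL's COALESCE(released, relYear),
i.e. to put the first existing element of (released, relYear) into a single row.
How can I do that?
P.S. I have the same problem with artist and musicalArtist etc., so the number of rows ends up multiplying wildly.
P.P.S. I looked at this, but it did not help.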
Basically, you already know the answer: use COALESCE:
PREFIX prop: <http://dbpedia.org/property/>
PREFIX ont: <http://dbpedia.org/ontology/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX dbc: <http://dbpedia.org/resource/Category:>
SELECT DISTINCT ?page ?artist (coalesce(?relYear, ?released) as ?releaseYear) ?runTime WHERE {
?page dct:subject dbc:Billboard_Hot_100_number-one_singles .
OPTIONAL {?page prop:artist ?artist}.
OPTIONAL {?page prop:relyear ?relYear}.
OPTIONAL {?page prop:released ?released}.
OPTIONAL {?page ont:runtime ?runTime}
}
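However, as I guess you already know, the same single can exist in multiple versions, performed by different artists and released in different editions, so you may get multiple rows for the same single. As an extreme example, look at Total_Eclipse_of_the_Heart: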
+------------------------------+--------------------------------+-------------+---------+
| page | artist | releaseYear | runTime |
+------------------------------+--------------------------------+-------------+---------+
| :Total_Eclipse_of_the_Heart | "Bonnie Tyler"^^rdf:langString | 1983 | 180.0 |
| :Total_Eclipse_of_the_Heart | "Nicki French"^^rdf:langString | 1983 | 180.0 |
| :Total_Eclipse_of_the_Heart | "Bonnie Tyler"^^rdf:langString | 1995 | 180.0 |
| :Total_Eclipse_of_the_Heart | "Nicki French"^^rdf:langString | 1995 | 180.0 |
| :Total_Eclipse_of_the_Heart | "Bonnie Tyler"^^rdf:langString | 2012 | 180.0 |
| :Total_Eclipse_of_the_Heart | "Nicki French"^^rdf:langString | 2012 | 180.0 |
| :Total_Eclipse_of_the_Heart | "Bonnie Tyler"^^rdf:langString | 1983 | 230.0 |
| :Total_Eclipse_of_the_Heart | "Nicki French"^^rdf:langString | 1983 | 230.0 |
| :Total_Eclipse_of_the_Heart | "Bonnie Tyler"^^rdf:langString | 1995 | 230.0 |
| :Total_Eclipse_of_the_Heart | "Nicki French"^^rdf:langString | 1995 | 230.0 |
| :Total_Eclipse_of_the_Heart | "Bonnie Tyler"^^rdf:langString | 2012 | 230.0 |
| :Total_Eclipse_of_the_Heart | "Nicki French"^^rdf:langString | 2012 | 230.0 |
| ... | ... | ... | ... |
+------------------------------+--------------------------------+-------------+---------+
You could do something like GROUP BY here, to get at least only one row per artist, e.g. in combination with GROUP_CONCAT or SAMPLE.
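A minimal sketch of that grouping idea, assuming one row per (single, artist) pair is what you want. The variable names ?year, ?releaseYears and ?anyRunTime are made up for illustration, and the dct: and dbc: prefixes are declared explicitly only so the query does not depend on the endpoint's predefined namespaces. I have not checked the output against the live data.

```sparql
PREFIX prop: <http://dbpedia.org/property/>
PREFIX ont:  <http://dbpedia.org/ontology/>
PREFIX dct:  <http://purl.org/dc/terms/>
PREFIX dbc:  <http://dbpedia.org/resource/Category:>

# One row per (single, artist): all distinct years are concatenated,
# and one arbitrary runtime is picked from the group.
SELECT ?page ?artist
       (GROUP_CONCAT(DISTINCT STR(?year); separator=", ") AS ?releaseYears)
       (SAMPLE(?runTime) AS ?anyRunTime)
WHERE {
  ?page dct:subject dbc:Billboard_Hot_100_number-one_singles .
  OPTIONAL { ?page prop:artist ?artist }
  OPTIONAL { ?page prop:relyear ?relYear }
  OPTIONAL { ?page prop:released ?released }
  OPTIONAL { ?page ont:runtime ?runTime }
  # First bound value of the two; ?year stays unbound if neither exists.
  BIND(COALESCE(?relYear, ?released) AS ?year)
}
GROUP BY ?page ?artist
```

SAMPLE returns an unspecified member of the group, so use it only where any one value will do; GROUP_CONCAT keeps every value at the cost of turning them into a single string. For singles that have neither property, ?releaseYears will likely come back empty or unbound, depending on the engine.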