RDFlib query not working
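I wrote a Python script that is supposed to run through a list of DBpedia URIs and run a query on each of them. However, for some reason I get an error message on
qres = g.query(query)
when I run this code. Does anyone know why this happens and how I can fix it? I'm really stuck and already behind on my thesis schedule, so the pressure is really mounting.
Code: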
import rdflib
import csv
import pandas as pd

colnames = ['Link']
list2 = pd.read_csv('C:/Users/Frank/Google Drive/Master Scriptie/testtest3.csv', sep=',', header=None, usecols=[2], names=colnames)
saved_column = list2.Link
outputfile = open('C:/Users/Frank/Google Drive/Master Scriptie/code files/dbpedia_output/test_dataset_uri_subject.csv', 'w')
reader = csv.reader(saved_column)
g = rdflib.Graph()
for uri in reader:
    uri2 = "".join(str(x) for x in uri)
    uri2 = uri2[1:].rstrip()
    print (uri2)
    result = g.parse("http://dbpedia.org" + uri2)
    print (result)
    query = "SELECT ?subject WHERE {<http://dbpedia.org" + uri2 + "> dbo:wikiPageRedirects*/dct:subject ?subject .}"
    print ("query: " + query)
    qres = g.query(query)
    for singlerow in qres:
        subject_final = "%s" % singlerow
        outputfile.write("{0}, {1} \n".format(uri,subject_final))
Error message in cmd:
/resource/Sheldon_J._Plankton
[a rdfg:Graph;rdflib:storage [a rdflib:Store;rdfs:label 'IOMemory']].
query: SELECT ?subject WHERE {<http://dbpedia.org/resource/Sheldon_J._Plankton>
dbo:wikiPageRedirects*/dct:subject ?subject .}
Traceback (most recent call last):
File "rdfimport.py", line 47, in <module>
qres = g.query(query)
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\graph.py", line 1089, in query
query_object, initBindings, initNs, **kwargs))
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\processor.py", line 75, in query
query = translateQuery(parsetree, base, initNs)
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 764, in translateQuery
q[1], visitPost=functools.partial(translatePName, prologue=prologue))
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 384, in traverse
r = _traverse(tree, visitPre, visitPost)
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 345, in _traverse
e[k] = _traverse(val, visitPre, visitPost)
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 345, in _traverse
e[k] = _traverse(val, visitPre, visitPost)
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 339, in _traverse
return [_traverse(x, visitPre, visitPost) for x in e]
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 339, in <listcomp>
return [_traverse(x, visitPre, visitPost) for x in e]
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 345, in _traverse
e[k] = _traverse(val, visitPre, visitPost)
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 339, in _traverse
return [_traverse(x, visitPre, visitPost) for x in e]
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 339, in <listcomp>
return [_traverse(x, visitPre, visitPost) for x in e]
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 339, in _traverse
return [_traverse(x, visitPre, visitPost) for x in e]
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 339, in <listcomp>
return [_traverse(x, visitPre, visitPost) for x in e]
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 345, in _traverse
e[k] = _traverse(val, visitPre, visitPost)
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 339, in _traverse
return [_traverse(x, visitPre, visitPost) for x in e]
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 339, in <listcomp>
return [_traverse(x, visitPre, visitPost) for x in e]
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 345, in _traverse
e[k] = _traverse(val, visitPre, visitPost)
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 339, in _traverse
return [_traverse(x, visitPre, visitPost) for x in e]
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 339, in <listcomp>
return [_traverse(x, visitPre, visitPost) for x in e]
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 345, in _traverse
e[k] = _traverse(val, visitPre, visitPost)
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 347, in _traverse
_e = visitPost(e)
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\algebra.py", line 142, in translatePName
return prologue.absolutize(p)
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\sparql.py", line 374, in absolutize
return self.resolvePName(iri.prefix, iri.localname)
File "C:\Users\Frank\AppData\Local\Programs\Python\Python36-32\lib\site-packag
es\rdflib\plugins\sparql\sparql.py", line 357, in resolvePName
raise Exception('Unknown namespace prefix : %s' % prefix)
Exception: Unknown namespace prefix : dct
Thanks in advance :)
Edit:
I think something is going wrong at
result = g.parse("http://dbpedia.org" + uri2)
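In this example, the URI it tries to parse there is "http://dbpedia.org/resource/Sheldon_J._Plankton".
If I put that URI directly into g.parse, it also raises an error. That is probably because the URI is "wrong", in that it redirects to "http://dbpedia.org/resource/Plankton_(character)".
I handle this in the query with dbo:wikiPageRedirects, but of course that only comes after this parse. So I think the problem lies there, but how can I use dbo:wikiPageRedirects to get the correct page if I cannot parse it first to fetch that page?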
The error message is complaining that the prefix dct is not recognized. RDFLib has dcterms built in (as DCTERMS in rdflib.namespace), or you can bind your own prefix:
from rdflib.namespace import DCTERMS, Namespace
g.bind("dct", DCTERMS)
g.bind("dbo", Namespace("http://dbpedia.org/ontology/"))
g.bind("dbr", Namespace("http://dbpedia.org/resource/"))
Assuming uri2 is a DBpedia resource and contains only the last part of the URI (i.e. "Sheldon_J._Plankton"), the SPARQL query to get the redirect target becomes:
q = "SELECT ?subject WHERE {{ dbr:{} dbo:wikiPageRedirects ?subject. }}".format
result = g.query(q(uri2))
for row in result:
    print(row.subject)
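To get the subjects of the redirect target, this query should work if that data is in your graph. But you may need to run g.parse on the URIs returned by the previous query to add them to your data:
q = "SELECT ?subject WHERE {{ dbr:{} dbo:wikiPageRedirects ?redirect. ?redirect dct:subject ?subject. }}".format
result = g.query(q(uri2))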