URLError with SPARQLWrapper at sparql.query().convert()
I tried to test my SPARQL queries with a small Python script, but even the following simple code does not work.
from SPARQLWrapper import SPARQLWrapper, JSON
import rdflib
# connect to the SPARQL endpoint
sparql = SPARQLWrapper("http://localhost:3030/sparql")
# SPARQL query
sparql.setQuery("""
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rme: <http://www.semanticweb.org/reminer/>
SELECT ?o
WHERE { ?s ?p ?o }
LIMIT 1
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
for result in results["results"]["bindings"]:
print(result["o"]["value"])
My code freezes at the convert step for a long time and then gives me a URLError.
When I stop the script, I see the following message:
HTTPError Traceback (most recent call last)
<ipython-input-6-2ab63307a418> in <module>()
18 """)
19 sparql.setReturnFormat(JSON)
---> 20 results = sparql.query().convert()
21
22 for result in results["results"]["bindings"]:
/Users/francocy/anaconda/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py in query(self)
533 @rtype: L{QueryResult} instance
534 """
--> 535 return QueryResult(self._query())
536
537 def queryAndConvert(self):
/Users/francocy/anaconda/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py in _query(self)
513 raise EndPointInternalError(e.read())
514 else:
--> 515 raise e
516
517 def query(self):
/Users/francocy/anaconda/lib/python3.4/site-packages/SPARQLWrapper/Wrapper.py in _query(self)
503
504 try:
--> 505 response = urlopener(request)
506 return response, self.returnFormat
507 except urllib.error.HTTPError as e:
/Users/francocy/anaconda/lib/python3.4/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
159 else:
160 opener = _opener
--> 161 return opener.open(url, data, timeout)
162
163 def install_opener(opener):
/Users/francocy/anaconda/lib/python3.4/urllib/request.py in open(self, fullurl, data, timeout)
467 for processor in self.process_response.get(protocol, []):
468 meth = getattr(processor, meth_name)
--> 469 response = meth(req, response)
470
471 return response
/Users/francocy/anaconda/lib/python3.4/urllib/request.py in http_response(self, request, response)
577 if not (200 <= code < 300):
578 response = self.parent.error(
--> 579 'http', request, response, code, msg, hdrs)
580
581 return response
/Users/francocy/anaconda/lib/python3.4/urllib/request.py in error(self, proto, *args)
505 if http_err:
506 args = (dict, 'default', 'http_error_default') + orig_args
--> 507 return self._call_chain(*args)
508
509 # XXX probably also want an abstract factory that knows when it makes
/Users/francocy/anaconda/lib/python3.4/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
439 for handler in handlers:
440 func = getattr(handler, meth_name)
--> 441 result = func(*args)
442 if result is not None:
443 return result
/Users/francocy/anaconda/lib/python3.4/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
585 class HTTPDefaultErrorHandler(BaseHandler):
586 def http_error_default(self, req, fp, code, msg, hdrs):
--> 587 raise HTTPError(req.full_url, code, msg, hdrs, fp)
588
589 class HTTPRedirectHandler(BaseHandler):
HTTPError: HTTP Error 403: Forbidden
I get the same behavior on Python 2.7 and 3.4.
Edit: I switched my connection from Wi-Fi to the intranet. My script works fine against the DBpedia SPARQL endpoint, but I get the HTTP error when querying my local server. It looks like a proxy problem, or a problem accessing my local server.
Thanks in advance for your help.
As the error tells you:
Operation timed out
It seems that when you ran the code, dbpedia.org was not reachable over your connection.
Running your code right now returns immediately for me with the following:
http://www.openlinksw.com/schemas/virtrdf#QuadMapFormat
So in production you will probably want to catch that URLError and handle it in some way.
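For example, a minimal sketch of such a guard (the safe_query helper and its messages are my own naming, not part of SPARQLWrapper; note that HTTPError subclasses URLError, so it must be caught first):

```python
from urllib.error import URLError, HTTPError

def safe_query(run):
    """Call a query function, returning None instead of raising on network errors."""
    try:
        return run()
    except HTTPError as e:      # HTTP-level failure (403, 500, ...)
        print("endpoint returned HTTP %s" % e.code)
        return None
    except URLError as e:       # endpoint unreachable, timeout, DNS failure
        print("could not reach endpoint: %s" % e.reason)
        return None

# usage, assuming `sparql` is configured as in the question:
# results = safe_query(lambda: sparql.query().convert())
```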
Update after the question was edited:
Currently SPARQLWrapper relies on urllib2 to perform its requests, so if you are behind a proxy you should be able to use urllib2's ProxyHandler, like so:
import urllib2

proxy = urllib2.ProxyHandler({'http': '192.168.x.x'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
# and then:
results = sparql.query().convert()
If you are trying to run SPARQL queries against your local Fuseki server from a Python script, you may be bitten by proxy problems. To work around this, you can rely on urllib's automatic proxy detection.
from SPARQLWrapper import SPARQLWrapper, JSON, XML
# In Python 3, urllib has been split into several modules; the opener lives in urllib.request.
import urllib.request
# With an empty dict, ProxyHandler lets urllib detect your proxy configuration itself.
proxy_support = urllib.request.ProxyHandler({})
opener = urllib.request.build_opener(proxy_support)
urllib.request.install_opener(opener)
# connect to the SPARQL endpoint
sparql = SPARQLWrapper("http://localhost:3030/yourOwnDb/sparql")
# SPARQL query
sparql.setQuery("""
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?o ?p
WHERE { ?s ?p ?o }
LIMIT 1
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
for result in results["results"]["bindings"]:
print(result["o"]["value"])
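If, as in the question, the query hangs for a long time before failing, you can also bound the wait with a socket-level default timeout, which urllib (and therefore SPARQLWrapper) honors. A small sketch; the 10-second value is an arbitrary choice:

```python
import socket

# Any urllib-based request (including SPARQLWrapper's) will now fail
# with a URLError after 10 seconds instead of blocking indefinitely.
socket.setdefaulttimeout(10)
```

Recent SPARQLWrapper releases also offer sparql.setTimeout(seconds) for a per-wrapper setting.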
For users arriving here in 2019 with errors against the Wikidata SPARQL endpoint: Wikidata enforces a strict User-Agent policy. See this archived Wikimedia thread (thanks Pere) Wikidata Project chat, which says that applications sending informative headers is indicative of well-behaved non-bot scripts; also see the user-agent policy.
According to the documentation, we can set the user agent through the agent instance variable. The User-Agent HTTP header is described in the MDN web docs.
Finally, you can initialize the class object as:
sparql = SPARQLWrapper("https://query.wikidata.org/sparql", agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11")
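The Wikimedia policy actually prefers an informative, tool-specific string over a browser spoof, so instead of the Chrome string above you might build something like the following (the tool name and contact details are placeholders to replace with your own):

```python
# Placeholder tool name and contact info -- the Wikimedia User-Agent policy
# asks for a version and a way to reach the operator.
USER_AGENT = "MySparqlScript/1.0 (https://example.org/contact; me@example.org)"

# then: sparql = SPARQLWrapper("https://query.wikidata.org/sparql", agent=USER_AGENT)
```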
Hope this helps!