Parsing HTML+RDFa in RDFLib
RDFLib appears to support parsing RDFa data. After writing a snippet to parse an RDFa-annotated HTML page, I ran into this error:
Traceback (most recent call last):
  File "/home/zonk/.local/lib/python3.8/site-packages/rdflib/plugin.py", line 107, in get
    p = _plugins[(name, kind)]
KeyError: ('html', <class 'rdflib.parser.Parser'>)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "basic-rdfa.py", line 7, in <module>
    g.parse("beatles.rdfa.html", format='html')
  File "/home/zonk/.local/lib/python3.8/site-packages/rdflib/graph.py", line 1209, in parse
    parser = plugin.get(format, Parser)()
  File "/home/zonk/.local/lib/python3.8/site-packages/rdflib/plugin.py", line 109, in get
    raise PluginException("No plugin registered for (%s, %s)" % (name, kind))
rdflib.plugin.PluginException: No plugin registered for (html, <class 'rdflib.parser.Parser'>)
I used the following snippet:
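from rdflib import Graph

g = Graph()
# This is the call that raises PluginException: no parser is registered as 'html'
g.parse("beatles.rdfa.html", format='html')

# Sanity check: every triple yielded by iteration must also be a member of the graph
for subj, pred, obj in g:
    if (subj, pred, obj) not in g:
        raise Exception("Zonk!")

print(f"Graph g has {len(g)} statements.")
print(g.serialize(format="turtle"))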
with the following dummy data:
<!DOCTYPE html>
<html lang="en">
<head>
  <title>John Lennon</title>
</head>
<div vocab="http://schema.org/">
  <div typeof="Person">
    <link property="rdfa:copy" href="#lennon"/>
    <link property="rdfa:copy" href="#band"/>
  </div>
  <p resource="#lennon" typeof="rdfa:Pattern">
    Name: <span property="name">John Lennon</span>
  </p>
  <div resource="#band" typeof="rdfa:Pattern">
    <div property="band" typeof="MusicGroup">
      <link property="rdfa:copy" href="#beatles"/>
    </div>
  </div>
  <div resource="#beatles" typeof="rdfa:Pattern">
    <p>Band: <span property="name">The Beatles</span></p>
    <p>Size: <span property="size">4</span> players</p>
  </div>
</div>
</html>
Indeed, plugin.py contains no line that registers a parser for HTML data. In that case, how can I parse RDFa-annotated data?
Thanks in advance.
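Sorry, the documentation is out of date: RDFa parsing is no longer supported in the current RDFLib release (6.0.0) or in the previous one (5.0.0).
To get RDFa support, you must use RDFLib 4.2.2 (https://github.com/RDFLib/rdflib/tree/4.2.2).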