Validating that every subject has a type of class

I have the following data and shapes graph.

@prefix hr: <http://learningsparql.com/ns/humanResources#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

hr:Employee a rdfs:Class .
hr:BadThree rdfs:comment "some comment about missing" .
hr:BadTwo a hr:BadOne .
hr:YetAnother a hr:Another .
hr:YetAnotherName a hr:AnotherName .
hr:Another a hr:Employee .
hr:AnotherName a hr:name .
hr:BadOne a hr:Dangling .
hr:name a rdf:Property .

schema:SchemaShape
    a sh:NodeShape ;
    sh:target [
        a sh:SPARQLTarget ;
        sh:prefixes hr: ;
        sh:select """
            SELECT ?this
            WHERE {
                ?this ?p ?o .
            }
            """ ;
    ] ; 

    sh:property [                
        sh:path rdf:type ;
        sh:nodeKind sh:IRI ;
        sh:hasValue rdfs:Class
    ] ; 
.

Using pySHACL:

import rdflib

from pyshacl import validate

full_graph = open( "/Users/jamesh/jigsaw/shacl_work/data_graph.ttl", "r" ).read()

g = rdflib.Graph().parse( data = full_graph, format = 'turtle' )

report = validate( g, inference='rdfs', abort_on_error = False, meta_shacl = False, debug = False )
print( report[2] )

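What I think should happen is that the SPARQL-based target selects every subject in the data graph, and each one is then validated for an rdf:type path whose value is rdfs:Class.

I get the following result: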

Validation Report
Conforms: True

The expected validation errors should include only the following subjects:

| <http://learningsparql.com/ns/humanResources#BadOne>         |
| <http://learningsparql.com/ns/humanResources#BadTwo>         |
| <http://learningsparql.com/ns/humanResources#BadThree>       |
| <http://learningsparql.com/ns/humanResources#AnotherName>    |
| <http://learningsparql.com/ns/humanResources#name>           |
| <http://learningsparql.com/ns/humanResources#YetAnotherName> |
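
The expected list above can be cross-checked without SHACL. The following is a minimal sketch, assuming that "has a type of class" means reaching rdfs:Class through one or more rdf:type hops. The triples are transcribed by hand from the data graph with prefixed names kept as plain strings, and the helper names `reaches_class` and `subjects_without_class` are hypothetical, chosen only for this illustration.

```python
# Triples transcribed by hand from the data graph above (an assumption of this
# sketch: prefixed names are kept as plain strings instead of being parsed).
TRIPLES = [
    ("hr:Employee", "rdf:type", "rdfs:Class"),
    ("hr:BadThree", "rdfs:comment", "some comment about missing"),
    ("hr:BadTwo", "rdf:type", "hr:BadOne"),
    ("hr:YetAnother", "rdf:type", "hr:Another"),
    ("hr:YetAnotherName", "rdf:type", "hr:AnotherName"),
    ("hr:Another", "rdf:type", "hr:Employee"),
    ("hr:AnotherName", "rdf:type", "hr:name"),
    ("hr:BadOne", "rdf:type", "hr:Dangling"),
    ("hr:name", "rdf:type", "rdf:Property"),
]


def reaches_class(node, triples):
    """Return True if `node` reaches rdfs:Class through one or more rdf:type hops."""
    seen = set()
    frontier = [o for s, p, o in triples if s == node and p == "rdf:type"]
    while frontier:
        current = frontier.pop()
        if current == "rdfs:Class":
            return True
        if current in seen:
            continue  # guard against rdf:type cycles
        seen.add(current)
        frontier.extend(o for s, p, o in triples if s == current and p == "rdf:type")
    return False


def subjects_without_class(triples):
    """List every subject whose rdf:type chain never reaches rdfs:Class."""
    subjects = {s for s, _, _ in triples}
    return sorted(s for s in subjects if not reaches_class(s, triples))


if __name__ == "__main__":
    for subject in subjects_without_class(TRIPLES):
        print(subject)
```

Under that reading, hr:Another and hr:YetAnother pass because their type chains end at hr:Employee, which is itself an rdfs:Class, while the six subjects in the table never reach rdfs:Class.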

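Is this possible with SHACL? If so, what should the shapes file look like?

What follows produces the expected validation errors; however, there are still several things I do not understand.

  1. The sh:prefixes hr: ; line is not needed. It is designed to supply prefixes to the SPARQL target's SELECT statement itself and nothing more.

  2. Inference needs to be disabled. It was inserting triples and then trying to validate them. In this use case, that is not what is wanted. What should be validated is what is in the schema and nothing else.

  3. I had also assumed that keeping everything in a single graph would not be a problem, apparently based on a misunderstanding of https://github.com/RDFLib/pySHACL/issues/46.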

import rdflib
from pyshacl import validate

graph_data = """
@prefix hr: <http://learningsparql.com/ns/humanResources#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

hr:Employee a rdfs:Class .
hr:BadThree rdfs:comment "some comment about missing" .
hr:BadTwo a hr:BadOne .
hr:YetAnother a hr:Another .
hr:YetAnotherName a hr:AnotherName .
hr:Another a hr:Employee .
hr:AnotherName a hr:name .
hr:BadOne a hr:Dangling .
hr:name a rdf:Property .
"""

shape_data = '''
@prefix hr: <http://learningsparql.com/ns/humanResources#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

schema:SchemaShape
    a sh:NodeShape ;
    sh:target [
        a sh:SPARQLTarget ;
        sh:prefixes hr: ;
        sh:select """
            SELECT ?this
            WHERE {
                ?this ?p ?o .
            }
            """ ;
    ] ; 

    sh:property [                
        sh:path ( rdf:type [ sh:zeroOrMorePath rdf:type ] ) ;
        sh:nodeKind sh:IRI ;
        sh:hasValue rdfs:Class
    ] ; 
.
'''

data  = rdflib.Graph().parse( data = graph_data, format = 'turtle' )
shape = rdflib.Graph().parse( data = shape_data, format = 'turtle' )

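# advanced=True turns on the SHACL Advanced Features, which sh:SPARQLTarget belongs to.
# The inference argument is omitted, so only the asserted triples are validated.
report = validate( data, shacl_graph=shape, abort_on_error = False, meta_shacl = False, debug = False, advanced = True )
print( report[2] )

An alternative that uses a SPARQL-based constraint looks like this: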

graph_data = """
@prefix hr: <http://learningsparql.com/ns/humanResources#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

hr:Employee a rdfs:Class .
hr:BadThree rdfs:comment "some comment about missing" .
hr:BadTwo a hr:BadOne .
hr:YetAnother a hr:Another .
hr:YetAnotherName a hr:AnotherName .
hr:Another a hr:Employee .
hr:AnotherName a hr:name .
hr:BadOne a hr:Dangling .
hr:name a rdf:Property .
"""

shape_data = '''
@prefix hr: <http://learningsparql.com/ns/humanResources#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

schema:SchemaShape
    a sh:NodeShape ;
    sh:target [
        a sh:SPARQLTarget ;
        sh:select """
            SELECT ?this
            WHERE {
                ?this ?p ?o .
            }
            """ ;
    ] ; 

    sh:sparql [ 
        a sh:SPARQLConstraint ; 
        sh:message "Node does not have type rdfs:Class." ; 
        sh:prefixes hr: ; 
        sh:select """ 
            SELECT $this 
            WHERE { 
                $this rdf:type ?o . 

                FILTER NOT EXISTS {
                    ?o rdf:type* rdfs:Class
                }
                FILTER ( strstarts( str( $this ), str( hr: ) ) ) 
            }
            """ ;
    ]
.
'''


data  = rdflib.Graph().parse( data = graph_data, format = 'turtle' )
shape = rdflib.Graph().parse( data = shape_data, format = 'turtle' )

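# As before: separate data and shapes graphs, no inference, advanced features enabled.
report = validate( data, shacl_graph=shape, abort_on_error = False, meta_shacl = False, debug = False, advanced = True )
print( report[2] )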