Validate a Data Graph
Data graphs are validated by running a SPARQL query that names the data graphs to validate and the shapes graphs to validate them against. Depending on the type of query that you run, Graph Lakehouse returns tabular results or a validation graph that uses the SHACL Validation Report Vocabulary to report conformance and any constraint violations. This topic describes the validation query syntax and includes examples.
Validation Query Syntax
A validation query runs in one of two modes: query mode or report mode. Query mode returns tabular results; if it returns zero rows, the data graphs conform to the shapes graphs. Report mode inserts the results into a graph that you specify, and you query that graph to review the validation results. The syntax for each mode is described below.
Query Mode
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX ...
USING <shapes_graph_uri> [ <shapes_graph2_uri> ] [ ... ]
VALIDATE <data_graph_uri> [ <data_graph2_uri> ] [ ... ]
For example:
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX azg: <http://anzograph.com/>
USING azg:personShapes
VALIDATE azg:personData
Report Mode
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX ...
USING <shapes_graph_uri> [ <shapes_graph2_uri> ] [ ... ]
VALIDATE <data_graph_uri> [ <data_graph2_uri> ] [ ... ]
CREATE REPORT GRAPH <report_graph_uri>
For example:
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX azg: <http://anzograph.com/>
USING azg:personShapes
VALIDATE azg:personData
CREATE REPORT GRAPH azg:personReport
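The syntax above allows more than one shapes graph and more than one data graph in a single query. The following is a minimal sketch of that form in report mode. The graphs azg:addressShapes, azg:addressData, and azg:combinedReport are hypothetical names used only for illustration, and the sketch assumes the graph URIs are separated by spaces as the syntax lines suggest:

```sparql
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX azg: <http://anzograph.com/>
# Two shapes graphs (azg:addressShapes is a hypothetical name)
USING azg:personShapes azg:addressShapes
# Two data graphs (azg:addressData is a hypothetical name)
VALIDATE azg:personData azg:addressData
# Hypothetical report graph that receives the results
CREATE REPORT GRAPH azg:combinedReport
```

Omitting the CREATE REPORT GRAPH line would run the same validation in query mode.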
Validation Examples
Sample Data Graph
The examples below validate the data graph that is defined in the following INSERT query. The data is validated against the shapes graph example in Create a Shapes Graph.
# employee-data.rq
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX ex: <http://example.org/>

INSERT DATA {
  GRAPH <http://anzograph.com/employeeData> {
    ex:Employee a rdfs:Class .

    ex:emp001 a ex:Employee ;
      ex:hasID "000-12-3456" ;
      ex:hasTitle "President" ;
      ex:employeeType "Manager" ;
      ex:birthYear "1953"^^xsd:integer ;
      ex:hasSalary "100000"^^xsd:double .

    ex:emp002 a ex:Employee ;
      ex:hasID "000-56-3456" ;
      ex:hasTitle "Foreman" ;
      ex:employeeType "Worker" ;
      ex:birthYear "1966"^^xsd:integer ;
      ex:hasSupervisor ex:emp003 ;
      ex:hasWage "20.20"^^xsd:double .

    ex:emp003 a ex:Employee ;
      ex:hasID "000-77-3232" ;
      ex:hasTitle "Production Manager" ;
      ex:employeeType "Manager" ;
      ex:birthYear "1968"^^xsd:integer ;
      ex:hasSupervisor ex:emp001 ;
      ex:hasSalary "4000"^^xsd:double .

    ex:emp004 a ex:Employee ;
      ex:hasID "0" ;
      ex:hasTitle "Fitter" ;
      ex:employeeType "Worker" ;
      ex:birthYear "1979"^^xsd:integer ;
      ex:hasSupervisor ex:emp002 ;
      ex:hasWage "17.20"^^xsd:double .

    ex:emp005 a ex:Employee ;
      ex:hasID "000-99-3492" ;
      ex:hasTitle "Fitter" ;
      ex:employeeType "Worker" ;
      ex:hasSupervisor ex:emp002 ;
      ex:birthYear "2000"^^xsd:integer ;
      ex:hasWage "17.60"^^xsd:double .

    ex:emp006 a ex:Employee ;
      ex:hasID "000-78-5592" ;
      ex:hasTitle "Filer" ;
      ex:employeeType "Intern" ;
      ex:birthYear "2003"^^xsd:integer ;
      ex:hasSupervisor ex:emp002 ;
      ex:hasWage "14.20"^^xsd:double .

    ex:emp007 a ex:Employee ;
      ex:hasID "000-77-3232" ;
      ex:hasTitle "Sales Manager" ;
      ex:hasTitle "Vice President" ;
      ex:employeeType "Manager" ;
      ex:birthYear "1962"^^xsd:integer ;
      ex:hasSupervisor ex:emp001 ;
      ex:hasSalary "80000"^^xsd:double .

    ex:emp008 a ex:Employee ;
      ex:hasID "000-31-4868" ;
      ex:hasTitle "Fitter" ;
      ex:employeeType "Worker" ;
      ex:birthYear "2008"^^xsd:integer ;
      ex:hasSupervisor ex:emp002 ;
      ex:hasWage "15.00"^^xsd:double .

    ex:emp009 a ex:Employee ;
      ex:hasID "000-56-3336" ;
      ex:hasTitle "Fitter" ;
      ex:employeeType "Contractor" ;
      ex:birthYear "2001"^^xsd:integer ;
      ex:hasSupervisor ex:emp002 ;
      ex:hasWage "15.00"^^xsd:double .
  }
}
Query Mode Example
The following example performs validation on the sample data graph above. The validation is done in query mode, where results are returned in tabular format:
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX azg: <http://anzograph.com/>
USING azg:employeeShapes
VALIDATE azg:employeeData
The results show that there are 5 violations:
focusNode                 | resultPath                      | value  | constraint                                                 | violation                            | sourceShape    | message
--------------------------+---------------------------------+--------+------------------------------------------------------------+--------------------------------------+----------------+---------------------------------------------------------
http://example.org/emp003 | http://example.org/hasSalary    | 4000   | http://www.w3.org/ns/shacl#MinInclusiveConstraintComponent | http://www.w3.org/ns/shacl#Violation | _:b10737418398 | Salary must be 30,000 or higher
http://example.org/emp004 | http://example.org/hasID        | 0      | http://www.w3.org/ns/shacl#PatternConstraintComponent      | http://www.w3.org/ns/shacl#Violation | _:b15032385677 | Every employee must have an ID that matches the pattern
http://example.org/emp006 | http://example.org/employeeType | Intern | http://www.w3.org/ns/shacl#InConstraintComponent           | http://www.w3.org/ns/shacl#Violation | _:b6442451086  | Every employee is a manager, worker, or contractor
http://example.org/emp006 | http://example.org/hasWage      | 14.2   | http://www.w3.org/ns/shacl#MinInclusiveConstraintComponent | http://www.w3.org/ns/shacl#Violation | _:b10737418399 | Wage must be at least 15.00
http://example.org/emp008 | http://example.org/birthYear    | 2008   | http://www.w3.org/ns/shacl#MaxInclusiveConstraintComponent | http://www.w3.org/ns/shacl#Violation | _:b6442451087  | Birth year must be 2007 or earlier
5 rows
For each violation, the focusNode (subject), resultPath (predicate), value, constraint, violation, sourceShape, and message (if one exists for the shape) are shown. In the first row, employee 3 has a salary of $4,000, which violates the MinInclusiveConstraintComponent that says salaries must be at least $30,000. In the second row, employee 4 has an ID value that violates PatternConstraintComponent because it is too short. Rows 3 and 4 show that employee 6 has an invalid employee type and a wage that is too low. Row 5 shows that employee 8 does not meet the age requirement.
Report Mode Example
The following example performs validation on the sample data graph above. The validation is done in report mode, where the results are saved to a graph rather than returned in tabular format:
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX azg: <http://anzograph.com/>
USING azg:employeeShapes
VALIDATE azg:employeeData
CREATE REPORT GRAPH azg:employeeReport
When the query is complete, you can query the new graph to view the results. First, you can run a simple ASK query to see whether the data graph conforms to the shapes. For example, the query below asks whether the value of <http://www.w3.org/ns/shacl#conforms> is t (true). If the value is f (false), the ASK query returns false:
PREFIX sh: <http://www.w3.org/ns/shacl#>
ASK
FROM <http://anzograph.com/employeeReport>
{ ?s sh:conforms "t" . }
false
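Because the ASK query only reports whether a "t" value is present, it can also help to look at the stored value directly. The following is a minimal sketch that selects whatever value the report graph holds for sh:conforms. The variable names ?report and ?conforms are illustrative, and the sketch assumes the report graph from the example above exists:

```sparql
PREFIX sh: <http://www.w3.org/ns/shacl#>
# Return the subject that carries sh:conforms and the value stored for it
SELECT ?report ?conforms
FROM <http://anzograph.com/employeeReport>
WHERE {
  ?report sh:conforms ?conforms .
}
```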
If the data graph does not conform to the shapes graph, you can write additional queries to return information about the violations. For example:
PREFIX sh: <http://www.w3.org/ns/shacl#>
SELECT ?focusNode ?resultPath ?value ?constraint ?violation ?sourceShape ?message
FROM <http://anzograph.com/employeeReport>
WHERE {
  ?s sh:focusNode ?focusNode ;
     sh:resultPath ?resultPath ;
     sh:value ?value ;
     sh:sourceConstraintComponent ?constraint ;
     sh:resultSeverity ?violation ;
     sh:sourceShape ?sourceShape ;
     sh:resultMessage ?message .
}
ORDER BY ?focusNode
LIMIT 100
The results show that there are 5 violations:
focusNode                 | resultPath                      | value  | constraint                                                 | violation                            | sourceShape    | message
--------------------------+---------------------------------+--------+------------------------------------------------------------+--------------------------------------+----------------+---------------------------------------------------------
http://example.org/emp003 | http://example.org/hasSalary    | 4000   | http://www.w3.org/ns/shacl#MinInclusiveConstraintComponent | http://www.w3.org/ns/shacl#Violation | _:b10737418398 | Salary must be 30,000 or higher
http://example.org/emp004 | http://example.org/hasID        | 0      | http://www.w3.org/ns/shacl#PatternConstraintComponent      | http://www.w3.org/ns/shacl#Violation | _:b15032385677 | Every employee must have an ID that matches the pattern
http://example.org/emp006 | http://example.org/employeeType | Intern | http://www.w3.org/ns/shacl#InConstraintComponent           | http://www.w3.org/ns/shacl#Violation | _:b6442451086  | Every employee is a manager, worker, or contractor
http://example.org/emp006 | http://example.org/hasWage      | 14.2   | http://www.w3.org/ns/shacl#MinInclusiveConstraintComponent | http://www.w3.org/ns/shacl#Violation | _:b10737418399 | Wage must be at least 15.00
http://example.org/emp008 | http://example.org/birthYear    | 2008   | http://www.w3.org/ns/shacl#MaxInclusiveConstraintComponent | http://www.w3.org/ns/shacl#Violation | _:b6442451087  | Birth year must be 2007 or earlier
5 rows
For each violation, the focusNode (subject), resultPath (predicate), value, constraint, violation, sourceShape, and message (if one exists for the shape) are shown. In the first row, employee 3 has a salary of $4,000, which violates the MinInclusiveConstraintComponent that says salaries must be at least $30,000. In the second row, employee 4 has an ID value that violates PatternConstraintComponent because it is too short. Rows 3 and 4 show that employee 6 has an invalid employee type and a wage that is too low. Row 5 shows that employee 8 does not meet the age requirement.
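The SELECT query above requires every property in its pattern, so a result that lacks a message, a value, or a result path would not be returned. In the SHACL Validation Report Vocabulary those three properties are optional on a validation result. The following is a minimal sketch of a more tolerant query that wraps them in OPTIONAL. Whether Graph Lakehouse ever omits these properties from a report graph is an assumption here, and the variable name ?result is illustrative:

```sparql
PREFIX sh: <http://www.w3.org/ns/shacl#>
SELECT ?focusNode ?resultPath ?value ?constraint ?message
FROM <http://anzograph.com/employeeReport>
WHERE {
  # Properties that every validation result is expected to carry
  ?result sh:focusNode ?focusNode ;
          sh:sourceConstraintComponent ?constraint .
  # Properties that may be absent; OPTIONAL keeps the row either way
  OPTIONAL { ?result sh:resultPath ?resultPath . }
  OPTIONAL { ?result sh:value ?value . }
  OPTIONAL { ?result sh:resultMessage ?message . }
}
ORDER BY ?focusNode
```

For the sample data, where every violation has a path, a value, and a message, this sketch should list the same five violations as the query above.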