Validate a Data Graph

To validate data graphs, run a SPARQL query that names the data graphs to validate and the shapes graphs to validate them against. Depending on the mode of the query, Graph Lakehouse either returns tabular results or writes a validation graph that uses the SHACL Validation Report Vocabulary to record conformance and any constraint violations. This topic describes the validation query syntax and includes examples.

Validation Query Syntax

You can run a validation query in one of two modes: query mode or report mode. Query mode returns tabular results, with one row per violation. If no rows are returned, the data graphs conform to the shapes graphs. Report mode inserts the results into a graph that you specify, and you then query that graph to review the validation results. The syntax for each mode is described below.

Query Mode

PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX ...
USING 
  <shapes_graph_uri>
  [ <shapes_graph2_uri> ]
  [ ... ]
VALIDATE
  <data_graph_uri>
  [ <data_graph2_uri> ]
  [ ... ]

For example:

PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX azg: <http://anzograph.com/>
USING azg:personShapes
VALIDATE azg:personData

Report Mode

PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX ...
USING 
  <shapes_graph_uri>
  [ <shapes_graph2_uri> ]
  [ ... ]
VALIDATE
  <data_graph_uri>
  [ <data_graph2_uri> ]
  [ ... ]
CREATE REPORT GRAPH <report_graph_uri>

For example:

PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX azg: <http://anzograph.com/>
USING azg:personShapes
VALIDATE azg:personData
CREATE REPORT GRAPH azg:personReport
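
If validation queries are generated by a script, the two modes differ only in the trailing CREATE REPORT GRAPH clause. The following Python sketch assembles either form from lists of graph URIs. The function name build_validation_query is hypothetical and not part of Graph Lakehouse. The sketch only builds the query text, writes full URIs in angle brackets instead of prefixed names, and assumes the URIs need no escaping.

```python
from typing import Optional, Sequence


def build_validation_query(
    shapes_graphs: Sequence[str],
    data_graphs: Sequence[str],
    report_graph: Optional[str] = None,
) -> str:
    """Assemble validation query text from full graph URIs (hypothetical helper).

    With report_graph=None the result is a query mode statement. Otherwise a
    CREATE REPORT GRAPH clause is appended, which makes it a report mode statement.
    """
    if not shapes_graphs or not data_graphs:
        raise ValueError("at least one shapes graph and one data graph are required")

    lines = ["PREFIX sh: <http://www.w3.org/ns/shacl#>", "USING"]
    lines += [f"  <{uri}>" for uri in shapes_graphs]
    lines.append("VALIDATE")
    lines += [f"  <{uri}>" for uri in data_graphs]
    if report_graph is not None:
        lines.append(f"CREATE REPORT GRAPH <{report_graph}>")
    return "\n".join(lines)


# Report mode form of the person example, written with full URIs.
query = build_validation_query(
    ["http://anzograph.com/personShapes"],
    ["http://anzograph.com/personData"],
    report_graph="http://anzograph.com/personReport",
)
print(query)
```

Omitting the report_graph argument produces the query mode form of the same statement.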

Validation Examples

Sample Data Graph

The examples below validate the data graph that is defined in the following INSERT query. The data is validated against the shapes graph example in Create a Shapes Graph.

# employee-data.rq
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX sh:   <http://www.w3.org/ns/shacl#>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>
PREFIX ex:   <http://example.org/>

INSERT DATA { 
  GRAPH <http://anzograph.com/employeeData>
  {
    ex:Employee
      a rdfs:Class .
    ex:emp001
      a ex:Employee ;
      ex:hasID "000-12-3456" ;
      ex:hasTitle "President" ;
      ex:employeeType "Manager" ;
      ex:birthYear "1953"^^xsd:integer ;
      ex:hasSalary "100000"^^xsd:double .
    ex:emp002
      a ex:Employee ;
      ex:hasID "000-56-3456" ;
      ex:hasTitle "Foreman" ;
      ex:employeeType "Worker" ;
      ex:birthYear "1966"^^xsd:integer ;
      ex:hasSupervisor ex:emp003 ;
      ex:hasWage "20.20"^^xsd:double .
    ex:emp003
      a ex:Employee ;
      ex:hasID "000-77-3232" ;
      ex:hasTitle "Production Manager" ;
      ex:employeeType "Manager" ;
      ex:birthYear "1968"^^xsd:integer ;
      ex:hasSupervisor ex:emp001 ;
      ex:hasSalary "4000"^^xsd:double .
    ex:emp004
      a ex:Employee ;
      ex:hasID "0" ;
      ex:hasTitle "Fitter" ;
      ex:employeeType "Worker" ;
      ex:birthYear "1979"^^xsd:integer ;
      ex:hasSupervisor ex:emp002 ;
      ex:hasWage "17.20"^^xsd:double .
    ex:emp005
      a ex:Employee ;
      ex:hasID "000-99-3492" ;
      ex:hasTitle "Fitter" ;
      ex:employeeType "Worker" ;
      ex:hasSupervisor ex:emp002 ;
      ex:birthYear "2000"^^xsd:integer ;
      ex:hasWage "17.60"^^xsd:double .
    ex:emp006
      a ex:Employee ;
      ex:hasID "000-78-5592" ;
      ex:hasTitle "Filer" ;
      ex:employeeType "Intern" ;
      ex:birthYear "2003"^^xsd:integer ;
      ex:hasSupervisor ex:emp002 ;
      ex:hasWage "14.20"^^xsd:double .
    ex:emp007
      a ex:Employee ;
      ex:hasID "000-77-3232" ;
      ex:hasTitle "Sales Manager" ;
      ex:hasTitle "Vice President" ;
      ex:employeeType "Manager" ;
      ex:birthYear "1962"^^xsd:integer ;
      ex:hasSupervisor ex:emp001 ;
      ex:hasSalary "80000"^^xsd:double .
    ex:emp008
      a ex:Employee ;
      ex:hasID "000-31-4868" ;
      ex:hasTitle "Fitter" ;
      ex:employeeType "Worker" ;
      ex:birthYear "2008"^^xsd:integer ;
      ex:hasSupervisor ex:emp002 ;
      ex:hasWage "15.00"^^xsd:double .
    ex:emp009
      a ex:Employee ;
      ex:hasID "000-56-3336" ;
      ex:hasTitle "Fitter" ;
      ex:employeeType "Contractor" ;
      ex:birthYear "2001"^^xsd:integer ;
      ex:hasSupervisor ex:emp002 ;
      ex:hasWage "15.00"^^xsd:double .
  }
}

Query Mode Example

The following example performs validation on the sample data graph above. The validation is done in query mode, where results are returned in tabular format:

PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX azg: <http://anzograph.com/>
USING azg:employeeShapes
VALIDATE azg:employeeData

The results show that there are 5 violations:

focusNode                 | resultPath                      | value  | constraint                                                 | violation                            | sourceShape    | message
--------------------------+---------------------------------+--------+------------------------------------------------------------+--------------------------------------+----------------+---------------------------------------------------------
http://example.org/emp003 | http://example.org/hasSalary    |   4000 | http://www.w3.org/ns/shacl#MinInclusiveConstraintComponent | http://www.w3.org/ns/shacl#Violation | _:b10737418398 | Salary must be 30,000 or higher
http://example.org/emp004 | http://example.org/hasID        |      0 | http://www.w3.org/ns/shacl#PatternConstraintComponent      | http://www.w3.org/ns/shacl#Violation | _:b15032385677 | Every employee must have an ID that matches the pattern
http://example.org/emp006 | http://example.org/employeeType | Intern | http://www.w3.org/ns/shacl#InConstraintComponent           | http://www.w3.org/ns/shacl#Violation | _:b6442451086  | Every employee is a manager, worker, or contractor
http://example.org/emp006 | http://example.org/hasWage      |   14.2 | http://www.w3.org/ns/shacl#MinInclusiveConstraintComponent | http://www.w3.org/ns/shacl#Violation | _:b10737418399 | Wage must be at least 15.00
http://example.org/emp008 | http://example.org/birthYear    |   2008 | http://www.w3.org/ns/shacl#MaxInclusiveConstraintComponent | http://www.w3.org/ns/shacl#Violation | _:b6442451087  | Birth year must be 2007 or earlier
5 rows

For each violation, the results show the focusNode (subject), resultPath (predicate), value, constraint, violation (severity), sourceShape, and message (if one exists for the shape). In the first row, employee 3 has a salary of $4,000, which violates the MinInclusiveConstraintComponent that requires salaries to be at least $30,000. In the second row, employee 4 has an ID value of 0, which violates the PatternConstraintComponent because it is too short to match the required pattern. Rows 3 and 4 show that employee 6 has an invalid employee type (Intern) and a wage (14.20) that is below the 15.00 minimum. Row 5 shows that employee 8 does not meet the age requirement: the birth year 2008 is later than the maximum of 2007.
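
To summarize a longer result set, the rows can be grouped by focus node. The following Python sketch parses a captured copy of the output above (reduced to three columns for brevity) and counts the violations per employee. It assumes the pipe-delimited text layout shown in this topic, and parse_result_table is a hypothetical helper, not a product API.

```python
from collections import defaultdict

# Three columns copied from the result table above. The layout (pipe-delimited
# cells, a dashed separator row, and a row-count footer) is an assumption about
# how the client prints results.
RESULT_TEXT = """\
focusNode                 | resultPath                      | message
--------------------------+---------------------------------+--------
http://example.org/emp003 | http://example.org/hasSalary    | Salary must be 30,000 or higher
http://example.org/emp004 | http://example.org/hasID        | Every employee must have an ID that matches the pattern
http://example.org/emp006 | http://example.org/employeeType | Every employee is a manager, worker, or contractor
http://example.org/emp006 | http://example.org/hasWage      | Wage must be at least 15.00
http://example.org/emp008 | http://example.org/birthYear    | Birth year must be 2007 or earlier
5 rows
"""


def parse_result_table(text):
    """Turn the printed table into a list of dicts keyed by column name."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = [cell.strip() for cell in lines[0].split("|")]
    rows = []
    for line in lines[1:]:
        cells = [cell.strip() for cell in line.split("|")]
        # The separator row and the "5 rows" footer do not split into one
        # cell per column, so they are skipped here.
        if len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


# Group the violated paths by the node they were reported on.
by_node = defaultdict(list)
for row in parse_result_table(RESULT_TEXT):
    by_node[row["focusNode"]].append(row["resultPath"])

for node in sorted(by_node):
    print(f"{node}: {len(by_node[node])} violation(s) on {', '.join(by_node[node])}")
```

An empty list from parse_result_table corresponds to the case where no rows are returned and the data graph conforms.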

Report Mode Example

The following example performs validation on the sample data graph above. The validation is done in report mode, where the results are saved to a graph rather than returned in tabular format:

PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX azg: <http://anzograph.com/>
USING azg:employeeShapes
VALIDATE azg:employeeData
CREATE REPORT GRAPH azg:employeeReport

When the query completes, query the new graph to view the results. First, run a simple ASK query to check whether the data graph conforms to the shapes graph. For example, the query below asks whether the value of <http://www.w3.org/ns/shacl#conforms> is t (true). Because the sample data contains violations, the value is f (false) and the ASK query returns false:

PREFIX sh: <http://www.w3.org/ns/shacl#>
ASK FROM <http://anzograph.com/employeeReport> { ?s sh:conforms "t" .}
false
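
The same check can be scripted. The Python sketch below prepares the ASK query as a SPARQL 1.1 Protocol POST request and reads the boolean from a SPARQL JSON result. The endpoint URL is a placeholder and the helper names are hypothetical. It is also an assumption that your deployment accepts protocol requests and returns application/sparql-results+json. The request is built but not sent, so the sketch runs offline.

```python
import json
import urllib.parse
import urllib.request

# Placeholder endpoint URL (assumption); substitute your deployment's SPARQL endpoint.
ENDPOINT = "https://graph.example.com/sparql"

ASK_CONFORMS = """\
PREFIX sh: <http://www.w3.org/ns/shacl#>
ASK FROM <http://anzograph.com/employeeReport> { ?s sh:conforms "t" . }
"""


def build_ask_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build, but do not send, a form-encoded SPARQL 1.1 Protocol POST request."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/sparql-results+json",
    }
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")


def parse_ask_response(payload: str) -> bool:
    """Read the boolean from a SPARQL JSON ASK result, e.g. {"head": {}, "boolean": false}."""
    return bool(json.loads(payload)["boolean"])


request = build_ask_request(ENDPOINT, ASK_CONFORMS)

# With a reachable endpoint, the request could be sent like this:
#   with urllib.request.urlopen(request) as response:
#       conforms = parse_ask_response(response.read().decode("utf-8"))
# Here a literal response stands in for the server's reply.
conforms = parse_ask_response('{"head": {}, "boolean": false}')
print("data graph conforms" if conforms else "violations found")
```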

If the data graph does not conform to the shapes graph, you can write additional queries to return information about the violations. For example:

PREFIX sh: <http://www.w3.org/ns/shacl#>
SELECT ?focusNode ?resultPath ?value ?constraint ?violation ?sourceShape ?message
FROM <http://anzograph.com/employeeReport>
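WHERE {
  ?s sh:focusNode ?focusNode ;
     sh:sourceConstraintComponent ?constraint ;
     sh:resultSeverity ?violation ;
     sh:sourceShape ?sourceShape .
  # In SHACL these properties are optional on a validation result, so match
  # them with OPTIONAL to keep results that lack a path, value, or message.
  OPTIONAL { ?s sh:resultPath ?resultPath }
  OPTIONAL { ?s sh:value ?value }
  OPTIONAL { ?s sh:resultMessage ?message }
}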
ORDER BY ?focusNode
LIMIT 100

The results show that there are 5 violations:

focusNode                 | resultPath                      | value  | constraint                                                 | violation                            | sourceShape    | message
--------------------------+---------------------------------+--------+------------------------------------------------------------+--------------------------------------+----------------+---------------------------------------------------------
http://example.org/emp003 | http://example.org/hasSalary    |   4000 | http://www.w3.org/ns/shacl#MinInclusiveConstraintComponent | http://www.w3.org/ns/shacl#Violation | _:b10737418398 | Salary must be 30,000 or higher
http://example.org/emp004 | http://example.org/hasID        |      0 | http://www.w3.org/ns/shacl#PatternConstraintComponent      | http://www.w3.org/ns/shacl#Violation | _:b15032385677 | Every employee must have an ID that matches the pattern
http://example.org/emp006 | http://example.org/employeeType | Intern | http://www.w3.org/ns/shacl#InConstraintComponent           | http://www.w3.org/ns/shacl#Violation | _:b6442451086  | Every employee is a manager, worker, or contractor
http://example.org/emp006 | http://example.org/hasWage      |   14.2 | http://www.w3.org/ns/shacl#MinInclusiveConstraintComponent | http://www.w3.org/ns/shacl#Violation | _:b10737418399 | Wage must be at least 15.00
http://example.org/emp008 | http://example.org/birthYear    |   2008 | http://www.w3.org/ns/shacl#MaxInclusiveConstraintComponent | http://www.w3.org/ns/shacl#Violation | _:b6442451087  | Birth year must be 2007 or earlier
5 rows

The report graph records the same 5 violations that the Query Mode Example returns. For each violation, the query selects the focusNode (subject), resultPath (predicate), value, constraint, violation (severity), sourceShape, and message. In the first row, employee 3 has a salary of $4,000, which violates the MinInclusiveConstraintComponent that requires salaries to be at least $30,000. In the second row, employee 4 has an ID value that violates the PatternConstraintComponent because it is too short. Rows 3 and 4 show that employee 6 has an invalid employee type and a wage that is too low. Row 5 shows that employee 8 does not meet the age requirement.