Introduction to SHACL

SHACL (Shapes Constraint Language) is a W3C language for describing the conditions and constraints that data in knowledge graphs must satisfy. The conditions are defined in structures called SHACL shapes, which are expressed in RDF and stored in graphs called shapes graphs. The graphs that are validated against shapes graphs are called data graphs.

Targets in a shapes graph select the nodes in the data graph that must conform to a shape. A target can name specific nodes, the instances of a class, or the subjects or objects of a property. The selected nodes are called focus nodes, and constraints define how the targeted data is validated. There are two types of shapes: node shapes and property shapes. Node shapes define constraints on the focus nodes themselves, and property shapes define constraints on the values of properties that are connected to the focus nodes.
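The relationship between targets, focus nodes, and property constraints can be illustrated with a minimal sketch in plain Python. It is not how a SHACL processor is implemented. It assumes a data graph held as a set of triples, a shape written as a dictionary instead of RDF, and hypothetical names in an ex: namespace (ex:Person, ex:name, ex:alice, ex:bob). It also ignores subclass relationships, which a real sh:targetClass target takes into account.

```python
# Illustrative only: every ex: name below is hypothetical.
RDF_TYPE = "rdf:type"

# Data graph: a set of (subject, predicate, object) triples.
data_graph = {
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", RDF_TYPE, "ex:Person"),    # ex:bob has no ex:name value
    ("ex:acme", RDF_TYPE, "ex:Company"),  # not targeted by the shape
}

# A node shape with one property shape. It mirrors this Turtle:
#   ex:PersonShape a sh:NodeShape ;
#       sh:targetClass ex:Person ;
#       sh:property [ sh:path ex:name ; sh:minCount 1 ] .
person_shape = {
    "targetClass": "ex:Person",
    "property": [{"path": "ex:name", "minCount": 1}],
}


def focus_nodes(graph, shape):
    """Target step: select the nodes that the shape applies to."""
    return sorted(
        s for (s, p, o) in graph
        if p == RDF_TYPE and o == shape["targetClass"]
    )


def value_nodes(graph, focus, path):
    """Collect the values of one property for one focus node."""
    return [o for (s, p, o) in graph if s == focus and p == path]


def violations(graph, shape):
    """Constraint step: check each property shape against each focus node."""
    results = []
    for focus in focus_nodes(graph, shape):
        for prop in shape["property"]:
            values = value_nodes(graph, focus, prop["path"])
            if len(values) < prop["minCount"]:
                results.append((focus, prop["path"]))
    return results


print(focus_nodes(data_graph, person_shape))  # ['ex:alice', 'ex:bob']
print(violations(data_graph, person_shape))   # [('ex:bob', 'ex:name')]
```

In this sketch ex:acme is never checked because the target does not select it, and ex:bob fails because the property shape requires at least one ex:name value.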

Shape Requirements

A shape is an IRI or blank node that meets at least one of the following conditions in the shapes graph:

  • The shape is an instance of sh:NodeShape or sh:PropertyShape.
  • The shape has at least one of the following predicates: sh:targetClass, sh:targetNode, sh:targetObjectsOf or sh:targetSubjectsOf.
  • The shape has a sh:property [ parameter_list ] predicate.
  • The shape is the value of a shape-expecting parameter, such as sh:node or sh:property, of one of the constraint components described in Constraint Component Reference.
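As a rough illustration of the conditions listed above, the following Python sketch checks whether a node in a small shapes graph counts as a shape. It assumes the shapes graph is held as a set of triples with prefixed names as plain strings. The ex: names and the blank node label _:b1 are hypothetical, and SHAPE_VALUED is only a small illustrative subset of the parameters whose values are shapes.

```python
RDF_TYPE = "rdf:type"
SHAPE_CLASSES = {"sh:NodeShape", "sh:PropertyShape"}
TARGET_PREDICATES = {
    "sh:targetClass", "sh:targetNode",
    "sh:targetObjectsOf", "sh:targetSubjectsOf",
}
# Assumption: a small subset of the parameters whose values are shapes.
SHAPE_VALUED = {"sh:node", "sh:property", "sh:not"}

# Shapes graph as (subject, predicate, object) triples; ex: names are hypothetical.
shapes_graph = {
    ("ex:PersonShape", RDF_TYPE, "sh:NodeShape"),
    ("ex:PersonShape", "sh:targetClass", "ex:Person"),
    ("ex:PersonShape", "sh:property", "_:b1"),
    ("_:b1", "sh:path", "ex:name"),
    ("_:b1", "sh:minCount", 1),
    ("ex:Person", RDF_TYPE, "rdfs:Class"),
}


def is_shape(graph, node):
    """Return True if the node meets at least one of the listed conditions."""
    for s, p, o in graph:
        # Condition 1: instance of sh:NodeShape or sh:PropertyShape.
        if s == node and p == RDF_TYPE and o in SHAPE_CLASSES:
            return True
        # Condition 2: subject of a target predicate.
        if s == node and p in TARGET_PREDICATES:
            return True
        # Condition 3: subject of an sh:property triple.
        if s == node and p == "sh:property":
            return True
        # Condition 4: value of a shape-expecting parameter.
        if o == node and p in SHAPE_VALUED:
            return True
    return False


for candidate in ("ex:PersonShape", "_:b1", "ex:Person"):
    print(candidate, is_shape(shapes_graph, candidate))
```

Here ex:PersonShape qualifies on several conditions at once, the blank node _:b1 qualifies only because it is the value of sh:property, and ex:Person does not qualify because being named as a target class does not make a node a shape.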

Data Validation

The validation processor in Graph Lakehouse is invoked by running a SPARQL query. The processor validates one or more data graphs against the constraints defined in one or more shapes graphs and produces a report in the form of a validation graph. For more information, see Validate a Data Graph.