Model Requirements and Recommendations
Anzo uses models to describe and manage RDF data sets. To ensure that data structures are properly defined, Anzo requires that data models include certain information and avoid unsupported information. This topic provides details about the requirements and guidelines to follow when uploading or creating models.
Requirements
This section lists the requirements or rules to follow when uploading or creating a data model. Models that are generated by Anzo during the auto-ingest process conform to these rules.
- Define each model as an owl:Ontology
- Define the model name with rdfs:label
- The named graph URI must match the ontology URI
- Define classes and concepts with owl:Class
- Define taxonomy with rdfs:subClassOf
- Define properties as owl:DatatypeProperty or owl:ObjectProperty
- Include rdfs:domain and rdfs:range for all properties
- Reference only Anzo-stored models
Define each model as an owl:Ontology
Define each data model as an owl:Ontology. To do so, include the following triple in the model:
<myOntology> a owl:Ontology .
Where myOntology is the URI that names the model. The URI must be unique. To avoid unexpected results when saving a model, do not include a hash (#) character at the end of the model URI.
Define the model name with rdfs:label
Use an rdfs:label property to define the name of the model as a string. Include the following triple:
<myOntology> rdfs:label "My Ontology"^^xsd:string .
For example, you can use the following statement as a template for inserting owl:Ontology and rdfs:label into the model:
<myOntology> a owl:Ontology ; rdfs:label "My Ontology"^^xsd:string .
The named graph URI must match the ontology URI
Make sure that the named graph URI for the model matches the ontology URI. For example:
<myOntology> { <myOntology> a owl:Ontology . }
Like a linked data set, an ontology is a core component that is used throughout the system. The registries that store and track the graphs for core components, such as the ontology registry, expect that each graph contains a resource that matches the graph URI and specifies the type of graph. Having a mismatched graph and ontology URI can break core Anzo functionality.
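The first three requirements can be checked mechanically before a model is uploaded. The following Python code is a minimal sketch and is not part of Anzo. It assumes that the model has already been parsed into (graph, subject, predicate, object) tuples of plain strings. The function name check_ontology_declaration and the example.com URIs are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_ONTOLOGY = "http://www.w3.org/2002/07/owl#Ontology"


def check_ontology_declaration(quads):
    """Return a list of problems with the owl:Ontology declaration.

    quads is an iterable of (graph, subject, predicate, object) strings.
    This tuple layout is an assumption made for the sketch.
    """
    quads = list(quads)
    problems = []
    # Find every resource typed as owl:Ontology, with the graph it sits in.
    ontologies = [(g, s) for g, s, p, o in quads
                  if p == RDF_TYPE and o == OWL_ONTOLOGY]
    if not ontologies:
        problems.append("no owl:Ontology declaration found")
    for graph, ontology in ontologies:
        # The model URI should not end with a hash character.
        if ontology.endswith("#"):
            problems.append(f"{ontology}: URI ends with a hash (#)")
        # The named graph URI must match the ontology URI.
        if graph != ontology:
            problems.append(f"{ontology}: named graph {graph} does not match")
        # The model name must be defined with rdfs:label.
        has_label = any(s == ontology and p == RDFS_LABEL
                        for _, s, p, _ in quads)
        if not has_label:
            problems.append(f"{ontology}: missing rdfs:label")
    return problems


MODEL = "http://example.com/ontologies/myOntology"
good = [
    (MODEL, MODEL, RDF_TYPE, OWL_ONTOLOGY),
    (MODEL, MODEL, RDFS_LABEL, "My Ontology"),
]
bad = [("http://example.com/graphs/other", MODEL, RDF_TYPE, OWL_ONTOLOGY)]

print(check_ontology_declaration(good))  # no problems reported
print(check_ontology_declaration(bad))   # graph mismatch and missing label
```

The check only reports problems. It does not rewrite the model, so a mismatched graph URI still needs to be corrected in the source file.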
Define classes and concepts with owl:Class
Use owl:Class for class or concept definitions. Do NOT include skos:Concept or rdfs:Class. For example, the following statement requires modification to make it valid in an Anzo model:
<myConcept> a skos:Concept .
Changing the statement as follows correctly uses owl:Class instead of skos:Concept:
<myConcept> a owl:Class ; rdfs:label <businessFacingClassLabel> .
Define taxonomy with rdfs:subClassOf
Use rdfs:subClassOf for taxonomy. Do NOT use skos:broader. For example, the following statement requires modification to make it valid in an Anzo model:
<childSkosConcept> skos:broader <parentSkosConcept> .
Changing the statement as follows correctly uses rdfs:subClassOf instead of skos:broader:
<childOwlClass> rdfs:subClassOf <parentOwlClass> .
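When a SKOS-based vocabulary contains many statements, the two substitutions described above can be applied in bulk. The following Python code is a minimal sketch under stated assumptions: the triples are held as (subject, predicate, object) tuples of full URI strings, and the function name skos_to_owl and the example.com URIs are hypothetical. Because skos:broader is a looser relationship than rdfs:subClassOf, review the rewritten hierarchy before uploading it.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"
RDFS_SUBCLASSOF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"
SKOS_CONCEPT = "http://www.w3.org/2004/02/skos/core#Concept"
SKOS_BROADER = "http://www.w3.org/2004/02/skos/core#broader"


def skos_to_owl(triples):
    """Return a new list of triples that uses owl:Class and rdfs:subClassOf.

    triples is an iterable of (subject, predicate, object) strings.
    All other triples are passed through unchanged.
    """
    rewritten = []
    for s, p, o in triples:
        if p == RDF_TYPE and o in (SKOS_CONCEPT, RDFS_CLASS):
            # Class and concept definitions become owl:Class.
            o = OWL_CLASS
        elif p == SKOS_BROADER:
            # child skos:broader parent becomes child rdfs:subClassOf parent.
            p = RDFS_SUBCLASSOF
        rewritten.append((s, p, o))
    return rewritten


EX = "http://example.com/ontologies/myOntology/"
source = [
    (EX + "Child", RDF_TYPE, SKOS_CONCEPT),
    (EX + "Parent", RDF_TYPE, SKOS_CONCEPT),
    (EX + "Child", SKOS_BROADER, EX + "Parent"),
]
for triple in skos_to_owl(source):
    print(triple)
```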
Define properties as owl:DatatypeProperty or owl:ObjectProperty
Define properties using owl:DatatypeProperty or owl:ObjectProperty. For example:
<myObjectProperty> a owl:ObjectProperty .
Or
<myDataTypeProperty> a owl:DatatypeProperty .
Include rdfs:domain and rdfs:range for all properties
Define rdfs:domain and rdfs:range for all properties. For example, the following property definition is incomplete:
<myObjectProperty> a owl:ObjectProperty .
The statement below completes the definition by adding rdfs:label, rdfs:domain, and rdfs:range:
<myObjectProperty> a owl:ObjectProperty ; rdfs:label <businessFacingPropertyLabel> ; rdfs:domain <myClass> ; rdfs:range <myOtherClass> .
The example below shows a valid datatype property definition:
<myDataTypeProperty> a owl:DatatypeProperty ; rdfs:label <businessFacingPropertyLabel> ; rdfs:domain <myClass> ; rdfs:range <literal> .
Important: When defining the property range for integer values, use xsd:int instead of xsd:integer.
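The property rules in this section can also be checked before upload. The following Python code is a minimal sketch and is not an Anzo feature. It assumes that the model is available as (subject, predicate, object) tuples of full URI strings. The function name check_properties and the example.com URIs are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_DOMAIN = "http://www.w3.org/2000/01/rdf-schema#domain"
RDFS_RANGE = "http://www.w3.org/2000/01/rdf-schema#range"
OWL_OBJECT_PROPERTY = "http://www.w3.org/2002/07/owl#ObjectProperty"
OWL_DATATYPE_PROPERTY = "http://www.w3.org/2002/07/owl#DatatypeProperty"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def check_properties(triples):
    """Return a list of problems found in the property definitions."""
    triples = list(triples)
    property_types = {OWL_OBJECT_PROPERTY, OWL_DATATYPE_PROPERTY}
    # Collect every resource declared as an object or datatype property.
    properties = sorted({s for s, p, o in triples
                         if p == RDF_TYPE and o in property_types})
    problems = []
    for prop in properties:
        predicates = {p for s, p, o in triples if s == prop}
        if RDFS_DOMAIN not in predicates:
            problems.append(f"{prop}: missing rdfs:domain")
        if RDFS_RANGE not in predicates:
            problems.append(f"{prop}: missing rdfs:range")
    # Flag integer ranges that use xsd:integer instead of xsd:int.
    for s, p, o in triples:
        if p == RDFS_RANGE and o == XSD_INTEGER:
            problems.append(f"{s}: range is xsd:integer; use xsd:int")
    return problems


EX = "http://example.com/ontologies/myOntology/"
model = [
    (EX + "worksFor", RDF_TYPE, OWL_OBJECT_PROPERTY),
    (EX + "worksFor", RDFS_DOMAIN, EX + "Person"),
    (EX + "age", RDF_TYPE, OWL_DATATYPE_PROPERTY),
    (EX + "age", RDFS_DOMAIN, EX + "Person"),
    (EX + "age", RDFS_RANGE, XSD_INTEGER),
]
for problem in check_properties(model):
    print(problem)
```

In this hypothetical model, worksFor is reported because it has no rdfs:range, and age is reported because its range is xsd:integer.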
Reference only Anzo-stored models
Models must be self-contained or include references only to models that are stored in Anzo.
Guidelines
This section lists additional guidelines and important information to know when working with data models in Anzo.
- Property Range Guidelines
- TriG is the preferred format for models to upload
- Load RDFS and OWL vocabularies as graphs
- Axiomatically defined classes and property hierarchies not processed
Property Range Guidelines
When creating or editing properties in the model editor, Anzo offers several RDF property ranges or data types to choose from. Certain types are preferred over others, however, because they are treated consistently and predictably across systems. Cambridge Semantics recommends that you specify one of the following preferred property range values:
- Boolean: For true or false values.
- Byte: For 1-byte integers from -128 to 127.
- Date: For date values that follow a format such as YYYY-MM-DD.
- Date time: For 8-byte date and time values that follow a format such as YYYY-MM-DDThh:mm:ss.
- Double: For up to 8-byte double floating point values.
- Duration: For a duration of time expressed as a number of years, months, days, hours, minutes, and seconds in a format such as PnYnMnDTnHnMnS.
- Float: For up to 4-byte floating point values with potential decimal places.
- Int: For up to 4-byte integers from -2,147,483,648 to 2,147,483,647.
- Long: For up to 8-byte integers from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
- Short: For up to 2-byte integers from -32,768 to 32,767.
- String: For character values of varying length.
- Time: For time values that follow a format such as hh:mm:ss.
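The integer bounds in the list above can be used to see which of the preferred integer types are able to hold a given set of values. The following Python code is a minimal sketch. The bounds are copied from the list, and the function name integer_ranges_that_fit is hypothetical. The sketch only reports which types fit and does not recommend one over another.

```python
# (name, minimum, maximum) for the integer types in the preferred list.
INTEGER_RANGES = [
    ("Byte", -128, 127),
    ("Short", -32_768, 32_767),
    ("Int", -2_147_483_648, 2_147_483_647),
    ("Long", -9_223_372_036_854_775_808, 9_223_372_036_854_775_807),
]


def integer_ranges_that_fit(values):
    """Return the names of the integer types that can hold every value."""
    values = list(values)
    if not values:
        raise ValueError("at least one value is required")
    low, high = min(values), max(values)
    return [name for name, minimum, maximum in INTEGER_RANGES
            if minimum <= low and high <= maximum]


print(integer_ranges_that_fit([0, 127]))      # every listed type fits
print(integer_ranges_that_fit([-129, 5]))     # Byte is too small
print(integer_ranges_that_fit([2 ** 31]))     # only Long fits
print(integer_ranges_that_fit([2 ** 63]))     # none of the listed types fit
```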
TriG is the preferred format for models to upload
The preferred format for models that will be uploaded to Anzo is TriG (.trig) format.
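To illustrate what a minimal TriG model file looks like, the following Python code builds one as a string. It is a sketch under stated assumptions: the function name minimal_model_trig and the example.com URI are hypothetical, the label is a single line of text, and the output contains only the owl:Ontology and rdfs:label statements described in the Requirements section.

```python
def minimal_model_trig(model_uri, label):
    """Return TriG text for a model that declares only its ontology and name.

    The same URI is used for the named graph and for the ontology, and a
    URI that ends with a hash (#) is rejected.
    """
    if model_uri.endswith("#"):
        raise ValueError("model URI must not end with a hash (#)")
    # Escape backslashes and double quotes so the label is a valid literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
        "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
        "\n"
        f"<{model_uri}> {{\n"
        f"    <{model_uri}> a owl:Ontology ;\n"
        f'        rdfs:label "{escaped}"^^xsd:string .\n'
        "}\n"
    )


print(minimal_model_trig("http://example.com/ontologies/myOntology",
                         "My Ontology"))
```

Class and property definitions would be added inside the same graph block, following the rules in the Requirements section.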
Load RDFS and OWL vocabularies as graphs
Anzo loads but does not process additional vocabulary data (such as rdfs:subPropertyOf, owl:sameAs, and owl:intersectionOf) if it is encoded in models. Models that contain vocabularies rather than structural information should be loaded as RDF graphs instead. Anzo can load any valid RDF data. Because RDFS, SKOS, and OWL vocabularies are expressed as valid RDF, the vocabulary information can be loaded as a graph, and the data can be interpreted with SPARQL in data layers and Hi-Res Analytics.
Axiomatically defined classes and property hierarchies not processed
When models include axiomatically defined classes or property hierarchies, Anzo loads the information but does not process the data. For example, Anzo does not infer information from axiomatically defined classes.