Data Linking Options
When a data source does not define keys (such as a CSV or JSON source), the GDI provides properties for linking data loaded from multiple sources into a connected knowledge graph. These properties let you define relationships between sources by specifying resource templates (primary keys) and object properties (foreign keys). The available properties are described below.
Data Linking Syntax
s:key ("column_name") ;
s:reference [
    s:model "table_to_reference" ;
    s:using ("foreign_key_column")
]
Data Linking Examples
For example, the query snippet below defines two data sources. The s:model property defines the table/class for each source, and the s:key property defines the primary key for each table/class. The s:reference property for the "venue" table defines a foreign key relationship from venue.EVENT_ID to event.EVENT_ID.
?event a s:FileSource ;
    s:model "event" ;
    s:url "/opt/shared-files/csv/events.csv" ;
    s:key ("EVENT_ID") .

?venue a s:FileSource ;
    s:model "venue" ;
    s:url "/opt/shared-files/csv/venues.csv" ;
    s:key ("VENUE_ID") ;
    s:reference [
        s:model "event" ;
        s:using ("EVENT_ID")
    ] .
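To make the linking idea concrete, the following Python sketch mimics what a key and a reference accomplish conceptually: a primary key value is turned into a resource identifier, and a foreign key value is turned into a link to the referenced resource. This is an illustration only, not the GDI's implementation. The base URI, the URI layout, the predicate names, the helper functions, and the sample rows are all hypothetical.

```python
# Conceptual sketch only. It imitates the idea of key-based linking and does
# NOT reproduce the GDI's actual URI scheme, predicates, or implementation.
BASE = "http://example.org/data"  # hypothetical base URI


def resource_uri(model, key_columns, row):
    """Build a resource identifier from a model name and the row's key values."""
    key_part = "/".join(str(row[col]) for col in key_columns)
    return f"{BASE}/{model}/{key_part}"


def link_rows(model, key_columns, rows, references):
    """Return (subject, predicate, object) triples for the given rows.

    references is a list of (target_model, using_columns) pairs, loosely
    mirroring: s:reference [ s:model "..." ; s:using ("...") ]
    """
    triples = []
    for row in rows:
        subject = resource_uri(model, key_columns, row)
        # One literal-valued triple per column.
        for col, value in row.items():
            triples.append((subject, f"{model}.{col}", value))
        # One link per reference: the foreign key value resolves to the
        # identifier that the target model builds from its own key.
        for target_model, using_columns in references:
            target = resource_uri(target_model, using_columns, row)
            triples.append((subject, f"{model}.{target_model}", target))
    return triples


# Made-up sample rows standing in for events.csv and venues.csv.
events = [{"EVENT_ID": "100", "EVENT_NAME": "Jazz Night"}]
venues = [{"VENUE_ID": "7", "EVENT_ID": "100", "VENUE_NAME": "Town Hall"}]

event_triples = link_rows("event", ["EVENT_ID"], events, [])
venue_triples = link_rows("venue", ["VENUE_ID"], venues, [("event", ["EVENT_ID"])])

for triple in venue_triples:
    print(triple)
```

In this sketch, the venue row's link points at the same identifier that the event row receives as its subject, because both are built from the EVENT_ID value. That shared identifier is what connects the two sources.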
The following query loads multiple file sources and generates RDF and an ontology with resource templates and object properties. The query also includes global normalization rules that apply across all sources (see Normalization Options for information about normalization).
PREFIX s: <http://cambridgesemantics.com/ontologies/DataToolkit#>

INSERT {
    GRAPH ${targetGraph} {
        ?s ?p ?o .
    }
}
WHERE {
    SERVICE <http://cambridgesemantics.com/services/DataToolkit> {
        ?event a s:FileSource ;
            s:model "event" ;
            s:url "/opt/shared-files/csv/events.csv" ;
            s:key ("EVENT_ID") .

        ?listing a s:FileSource ;
            s:model "listing" ;
            s:url "/opt/shared-files/csv/listings.csv" ;
            s:key ("LIST_ID") ;
            s:reference [ s:model "event" ; s:using ("EVENT_ID") ; s:key ("EVENT_ID") ] .

        ?date a s:FileSource ;
            s:model "date" ;
            s:url "/opt/shared-files/csv/event_dates.csv" ;
            s:key ("DATE_ID") ;
            s:reference [ s:model "event" ; s:using ("EVENT_ID") ; s:key ("EVENT_ID") ] .

        ?venue a s:FileSource ;
            s:model "venue" ;
            s:url "/opt/shared-files/csv/venues.csv" ;
            s:key ("VENUE_ID") ;
            s:reference [ s:model "event" ; s:using ("EVENT_ID") ; s:key ("EVENT_ID") ] .

        ?sale a s:FileSource ;
            s:model "sale" ;
            s:url "/opt/shared-files/csv/sales.csv" ;
            s:key ("SALE_ID") ;
            s:reference [ s:model "event" ; s:using ("EVENT_ID") ; s:key ("EVENT_ID") ] ;
            s:reference [ s:model "listing" ; s:using ("LIST_ID") ; s:key ("LIST_ID") ] .

        ?rdf a s:RdfGenerator, s:OntologyGenerator ;
            s:as (?s ?p ?o) ;
            s:ontology <http://cambridgesemantics.com/tickets> ;
            s:base <http://cambridgesemantics.com/data> ;
            s:normalize [
                s:all [
                    s:casing s:UPPER ;
                    s:localNameSeparator "_" ;
                ] ;
            ] .
    }
}