Creating Labeled Property Graphs (RDF-star)
AnzoGraph DB supports the Labeled Property Graph (LPG) model for adding metadata about the relationships in your graphs. Properties that express values such as start and end dates, data provenance tracking, or the weight, score, or veracity of the data can be added to a graph to further define any of the relationships in the data.
AnzoGraph DB's LPG implementation follows the proposed RDF-star and SPARQL-star extensions to the W3C RDF data model and SPARQL query language specifications. The proposal is a work in progress: the syntax described in this document may not be included in the final specification, and AnzoGraph DB does not currently support all of the examples included in the proposal.
This topic provides information about loading and inserting properties and querying property graphs.
- Defining Properties in Turtle Load Files
- Defining Properties in INSERT Queries
- Querying Property Graphs
Defining Properties in Turtle Load Files
This section provides information about how to create a property graph by defining relationship properties in a Turtle load file. For instructions on creating properties in INSERT queries, see Defining Properties in INSERT Queries below.
There is a limit of 255 total property values per edge. AnzoGraph DB returns an "Element larger than allowed - too many properties" error if you attempt to load or insert more than 255 property values for the same relationship.
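A pre-load check along the following lines can flag edges that would exceed the limit before the data reaches AnzoGraph DB. This is a minimal Python sketch, not part of the product: the function name, the tuple layout, and the assumption that each (edge, property, value) entry counts as one property value are illustrative only.

```python
from collections import Counter

# Limit stated in this topic: 255 total property values per edge.
MAX_PROPERTY_VALUES_PER_EDGE = 255


def edges_over_limit(annotations, limit=MAX_PROPERTY_VALUES_PER_EDGE):
    """Return {edge: count} for edges that carry more than `limit` property values.

    `annotations` is an iterable of (subject, predicate, object, property, value)
    tuples. This layout is a hypothetical in-memory form of the data to load.
    """
    counts = Counter((s, p, o) for s, p, o, _prop, _value in annotations)
    return {edge: n for edge, n in counts.items() if n > limit}
```

An empty result means no edge in the batch exceeds the limit under the counting assumption above.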
To define a relationship property in a Turtle file, wrap the triple in double angle brackets (<< >>), and then specify the property URI and value after the closing brackets:
<< <subject> <predicate> <object> >> <property_URI> <property_value> .
For example, the TTL file contents below include properties that further define the like, dislike, and friend relationships in the triples. The file adds a weight property to define how much person3 likes or dislikes certain types of events, and the file adds startDate and endDate properties to friend predicates to define the start and end dates of friendships.
By default, the sample Tickit data set already includes startDate and endDate properties for the friend predicates. The example below defines start and end date properties only for illustrative purposes.
# rdf: and xsd: are declared explicitly so that the file is valid standalone Turtle.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix tickit: <http://anzograph.com/tickit/> .

# Compact notation: plain triples for person3 (no properties).
tickit:person3 rdf:type tickit:person ;
    tickit:card "4984932249480735"^^xsd:long ;
    tickit:birthday "1963-07-02"^^xsd:date ;
    tickit:ssn 503703220 ;
    tickit:firstname "Lars" ;
    tickit:lastname "Ratliff" ;
    tickit:city "High Point" ;
    tickit:state "NY" ;
    tickit:email "amet.faucibus.ut@condimentumegetvolutpat.ca" ;
    tickit:phone "(624) 767-2465" .

# Long notation: one property per complete triple.
<< tickit:person3 tickit:like "sports" >> tickit:weight 8 .
<< tickit:person3 tickit:like "rock" >> tickit:weight 9 .
<< tickit:person3 tickit:like "musicals" >> tickit:weight 4 .
<< tickit:person3 tickit:dislike "theatre" >> tickit:weight 5 .
<< tickit:person3 tickit:dislike "jazz" >> tickit:weight 9 .
<< tickit:person3 tickit:dislike "opera" >> tickit:weight 10 .
<< tickit:person3 tickit:friend tickit:person8563 >> tickit:startDate "1990-01-04"^^xsd:date .
<< tickit:person3 tickit:friend tickit:person38436 >> tickit:startDate "2000-04-27"^^xsd:date .
<< tickit:person3 tickit:friend tickit:person11979 >> tickit:startDate "2004-11-09"^^xsd:date .
<< tickit:person3 tickit:friend tickit:person11979 >> tickit:endDate "2012-07-17"^^xsd:date .

# Compact notation: friend triples without properties.
tickit:person3 tickit:friend tickit:person8639,tickit:person18536,tickit:person42975,tickit:person47376,
    tickit:person1692,tickit:person2556,tickit:person11979,tickit:person20860,tickit:person21259,tickit:person26586,
    tickit:person27529,tickit:person31735,tickit:person36264,tickit:person38436,tickit:person42306,tickit:person42975 .
The example above contains both compact and long Turtle notation. When defining properties in files, statements that contain properties must include the complete reference triple (subject, predicate, and object); properties cannot be added to triples specified in compact notation. In addition, specify one property per statement. To define multiple properties for the same triple, repeat the triple once for each property. For example, the following lines from the example above define two properties (startDate and endDate) for the person3 friend person11979 triple:
<< tickit:person3 tickit:friend tickit:person11979 >> tickit:startDate "2004-11-09"^^xsd:date .
<< tickit:person3 tickit:friend tickit:person11979 >> tickit:endDate "2012-07-17"^^xsd:date .
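When load files are generated by a script, the one-property-per-statement rule can be applied mechanically. The following Python sketch is one way to do that; the `annotate` helper is hypothetical, and it assumes that every argument is already a valid, serialized Turtle term (it does no escaping or validation).

```python
def annotate(subject, predicate, obj, properties):
    """Return one RDF-star Turtle statement per (property, value) pair.

    The full triple is repeated for every property, because each statement
    can carry only one property.
    """
    triple = f"<< {subject} {predicate} {obj} >>"
    return [f"{triple} {prop} {value} ." for prop, value in properties]


lines = annotate(
    "tickit:person3",
    "tickit:friend",
    "tickit:person11979",
    [
        ("tickit:startDate", '"2004-11-09"^^xsd:date'),
        ("tickit:endDate", '"2012-07-17"^^xsd:date'),
    ],
)
print("\n".join(lines))
```

The two generated statements match the startDate and endDate lines shown above.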
The IO Load service does not support loading files that contain RDF-star syntax. To load files that include RDF-star, use SPARQL LOAD as described in Loading Local RDF Files with SPARQL LOAD.
Defining Properties in INSERT Queries
You can create property graphs by using INSERT and INSERT DATA statements to insert triples with properties or to add properties to existing triples. To define properties in INSERT statements, use the same syntax as in Turtle files: wrap the triple in double angle brackets (<< >>), and then specify the property URI and value after the closing brackets:
<< <subject> <predicate> <object> >> <property_URI> <property_value> .
There is a limit of 255 total property values per edge. AnzoGraph DB returns an "Element larger than allowed - too many properties" error if you attempt to load or insert more than 255 property values for the same relationship.
For example, the INSERT DATA statement below adds weight properties to the like and dislike predicates for person3. This example specifies literal values for the weight property.
PREFIX tickit: <http://anzograph.com/tickit/>
INSERT DATA {
  GRAPH <http://anzograph.com/tickit> {
    << tickit:person3 tickit:dislike "jazz" >> tickit:weight 9 .
    << tickit:person3 tickit:dislike "theatre" >> tickit:weight 5 .
    << tickit:person3 tickit:dislike "opera" >> tickit:weight 10 .
    << tickit:person3 tickit:like "sports" >> tickit:weight 8 .
    << tickit:person3 tickit:like "rock" >> tickit:weight 9 .
    << tickit:person3 tickit:like "musicals" >> tickit:weight 4 .
  }
}
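If the property values come from application data, a statement of this shape can be assembled programmatically. The Python sketch below only builds the query text; `build_insert_data` is a hypothetical helper, it assumes that all terms are already serialized SPARQL terms, and how the text is submitted to AnzoGraph DB is left out.

```python
def build_insert_data(graph_iri, annotations, prefixes=None):
    """Build an INSERT DATA statement with one RDF-star line per annotation.

    `annotations` is an iterable of (subject, predicate, object, property, value)
    tuples of already-serialized terms; `prefixes` maps prefix names to IRIs.
    No escaping or validation is performed.
    """
    prefix_lines = [f"PREFIX {name}: <{iri}>" for name, iri in (prefixes or {}).items()]
    body = [
        f"    << {s} {p} {o} >> {prop} {value} ."
        for s, p, o, prop, value in annotations
    ]
    return "\n".join(
        prefix_lines
        + ["INSERT DATA {", f"  GRAPH <{graph_iri}> {{", *body, "  }", "}"]
    )


query = build_insert_data(
    "http://anzograph.com/tickit",
    [
        ("tickit:person3", "tickit:like", '"rock"', "tickit:weight", "9"),
        ("tickit:person3", "tickit:dislike", '"jazz"', "tickit:weight", "9"),
    ],
    prefixes={"tickit": "http://anzograph.com/tickit/"},
)
print(query)
```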
The following example INSERT statement queries the Tickit graph to find the sellers whose total sales amount is greater than or equal to $20,000. For each seller who meets the requirement, the INSERT clause inserts an earned predicate with a property named score and a score value of 10:
PREFIX tickit: <http://anzograph.com/tickit/>
INSERT {
  GRAPH <http://anzograph.com/tickit> {
    << ?person tickit:earned ?earned >> tickit:score 10
  }
}
WHERE {
  GRAPH <http://anzograph.com/tickit> {
    {
      SELECT ?person (SUM(?dollars) AS ?earned)
      WHERE {
        ?person tickit:firstname ?first .
        ?person tickit:lastname ?last .
        ?sale tickit:sellerid ?person .
        ?sale tickit:pricepaid ?dollars .
      }
      GROUP BY ?person
    }
    FILTER(?earned >= 20000)
  }
}
Selecting the newly created triples shows that 52 people met the requirement and were assigned a score property with a value of 10:
PREFIX tickit: <http://anzograph.com/tickit/>
SELECT ?person ?earned ?score
FROM <http://anzograph.com/tickit>
WHERE {
  << ?person tickit:earned ?earned >> tickit:score ?score .
}
ORDER BY ?person

person                                  | earned | score
----------------------------------------+--------+-------
http://anzograph.com/tickit/person11168 | 21036  | 10
http://anzograph.com/tickit/person1140  | 32399  | 10
http://anzograph.com/tickit/person12263 | 20320  | 10
http://anzograph.com/tickit/person12646 | 22194  | 10
http://anzograph.com/tickit/person13385 | 28495  | 10
http://anzograph.com/tickit/person15976 | 20929  | 10
http://anzograph.com/tickit/person16008 | 20515  | 10
http://anzograph.com/tickit/person16335 | 20160  | 10
http://anzograph.com/tickit/person18005 | 20918  | 10
http://anzograph.com/tickit/person19231 | 22636  | 10
http://anzograph.com/tickit/person19814 | 20465  | 10
http://anzograph.com/tickit/person20029 | 20103  | 10
http://anzograph.com/tickit/person23635 | 20265  | 10
http://anzograph.com/tickit/person2372  | 27159  | 10
http://anzograph.com/tickit/person24980 | 24857  | 10
http://anzograph.com/tickit/person25433 | 27653  | 10
http://anzograph.com/tickit/person26198 | 21243  | 10
...
52 rows
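The logic of the INSERT query above (sum the sales per seller, keep sellers at or above the threshold, and attach a score property to the earned triple) can be mirrored in a few lines of Python for illustration. This sketch works on an in-memory list of (seller, price) pairs, which is an assumed stand-in for the Tickit sales data. It also skips the firstname and lastname patterns that the query requires, and `score_sellers` is a hypothetical name.

```python
from collections import defaultdict


def score_sellers(sales, threshold=20000, score=10):
    """Return RDF-star statements for sellers whose total sales meet the threshold.

    `sales` is an iterable of (seller, price_paid) pairs. Totals are summed per
    seller (GROUP BY), filtered (FILTER >=), and emitted in seller order.
    """
    totals = defaultdict(int)
    for seller, price in sales:
        totals[seller] += price
    return [
        f"<< {seller} tickit:earned {total} >> tickit:score {score} ."
        for seller, total in sorted(totals.items())
        if total >= threshold
    ]


sample_sales = [
    ("tickit:person1", 12000),
    ("tickit:person1", 9000),
    ("tickit:person2", 19999),
]
print("\n".join(score_sellers(sample_sales)))
```

With the sample data, only the first seller reaches the threshold, so a single statement is produced.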
The following example shows how to create properties and assign values based on data that exists in a source file. The data for the example is a CSV file with the following columns and data:
Airline,FlightNumber,TailNumber,OriginAirport,DestinationAirport,Distance
AS,98,N407AS,ANC,SEA,1448
AA,2336,N3KUAA,LAX,PBI,2330
US,840,N171US,SFO,CLT,2296
AA,258,N3HYAA,LAX,MIA,2342
AS,135,N527AS,SEA,ANC,1448
DL,806,N3730B,SFO,MSP,1589
NK,612,N635NK,LAS,MSP,1299
US,2013,N584UW,LAX,CLT,2125
The example INSERT query for the file above defines the Distance column as a property and adds the Distance value as the value for the property:
PREFIX s: <http://cambridgesemantics.com/ontologies/DataToolkit#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
INSERT {
  GRAPH <http://anzograph.com/flights> {
    ?OriginIRI a <Airport> .
    ?DestinationIRI a <Airport> .
    << ?OriginIRI <Destination> ?DestinationIRI >> <Distance> ?Distance .
    ?FlightIRI a <Flight> ;
      <Airline> ?Airline ;
      <FlightNumber> ?FlightNumber ;
      <TailNumber> ?TailNumber .
  }
}
WHERE {
  SERVICE <http://cambridgesemantics.com/services/DataToolkit> {
    ?data a s:FileSource ;
      s:url "/home/erin/air-lpg.csv" ;
      ?Airline (xsd:string) ;
      ?FlightNumber (xsd:string) ;
      ?TailNumber (xsd:string) ;
      ?OriginAirport (xsd:string) ;
      ?DestinationAirport (xsd:string) ;
      ?Distance (xsd:long) .
    BIND(IRI("http://anzograph.com/flights/Flight/{{?FlightNumber}}") as ?FlightIRI)
    BIND(IRI("http://anzograph.com/flights/origin/{{?OriginAirport}}") as ?OriginIRI)
    BIND(IRI("http://anzograph.com/flights/destination/{{?DestinationAirport}}") as ?DestinationIRI)
  }
}
The following query returns the origin and destination airports for the flights as well as the distance property value:
SELECT ?from ?to ?distance
FROM <http://anzograph.com/flights>
WHERE {
  << ?from ?p ?to >> ?property ?distance
}
ORDER BY DESC(?distance)

from                                    | to                                           | distance
----------------------------------------+----------------------------------------------+----------
http://anzograph.com/flights/origin/LAX | http://anzograph.com/flights/destination/MIA | 2342
http://anzograph.com/flights/origin/LAX | http://anzograph.com/flights/destination/PBI | 2330
http://anzograph.com/flights/origin/SFO | http://anzograph.com/flights/destination/CLT | 2296
http://anzograph.com/flights/origin/LAX | http://anzograph.com/flights/destination/CLT | 2125
http://anzograph.com/flights/origin/SFO | http://anzograph.com/flights/destination/MSP | 1589
http://anzograph.com/flights/origin/ANC | http://anzograph.com/flights/destination/SEA | 1448
http://anzograph.com/flights/origin/SEA | http://anzograph.com/flights/destination/ANC | 1448
http://anzograph.com/flights/origin/LAS | http://anzograph.com/flights/destination/MSP | 1299
8 rows
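To make the CSV-to-property mapping in the flights example concrete, the Python sketch below produces the equivalent RDF-star statements outside the database. It is an illustration only: `flights_to_rdf_star` is a hypothetical helper, it covers just the Destination edge and its Distance property, and it writes Distance as a plain integer literal rather than the xsd:long used in the INSERT query.

```python
import csv
import io

# Base IRI taken from the BIND expressions in the INSERT query above.
BASE = "http://anzograph.com/flights"


def flights_to_rdf_star(csv_text):
    """Return one RDF-star statement per CSV row: origin -> destination, with Distance."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        origin = f"<{BASE}/origin/{row['OriginAirport']}>"
        destination = f"<{BASE}/destination/{row['DestinationAirport']}>"
        distance = int(row["Distance"])  # fails loudly on non-numeric input
        lines.append(f"<< {origin} <Destination> {destination} >> <Distance> {distance} .")
    return lines


sample_csv = (
    "Airline,FlightNumber,TailNumber,OriginAirport,DestinationAirport,Distance\n"
    "AS,98,N407AS,ANC,SEA,1448\n"
    "AA,2336,N3KUAA,LAX,PBI,2330\n"
)
print("\n".join(flights_to_rdf_star(sample_csv)))
```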
Querying Property Graphs
To return properties and their values when analyzing data sets, include the following property graph syntax in graph and triple patterns:
<< <subject> <predicate> <object> >> <property_URI> <property_value> .
The following example query returns the properties that were defined in the INSERT DATA query above.
PREFIX tickit: <http://anzograph.com/tickit/>
SELECT *
FROM <http://anzograph.com/tickit>
WHERE {
  << tickit:person3 ?p ?likes_or_dislikes >> tickit:weight ?value .
  FILTER(?p=tickit:like || ?p=tickit:dislike)
}
ORDER BY ?p

p                                   | likes_or_dislikes | value
------------------------------------+-------------------+-------
http://anzograph.com/tickit/dislike | jazz              | 9
http://anzograph.com/tickit/dislike | opera             | 10
http://anzograph.com/tickit/dislike | theatre           | 5
http://anzograph.com/tickit/like    | musicals          | 4
http://anzograph.com/tickit/like    | rock              | 9
http://anzograph.com/tickit/like    | sports            | 8
6 rows
This example returns a list of the properties in the Tickit graph and lists the number of times each property is referenced in the graph. Note that in addition to the properties that were defined above, the results shown below also include the properties that are defined by default in the sample Tickit data set. See Working with SPARQL and the Tickit Data for instructions on loading the full data set.
SELECT ?property (COUNT(?property) AS ?times_used)
FROM <http://anzograph.com/tickit>
WHERE {
  << ?s ?p ?o >> ?property ?value
}
GROUP BY ?property
ORDER BY desc(?times_used)

property                           | times_used
-----------------------------------+------------
startDate                          | 1445832
score                              | 241949
endDate                            | 144706
http://anzograph.com/tickit/score  | 52
http://anzograph.com/tickit/weight | 6
5 rows