Create a Labeled Property Graph (RDF-star)

AnzoGraph DB supports the Labeled Property Graph (LPG) model for adding metadata about the relationships in your graphs. Properties that express values such as start and end dates, data provenance tracking, or the weight, score, or veracity of the data can be added to a graph to further define any of the relationships in the data.

AnzoGraph DB's LPG implementation follows RDF-star and SPARQL-star, the proposed extensions to the W3C RDF data model and SPARQL query language specifications. The proposal is a work in progress. The syntax described in the proposal may not be included in the final specification, and AnzoGraph DB does not support all of the examples included in the proposal at this time.

This topic provides information about loading and inserting properties and querying property graphs.

Defining Properties in Turtle Load Files

This section provides information about how to create a property graph by defining relationship properties in a Turtle load file. For instructions on creating properties in INSERT queries, see Defining Properties in INSERT Queries below.

There is a limit of 255 total property values per edge. AnzoGraph DB returns an "Element larger than allowed - too many properties" error if you attempt to load or insert more than 255 property values for the same relationship.

To define a relationship property in a Turtle file, wrap the triple in double angle brackets (<< >>), and then specify the property URI and value after the closing brackets:

<< <subject> <predicate> <object> >> <property_URI> <property_value> .

For example, the TTL file contents below include properties that further define the like, dislike, and friend relationships in the triples. The file adds a weight property to define how much person3 likes or dislikes certain types of events, and the file adds startDate and endDate properties to friend predicates to define the start and end dates of friendships.

By default, the sample Tickit data set already includes startDate and endDate properties for the friend predicates. The example below defines start and end date properties only for illustrative purposes.

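@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix tickit: <http://anzograph.com/tickit/> .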

tickit:person3
  rdf:type tickit:person ;
  tickit:card "4984932249480735"^^xsd:long ;
  tickit:birthday "1963-07-02"^^xsd:date ;
  tickit:ssn 503703220 ;
  tickit:firstname "Lars" ;
  tickit:lastname "Ratliff" ;
  tickit:city "High Point" ;
  tickit:state "NY" ;
  tickit:email "amet.faucibus.ut@condimentumegetvolutpat.ca" ;
  tickit:phone "(624) 767-2465" .
<< tickit:person3 tickit:like "sports" >> tickit:weight 8 .
<< tickit:person3 tickit:like "rock" >> tickit:weight 9 .
<< tickit:person3 tickit:like "musicals" >> tickit:weight 4 .
<< tickit:person3 tickit:dislike "theatre" >> tickit:weight 5 .
<< tickit:person3 tickit:dislike "jazz" >> tickit:weight 9 .
<< tickit:person3 tickit:dislike "opera" >> tickit:weight 10 .
<< tickit:person3 tickit:friend tickit:person8563 >> tickit:startDate "1990-01-04"^^xsd:date .
<< tickit:person3 tickit:friend tickit:person38436 >> tickit:startDate "2000-04-27"^^xsd:date .
<< tickit:person3 tickit:friend tickit:person11979 >> tickit:startDate "2004-11-09"^^xsd:date .
<< tickit:person3 tickit:friend tickit:person11979 >> tickit:endDate "2012-07-17"^^xsd:date .
tickit:person3 tickit:friend tickit:person8639,tickit:person18536,tickit:person42975,tickit:person47376,
  tickit:person1692,tickit:person2556,tickit:person11979,tickit:person20860,tickit:person21259,tickit:person26586,
  tickit:person27529,tickit:person31735,tickit:person36264,tickit:person38436,tickit:person42306,tickit:person42975 .

The example above contains both compact and long Turtle notation. When you define properties in files, each statement that carries a property must include the complete reference triple (subject, predicate, and object). Properties cannot be added to triples specified in compact notation. In addition, specify only one property per statement. To define multiple properties for the same triple, repeat the triple once for each property. For example, the following lines in the example above define two properties (startDate and endDate) for the person3 friend person11979 triple:

<< tickit:person3 tickit:friend tickit:person11979 >> tickit:startDate "2004-11-09"^^xsd:date .
<< tickit:person3 tickit:friend tickit:person11979 >> tickit:endDate "2012-07-17"^^xsd:date .
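The one-property-per-statement rule is easy to apply mechanically when a load file is generated by a script. The following is a minimal Python sketch, not part of AnzoGraph DB: the function name `annotated_triple_lines` and the constant `MAX_PROPERTIES_PER_EDGE` are hypothetical, and the arguments are assumed to be Turtle terms that are already correctly serialized. The local check only counts the values passed in one call. It does not know about property values that already exist in the database for the same edge.

```python
# Hypothetical helper: emits one RDF-star Turtle statement per property.
MAX_PROPERTIES_PER_EDGE = 255  # limit stated in this topic


def annotated_triple_lines(subject, predicate, obj, properties):
    """Return one Turtle line per (property, value) pair for a single triple.

    All arguments are assumed to be already-serialized Turtle terms
    (prefixed names, IRIs, or literals). No escaping is performed.
    """
    if len(properties) > MAX_PROPERTIES_PER_EDGE:
        # Local guard only; this is not the error AnzoGraph DB itself returns.
        raise ValueError(
            f"{len(properties)} property values exceed the limit of "
            f"{MAX_PROPERTIES_PER_EDGE} per edge"
        )
    triple = f"<< {subject} {predicate} {obj} >>"
    # The complete triple is repeated for every property.
    return [f"{triple} {prop} {value} ." for prop, value in properties]


lines = annotated_triple_lines(
    "tickit:person3",
    "tickit:friend",
    "tickit:person11979",
    [
        ("tickit:startDate", '"2004-11-09"^^xsd:date'),
        ("tickit:endDate", '"2012-07-17"^^xsd:date'),
    ],
)
print("\n".join(lines))
```

The printed lines match the two startDate and endDate statements shown above.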

Defining Properties in INSERT Queries

You can create property graphs by using INSERT and INSERT DATA queries to insert triples and properties or to add properties to existing triples. To define properties in INSERT statements, use the same syntax as in Turtle files: wrap the triple in double angle brackets (<< >>), and then specify the property URI and value for that triple after the closing brackets:

<< <subject> <predicate> <object> >> <property_URI> <property_value> .

There is a limit of 255 total property values per edge. AnzoGraph DB returns an "Element larger than allowed - too many properties" error if you attempt to load or insert more than 255 property values for the same relationship.

For example, the INSERT DATA statement below adds weight properties to the like and dislike predicates for person3. This example specifies literal values for the weight property.

PREFIX tickit: <http://anzograph.com/tickit/>
INSERT DATA { GRAPH <http://anzograph.com/tickit> {
  << tickit:person3 tickit:dislike "jazz" >> tickit:weight 9 .
  << tickit:person3 tickit:dislike "theatre" >> tickit:weight 5 .
  << tickit:person3 tickit:dislike "opera" >> tickit:weight 10 .
  << tickit:person3 tickit:like "sports" >> tickit:weight 8 .
  << tickit:person3 tickit:like "rock" >> tickit:weight 9 .
  << tickit:person3 tickit:like "musicals" >> tickit:weight 4 .
 }
}

The following example INSERT statement queries the Tickit graph to find the sellers whose total sales amount is greater than or equal to $20,000. For each seller who meets the requirement, the INSERT clause inserts an earned predicate with a property named score and a score value of 10:

PREFIX tickit: <http://anzograph.com/tickit/>
INSERT {GRAPH <http://anzograph.com/tickit> { 
  << ?person tickit:earned ?earned >> tickit:score 10
  }
}
WHERE {GRAPH <http://anzograph.com/tickit> {
  { SELECT ?person (SUM(?dollars) AS ?earned)
    WHERE { 
      ?person tickit:firstname ?first .
      ?person tickit:lastname ?last .
      ?sale tickit:sellerid ?person .
      ?sale tickit:pricepaid ?dollars .
    }
    GROUP BY ?person
  }
  FILTER(?earned >= 20000)
 }
}

Selecting the newly created triples shows that 52 people met the requirement and were assigned a score property with a value of 10:

PREFIX tickit: <http://anzograph.com/tickit/>
SELECT ?person ?earned ?score 
FROM <http://anzograph.com/tickit>
WHERE {
   << ?person tickit:earned ?earned >> tickit:score ?score .
}
ORDER BY ?person

person                                  | earned | score
----------------------------------------+--------+-------
http://anzograph.com/tickit/person11168 |  21036 |    10
http://anzograph.com/tickit/person1140  |  32399 |    10
http://anzograph.com/tickit/person12263 |  20320 |    10
http://anzograph.com/tickit/person12646 |  22194 |    10
http://anzograph.com/tickit/person13385 |  28495 |    10
http://anzograph.com/tickit/person15976 |  20929 |    10
http://anzograph.com/tickit/person16008 |  20515 |    10
http://anzograph.com/tickit/person16335 |  20160 |    10
http://anzograph.com/tickit/person18005 |  20918 |    10
http://anzograph.com/tickit/person19231 |  22636 |    10
http://anzograph.com/tickit/person19814 |  20465 |    10
http://anzograph.com/tickit/person20029 |  20103 |    10
http://anzograph.com/tickit/person23635 |  20265 |    10
http://anzograph.com/tickit/person2372  |  27159 |    10
http://anzograph.com/tickit/person24980 |  24857 |    10
http://anzograph.com/tickit/person25433 |  27653 |    10
http://anzograph.com/tickit/person26198 |  21243 |    10
...
52 rows
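The logic of the INSERT statement above (sum the sales per seller, keep the sellers at or above the threshold, then attach a score to each earned triple) can be sketched in plain Python. This is an illustration only: the function names and the sample sales records are made up for this sketch and do not come from the Tickit data set. The sketch also skips the firstname and lastname patterns that the query requires.

```python
from collections import defaultdict


def sellers_meeting_threshold(sales, threshold=20000):
    """Sum price paid per seller and keep totals >= threshold.

    `sales` is an iterable of (seller, price_paid) pairs. This mirrors the
    SUM ... GROUP BY subquery followed by the FILTER.
    """
    totals = defaultdict(int)
    for seller, price_paid in sales:
        totals[seller] += price_paid
    return {seller: total for seller, total in totals.items() if total >= threshold}


def earned_statements(qualifying, score=10):
    """Render one annotated earned triple per qualifying seller."""
    return [
        f"<< {seller} tickit:earned {total} >> tickit:score {score} ."
        for seller, total in sorted(qualifying.items())
    ]


# Made-up sample records, not Tickit data.
sample_sales = [
    ("tickit:personA", 15000),
    ("tickit:personA", 6000),
    ("tickit:personB", 19999),
]

qualifying = sellers_meeting_threshold(sample_sales)
statements = earned_statements(qualifying)
print("\n".join(statements))
```

With the sample records, only the hypothetical personA reaches the threshold, so a single annotated statement is produced.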

The following example shows how to create properties and assign values based on data that exists in a source file. The data for the example is a CSV file with the following columns and data:

Airline,FlightNumber,TailNumber,OriginAirport,DestinationAirport,Distance
AS,98,N407AS,ANC,SEA,1448
AA,2336,N3KUAA,LAX,PBI,2330
US,840,N171US,SFO,CLT,2296
AA,258,N3HYAA,LAX,MIA,2342
AS,135,N527AS,SEA,ANC,1448
DL,806,N3730B,SFO,MSP,1589
NK,612,N635NK,LAS,MSP,1299
US,2013,N584UW,LAX,CLT,2125

The example INSERT query for the file above defines the Distance column as a property and adds the Distance value as the value for the property:

PREFIX s:   <http://cambridgesemantics.com/ontologies/DataToolkit#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

INSERT { GRAPH <http://anzograph.com/flights> { 
     ?OriginIRI a <Airport> .
     ?DestinationIRI a <Airport>  .
     << ?OriginIRI <Destination> ?DestinationIRI >> <Distance> ?Distance .
     ?FlightIRI a <Flight> ;
     <Airline> ?Airline ;
     <FlightNumber> ?FlightNumber ;
     <TailNumber> ?TailNumber .
  }
}
WHERE { 
   SERVICE <http://cambridgesemantics.com/services/DataToolkit> {
     ?data a s:FileSource ;
     s:url "/home/erin/air-lpg.csv" ;
     ?Airline (xsd:string);
     ?FlightNumber (xsd:string);
     ?TailNumber (xsd:string);
     ?OriginAirport (xsd:string);
     ?DestinationAirport (xsd:string);
     ?Distance (xsd:long).
   BIND(IRI("http://anzograph.com/flights/Flight/{{?FlightNumber}}") as ?FlightIRI)
   BIND(IRI("http://anzograph.com/flights/origin/{{?OriginAirport}}") as ?OriginIRI)
   BIND(IRI("http://anzograph.com/flights/destination/{{?DestinationAirport}}") as ?DestinationIRI)
  }
}

The following query returns the origin and destination airports for the flights as well as the distance property value:

SELECT ?from ?to ?distance
FROM <http://anzograph.com/flights>
WHERE {
   << ?from ?p ?to >> ?property ?distance
}
ORDER BY DESC(?distance)

from                                    | to                                           | distance
----------------------------------------+----------------------------------------------+----------
http://anzograph.com/flights/origin/LAX | http://anzograph.com/flights/destination/MIA |     2342
http://anzograph.com/flights/origin/LAX | http://anzograph.com/flights/destination/PBI |     2330
http://anzograph.com/flights/origin/SFO | http://anzograph.com/flights/destination/CLT |     2296
http://anzograph.com/flights/origin/LAX | http://anzograph.com/flights/destination/CLT |     2125
http://anzograph.com/flights/origin/SFO | http://anzograph.com/flights/destination/MSP |     1589
http://anzograph.com/flights/origin/ANC | http://anzograph.com/flights/destination/SEA |     1448
http://anzograph.com/flights/origin/SEA | http://anzograph.com/flights/destination/ANC |     1448
http://anzograph.com/flights/origin/LAS | http://anzograph.com/flights/destination/MSP |     1299

8 rows
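To make the mapping from CSV rows to annotated triples concrete, the following Python sketch renders the route statement that the INSERT template above describes for each row. It is an illustration under stated assumptions, not how AnzoGraph DB or the Data Toolkit service processes the file: the function name `route_statements` is hypothetical, the CSV text is a two-row excerpt of the sample file, and the relative IRIs `<Destination>` and `<Distance>` are copied from the query as written.

```python
import csv
import io

# Two-row excerpt of the sample CSV shown earlier in this section.
SAMPLE_CSV = """Airline,FlightNumber,TailNumber,OriginAirport,DestinationAirport,Distance
AS,98,N407AS,ANC,SEA,1448
AA,2336,N3KUAA,LAX,PBI,2330
"""

BASE = "http://anzograph.com/flights"


def route_statements(csv_text):
    """Return one annotated origin-to-destination statement per CSV row."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Same IRI templates as the BIND clauses in the query.
        origin = f"<{BASE}/origin/{row['OriginAirport']}>"
        destination = f"<{BASE}/destination/{row['DestinationAirport']}>"
        # The query reads Distance as xsd:long.
        distance = f'"{int(row["Distance"])}"^^xsd:long'
        lines.append(
            f"<< {origin} <Destination> {destination} >> <Distance> {distance} ."
        )
    return lines


routes = route_statements(SAMPLE_CSV)
print("\n".join(routes))
```

Each row yields one edge between an origin and a destination airport, with the Distance column carried as a property of that edge rather than as a property of either airport.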

Querying Property Graphs

To return properties and their values when analyzing data sets, include the following property graph syntax in graph and triple patterns:

<< <subject> <predicate> <object> >> <property_URI> <property_value> .

The following example query returns the properties that were defined in the INSERT DATA query above.

PREFIX tickit: <http://anzograph.com/tickit/>
SELECT *
FROM <http://anzograph.com/tickit>
WHERE {
  << tickit:person3 ?p ?likes_or_dislikes >> tickit:weight ?value.
  FILTER(?p=tickit:like || ?p=tickit:dislike)
}
ORDER BY ?p

p                                   | likes_or_dislikes | value
------------------------------------+-------------------+-------
http://anzograph.com/tickit/dislike | jazz              |     9
http://anzograph.com/tickit/dislike | opera             |    10
http://anzograph.com/tickit/dislike | theatre           |     5
http://anzograph.com/tickit/like    | musicals          |     4
http://anzograph.com/tickit/like    | rock              |     9
http://anzograph.com/tickit/like    | sports            |     8
6 rows

The following example returns a list of the properties in the Tickit graph and the number of times each property is referenced in the graph. In addition to the properties that were defined above, the results shown below include the properties that are defined by default in the sample Tickit data set. See Working with SPARQL and the Tickit Data for instructions on loading the full data set.

SELECT ?property (COUNT(?property) AS ?times_used)
FROM <http://anzograph.com/tickit>
WHERE {
  << ?s ?p ?o >> ?property ?value
}
GROUP BY ?property
ORDER BY DESC(?times_used)

property                           | times_used
-----------------------------------+------------
startDate                          |    1445832
score                              |     241949
endDate                            |     144706
http://anzograph.com/tickit/score  |         52
http://anzograph.com/tickit/weight |          6
5 rows