Creating and Querying Labeled Property Graphs (RDF*)

AnzoGraph supports the Labeled Property Graph (LPG) model for adding metadata about the relationships in your graphs. Properties that express values such as start and end dates, data provenance tracking, or the weight, score, or veracity of the data can be added to a graph to further define any of the relationships in the data.

AnzoGraph's LPG implementation follows the proposed RDF* and SPARQL* extensions to the W3C RDF data model and SPARQL query language specifications. The proposal, called Foundations of an Alternative Approach to Reification in RDF, is a work in progress, and Cambridge Semantics is a contributor to the working group. The syntax described in the proposal may not be included in the final specification, and AnzoGraph does not support all of the examples included in the proposal at this time. For details, see the working draft of the RDF* and SPARQL* specification.

This topic provides information about loading and inserting properties and querying property graphs.

Defining Properties in Turtle Load Files

This section provides information about how to create a property graph by defining relationship properties in a Turtle load file. For instructions on creating properties in INSERT queries, see Defining Properties in INSERT Queries below.

There is a limit of 255 total property values per edge. AnzoGraph returns an error if you attempt to load or insert more than 255 property values for the same relationship.
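The limit above can be checked on the client side before a load or insert is attempted. The following Python sketch is illustrative only: the function name and the tuple layout are assumptions, and it counts every property statement for an edge, which may not match exactly how AnzoGraph counts values (for example, for repeated identical values).

```python
from collections import Counter

MAX_PROPERTY_VALUES_PER_EDGE = 255  # limit stated in this topic


def edges_over_limit(statements, limit=MAX_PROPERTY_VALUES_PER_EDGE):
    """Return the edges that carry more property values than the limit.

    statements is an iterable of 5-tuples:
    (subject, predicate, object, property, value).
    An edge is identified by its (subject, predicate, object) triple.
    """
    counts = Counter((s, p, o) for s, p, o, _prop, _value in statements)
    return {edge: n for edge, n in counts.items() if n > limit}


# Hypothetical data: 256 values on one edge, 1 value on another.
statements = [("<a>", "<knows>", "<b>", "<note>", str(i)) for i in range(256)]
statements.append(("<a>", "<knows>", "<c>", "<note>", "0"))
print(edges_over_limit(statements))
```

Only the first edge is reported, because it is the only one with more than 255 property values.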

To define a relationship property in a Turtle file, wrap the triple in double angle brackets (<< >>), and then specify the property URI and value after the closing brackets:

<< <subject> <predicate> <object> >> <property_URI> <property_value> .
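When load files are generated by a script, the pattern above can be produced with a small helper. This is a minimal Python sketch, not part of AnzoGraph; the function name is hypothetical, and the arguments are assumed to be Turtle terms that are already serialized and escaped.

```python
def annotated_triple(subject, predicate, obj, prop, value):
    """Return one Turtle line that attaches a property to a triple.

    Every argument is an already-serialized Turtle term, such as
    '<person3>' or '"sports"'. No escaping or validation is done here.
    """
    return f"<< {subject} {predicate} {obj} >> {prop} {value} ."


print(annotated_triple("<person3>", "<like>", '"sports"', "<weight>", "8"))
```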

For example, the TTL file contents below include properties that further define the like, dislike, and friend relationships in the triples. The file adds a weight property to define how much <person3> likes or dislikes certain types of events, and the file adds startDate and endDate properties to <friend> predicates to define the start and end dates of friendships.

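@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<person3>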
  rdf:type <person>;
  <card> "4984932249480735"^^xsd:long;
  <birthday> "1963-07-02"^^xsd:date;
  <ssn> 503703220;
  <firstname> "Lars";
  <lastname> "Ratliff";
  <city> "High Point";
  <state> "NY";
  <email> "amet.faucibus.ut@condimentumegetvolutpat.ca";
  <phone> "(624) 767-2465".
<< <person3> <like> "sports" >> <weight> 8 .
<< <person3> <like> "rock" >> <weight> 9 .
<< <person3> <like> "musicals" >> <weight> 4 .
<< <person3> <dislike> "theatre" >> <weight> 5 .
<< <person3> <dislike> "jazz" >> <weight> 9 .
<< <person3> <dislike> "opera" >> <weight> 10 .
<< <person3> <friend> <person8563> >> <startDate> "1990-01-04"^^xsd:date.
<< <person3> <friend> <person38436> >> <startDate> "2000-04-27"^^xsd:date.
<< <person3> <friend> <person11979> >> <startDate> "2004-11-09"^^xsd:date.
<< <person3> <friend> <person11979> >> <endDate> "2012-07-17"^^xsd:date.
<person3> <friend> <person8639>,<person18536>,<person42975>,<person47376>,
  <person1692>,<person2556>,<person11979>,<person20860>,<person21259>,<person26586>,
  <person27529>,<person31735>,<person36264>,<person38436>,<person42306>,<person42975>.

The example above contains both compact and long Turtle notation. When defining properties in files, each statement that defines a property must include the complete reference triple (subject, predicate, and object) inside the << >> delimiters. Properties cannot be added to triples specified in compact notation. In addition, specify one property per statement. To define multiple properties for the same triple, repeat the triple once for each property. For example, the following lines from the example above define two properties (startDate and endDate) for the person3 friend person11979 triple:

<< <person3> <friend> <person11979> >> <startDate> "2004-11-09"^^xsd:date.
<< <person3> <friend> <person11979> >> <endDate> "2012-07-17"^^xsd:date.
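The one-property-per-statement rule means that a triple with several properties expands into several lines. The Python sketch below shows that expansion; the function name and the input layout are assumptions made for illustration, and the terms are assumed to be valid Turtle already.

```python
def property_lines(subject, predicate, obj, properties):
    """Emit one annotated statement per (property, value) pair.

    The reference triple is repeated in full on every line, because
    properties cannot be attached in compact notation.
    """
    edge = f"<< {subject} {predicate} {obj} >>"
    return [f"{edge} {prop} {value} ." for prop, value in properties]


lines = property_lines(
    "<person3>", "<friend>", "<person11979>",
    [("<startDate>", '"2004-11-09"^^xsd:date'),
     ("<endDate>", '"2012-07-17"^^xsd:date')],
)
print("\n".join(lines))
```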

By default, the sample Tickit data set already includes startDate and endDate properties for the friend predicates. The example above defines start and end date properties only for illustrative purposes.

Defining Properties in INSERT Queries

Users can create property graphs using INSERT and INSERT DATA syntax to insert triples and properties or add properties to existing triples.

Creating a property graph with a CONSTRUCT query is not supported at this time.

To define properties in INSERT statements, use the same syntax as in Turtle files: wrap the triple in double angle brackets (<< >>), and then specify the property URI and value for that triple after the closing brackets:

<< <subject> <predicate> <object> >> <property_URI> <property_value> .

There is a limit of 255 total property values per edge. AnzoGraph returns an error if you attempt to load or insert more than 255 property values for the same relationship.

For example, the INSERT DATA statement below adds weight properties to the like and dislike predicates for person3. This example specifies literal values for the weight property.

INSERT DATA { GRAPH <tickit> {
  << <person3> <dislike> "jazz" >> <weight> 9 .
  << <person3> <dislike> "theatre" >> <weight> 5 .
  << <person3> <dislike> "opera" >> <weight> 10 .
  << <person3> <like> "sports" >> <weight> 8 .
  << <person3> <like> "rock" >> <weight> 9 .
  << <person3> <like> "musicals" >> <weight> 4 .
 }
}

The following example INSERT statement queries the Tickit graph to find the sellers whose total sales amount is greater than or equal to $20,000. For each seller who meets the requirement, the INSERT clause inserts an earned predicate with a property named score and a score value of 10:

INSERT { GRAPH <tickit> {
    << ?person <earned> ?earned >> <score> 10 .
  }
}
WHERE { GRAPH <tickit> {
    { SELECT ?person (SUM(?dollars) AS ?earned)
      WHERE {
        ?person <firstname> ?first .
        ?person <lastname> ?last .
        ?sale <sellerid> ?person .
        ?sale <pricepaid> ?dollars .
      }
      GROUP BY ?person
    }
    FILTER(?earned >= 20000)
  }
}
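The selection logic in this statement (sum the sales per seller, keep the sellers at or above the threshold, attach a score to the resulting earned triple) can be traced in plain Python. The sketch below uses hypothetical sales figures and a hypothetical function name. It also omits the firstname and lastname patterns, so it illustrates the logic only and is not a replacement for the query.

```python
from collections import defaultdict


def score_statements(sales, threshold=20000, score=10):
    """Build annotated statements for sellers whose total meets the threshold.

    sales is an iterable of (seller, price_paid) pairs.
    """
    totals = defaultdict(int)
    for seller, price_paid in sales:
        totals[seller] += price_paid  # SUM(?dollars) ... GROUP BY ?person
    return [
        f"<< <{seller}> <earned> {total} >> <score> {score} ."
        for seller, total in sorted(totals.items())
        if total >= threshold  # FILTER(?earned >= 20000)
    ]


# Hypothetical sales: personA totals 21000, personB totals 19999.
sales = [("personA", 12000), ("personA", 9000), ("personB", 19999)]
print(score_statements(sales))
```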

Selecting the newly created triples shows that 52 people met the requirement and were assigned a <score> property with a value of 10:

SELECT ?person ?earned ?score 
FROM <tickit>
WHERE {
   <<?person <earned> ?earned>> <score> ?score
}
ORDER BY ?person

person      | earned       | score
------------+--------------+-------
person19231 | 22636.000000 |    10
person30007 | 20521.000000 |    10
person16335 | 20160.000000 |    10
person15976 | 20929.000000 |    10
person49919 | 21218.000000 |    10
person30764 | 21014.000000 |    10
person24980 | 24857.000000 |    10
person8038  | 20015.000000 |    10
person36217 | 24269.000000 |    10
person26198 | 21243.000000 |    10
person1140  | 32399.000000 |    10
person35284 | 20131.000000 |    10
person34730 | 20448.000000 |    10
person19814 | 20465.000000 |    10
person34982 | 22262.000000 |    10
...
52 rows

The following example shows how to create properties and assign values based on data that exists in a source file. The data for the example is a CSV file with the following columns and data:

Airline,FlightNumber,TailNumber,OriginAirport,DestinationAirport,Distance
AS,98,N407AS,ANC,SEA,1448
AA,2336,N3KUAA,LAX,PBI,2330
US,840,N171US,SFO,CLT,2296
AA,258,N3HYAA,LAX,MIA,2342
AS,135,N527AS,SEA,ANC,1448
DL,806,N3730B,SFO,MSP,1589
NK,612,N635NK,LAS,MSP,1299
US,2013,N584UW,LAX,CLT,2125

The example INSERT query for the file above defines the Distance column as a property and adds the Distance value as the value for the property:

INSERT { GRAPH <flights> {
  ?OriginIRI  a <Airport> .
  ?DestinationIRI a <Airport>  .
  << ?OriginIRI <Destination> ?DestinationIRI >> <Distance> ?Distance .
  ?FlightIRI a <Flight> ;
    <Airline> ?Airline ;
    <FlightNumber> ?FlightNumber ;
    <TailNumber> ?TailNumber .
  }
}
WHERE { TABLE <file:/home/user/flights.csv>
('csv','global',',',true,'Airline:char,FlightNumber:char,TailNumber:char,OriginAirport:char,DestinationAirport:char,Distance:int')
  BIND(IRI(CONCAT("Flight",str(?FlightNumber),str(?TailNumber))) as ?FlightIRI)
  BIND(IRI(str(?OriginAirport)) as ?OriginIRI)
  BIND(IRI(str(?DestinationAirport)) as ?DestinationIRI)
}
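To see which IRIs and which annotated edge each CSV row produces, the BIND expressions can be mirrored outside the database. The Python sketch below does this for two rows copied from the sample file. The function name is hypothetical, and the relative IRI forms are an assumption based on the BIND expressions in the query.

```python
import csv
import io

# Two rows copied from the sample file above.
SAMPLE_CSV = """Airline,FlightNumber,TailNumber,OriginAirport,DestinationAirport,Distance
AS,98,N407AS,ANC,SEA,1448
NK,612,N635NK,LAS,MSP,1299
"""


def flight_terms(csv_text):
    """Return (flight_iri, annotated_edge) pairs, one per CSV row."""
    pairs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Mirrors BIND(IRI(CONCAT("Flight", FlightNumber, TailNumber))).
        flight_iri = f"<Flight{row['FlightNumber']}{row['TailNumber']}>"
        origin_iri = f"<{row['OriginAirport']}>"
        destination_iri = f"<{row['DestinationAirport']}>"
        distance = int(row["Distance"])  # Distance:int in the TABLE options
        edge = (f"<< {origin_iri} <Destination> {destination_iri} >> "
                f"<Distance> {distance} .")
        pairs.append((flight_iri, edge))
    return pairs


for flight_iri, edge in flight_terms(SAMPLE_CSV):
    print(flight_iri, edge)
```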

The following query returns the origin and destination airports for the flights as well as the distance property value:

SELECT ?from ?to ?distance
FROM <flights>
WHERE {
   << ?from ?p ?to >> ?property ?distance
}
ORDER BY ?distance

from | to  | distance
-----+-----+----------
LAS  | MSP |     1299
SEA  | ANC |     1448
ANC  | SEA |     1448
SFO  | MSP |     1589
LAX  | CLT |     2125
SFO  | CLT |     2296
LAX  | PBI |     2330
LAX  | MIA |     2342
8 rows

Querying Property Graphs

To return properties and their values when analyzing data sets, include the following property graph syntax in graph and triple patterns:

<< <subject> <predicate> <object> >> <property_URI> <property_value> .

Querying property graphs with a CONSTRUCT query is not supported at this time.

The following example query returns the properties that were defined in the INSERT DATA query above.

SELECT *
FROM <tickit>
WHERE {
  << ?person ?p ?likes_or_dislikes >> ?property ?value.
  FILTER(?p=<like> || ?p=<dislike>)
}
ORDER BY ?p

person  | p       | likes_or_dislikes | property | value
--------+---------+-------------------+----------+-------
person3 | dislike | jazz              | weight   |     9
person3 | dislike | opera             | weight   |    10
person3 | dislike | theatre           | weight   |     5
person3 | like    | rock              | weight   |     9
person3 | like    | musicals          | weight   |     4
person3 | like    | sports            | weight   |     8
6 rows
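Applications that issue this kind of query for different predicates can assemble the query text from a template. The Python sketch below builds a query with the same shape as the one above. The function name is hypothetical, and the graph and predicate arguments are assumed to be trusted, already-bracketed IRIs, because the sketch does no escaping.

```python
def property_query(graph, predicates):
    """Build a SELECT query that lists properties on the given predicates.

    graph and each predicate are IRIs written with angle brackets,
    for example '<tickit>' and '<like>'.
    """
    predicate_filter = " || ".join(f"?p={p}" for p in predicates)
    return "\n".join([
        "SELECT *",
        f"FROM {graph}",
        "WHERE {",
        "  << ?s ?p ?o >> ?property ?value .",
        f"  FILTER({predicate_filter})",
        "}",
        "ORDER BY ?p",
    ])


print(property_query("<tickit>", ["<like>", "<dislike>"]))
```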

This example returns a list of the properties in the Tickit graph and lists the number of times each property is referenced in the graph:

SELECT ?property (COUNT(?property) AS ?times_used)
FROM <tickit>
WHERE {
  << ?s ?p ?o >> ?property ?value
}
GROUP BY ?property
ORDER BY desc(?times_used)

 property  | times_used
-----------+------------
 startDate |    1729764
 endDate   |     173036
 score     |         52
 weight    |          6
4 rows