Binding and Hierarchy Concepts

The Graph Data Interface (GDI) offers multiple ways to express binding hierarchies in queries. This topic describes those options.

Using Binding Trees and Selector Paths

One way to express hierarchies in queries is to use brackets ( [ ] ) to group objects into binding trees. For example, the WHERE clause snippet below organizes mapping variable objects into an hourly/data hierarchy by nesting the ?data patterns inside the ?hourly [ ] tree:

WHERE
{
  SERVICE <http://cambridgesemantics.com/services/DataToolkit>
    {
      ?data a s:HttpSource;
        s:url "https://sampleEndpoint.com/forecast/" ;
        ?latitude (xsd:double) ;
        ?longitude (xsd:double) ;
        ?timezone (xsd:string) ;
        ?hourly [
          ?data [
            ?time (xsd:long) ;
            ?summary (xsd:string) ;
            ?rainIntensity ("precipIntensity" xsd:double) ;
            ?rainProbability ("precipProbability" xsd:double) ;
            ?temperature (xsd:double) ;
            ?feelsLike ("apparentTemperature" xsd:double) ;
            ?humidity (xsd:double) ;
            ?pressure (xsd:double) ;
            ?windSpeed (xsd:double) ;
          ] ;
        ] .
  }
}

When constructing object binding trees, if you choose to introduce the hierarchy with a variable name that is not an exact match to the source label, include a selector property to list the value from the source. For example, in the WHERE clause snippet below, s:selector is included to select eventHeader in the source as ?event in the query and statLocation as ?location.

WHERE 
{
   SERVICE <http://cambridgesemantics.com/services/DataToolkit>
  {
      ?data a s:FileSource ;
      s:url "/mnt/data/json/part_1.json" ;
      ?event [
         s:selector "eventHeader" ;
           ?eventId (xsd:string) ;
           ?eventName (xsd:string) ;
           ?eventVersion (xsd:string) ;
           ?eventTime (xsd:dateTime) ;
      ] ;
      ?location [
         s:selector "statLocation" ;
           ?locationId (xsd:string) ;
           ?lineNo (xsd:int) ;
           ?statNo (xsd:int) ;
           ?statId (xsd:int) ;
      ] .
  }
}

As an alternative to grouping objects in binding trees, the selector property also supports using dot notation to specify paths. For example, the WHERE clause snippet below rewrites the first example query to express the same hourly/data hierarchy as a path in the s:selector value:

WHERE
{
  SERVICE <http://cambridgesemantics.com/services/DataToolkit>
    {
      ?data a s:HttpSource;
         s:url "https://sampleEndpoint.com/forecast/" ;
         ?latitude (xsd:double) ;
         ?longitude (xsd:double) ;
         ?timezone (xsd:string) ;
         s:selector "hourly.data" ;
         ?time (xsd:long) ;
         ?summary (xsd:string) ;
         ?rainIntensity ("precipIntensity" xsd:double) ;
         ?rainProbability ("precipProbability" xsd:double) ;
         ?temperature (xsd:double) ;
         ?feelsLike ("apparentTemperature" xsd:double) ;
         ?humidity (xsd:double) ;
         ?pressure (xsd:double) ;
         ?windSpeed (xsd:double) .
  }
}

You can also include the $ character to anchor the selector at the root of the file. For example, s:selector "data" captures all data elements anywhere in the file, whereas s:selector "$.data" captures only the data elements that are at the root of the hierarchy.
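
The difference between an unanchored and an anchored selector can be illustrated outside of the GDI with a few lines of Python. This is only a sketch of the behavior described above, not the GDI's implementation. The sample document and the helper names select_anywhere and select_at_root are hypothetical.

```python
# Illustrative only: mimics the described difference between
# s:selector "data" and s:selector "$.data" on an in-memory JSON document.
# The document and both helper functions are assumptions made for this sketch.

def select_anywhere(node, key):
    """Collect every value stored under `key` at any depth (unanchored)."""
    found = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                found.append(value)
            # Keep descending so nested matches are found as well.
            found.extend(select_anywhere(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(select_anywhere(item, key))
    return found


def select_at_root(doc, key):
    """Collect only the value stored under `key` at the root (anchored with $)."""
    return [doc[key]] if isinstance(doc, dict) and key in doc else []


doc = {
    "data": "root-level",
    "hourly": {"data": "nested"},
}

anywhere = select_anywhere(doc, "data")   # matches both "data" elements
at_root = select_at_root(doc, "data")     # matches only the root "data" element
print(anywhere)
print(at_root)
```

In this sketch the unanchored lookup returns both data elements, while the anchored lookup returns only the one at the root.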

Unpacking JSON with Bindings and Arrays

In addition to object binding trees and selectors, the GDI offers additional syntax for reading or ingesting JSON sources with nested objects and arrays. For example, the JSON sample file below is followed by a query that captures each value in its arrays:

{
   "payload" :
   {
      "IBP_IndEvent_MSR" :
      {
         "unit" : "ms",
         "value" : [ 0, 1 ]
      },
      "IBP_IndEvent_RMF" :
      {
         "unit" : "-",
         "value" : [ 0.012, 1.398, 3.1415 ]
      }
   }
}

To read the JSON file above, the following query uses an object binding (?values [ ]) to drill down to the value arrays in the source. An @ selector is specified in the ?value variable binding (?value ("@" xsd:double)) to retrieve each of the array values. For an array of primitive values, the @ selector captures each value in the array. If the source value were an array of objects, the @ selector would retrieve a JSON representation of each object in the array. In addition to creating a new binding context for the primitive array values, the ?values object binding includes ?index ("!array::index") to capture the array index of each primitive value.

PREFIX s:   <http://cambridgesemantics.com/ontologies/DataToolkit#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT *
WHERE {
   SERVICE <http://cambridgesemantics.com/services/DataToolkit> {
      ?data a s:FileSource ;
      s:url "/mnt/data/json/array-index.json" ;
      s:selector "payload.*" ;
      ?unit (xsd:string) ;
      ?values [
         s:selector "value" ;
         ?value ("@" xsd:double) ;
         ?index ("!array::index") ;
      ] .
  }
}

The results of the query are shown below:

unit | value  | index
-----+--------+-------
ms   |      0 |     0
ms   |      1 |     1
-    |  0.012 |     0
-    |  1.398 |     1
-    | 3.1415 |     2
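
To make the shape of these results easier to follow, the unpacking can be sketched in plain Python. This is an illustration of the row-per-array-element behavior shown in the table, not a description of how the GDI works internally. The unpack function is a hypothetical helper.

```python
import json

# The sample JSON file from above, inlined so the sketch is self-contained.
SAMPLE = """
{
   "payload" : {
      "IBP_IndEvent_MSR" : { "unit" : "ms", "value" : [ 0, 1 ] },
      "IBP_IndEvent_RMF" : { "unit" : "-",  "value" : [ 0.012, 1.398, 3.1415 ] }
   }
}
"""


def unpack(doc):
    """Return one (unit, value, index) row per element of each value array."""
    rows = []
    # Corresponds to s:selector "payload.*": visit each object under payload.
    for entry in doc["payload"].values():
        unit = entry["unit"]
        # Corresponds to the "value" selector with "@" and "!array::index":
        # one row per array element, paired with its position in the array.
        for index, value in enumerate(entry["value"]):
            rows.append((unit, float(value), index))
    return rows


rows = unpack(json.loads(SAMPLE))
for row in rows:
    print(row)
```

The sketch produces five rows, one for each array element, with the unit repeated on every row that comes from the same parent object.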

If you do not want to retrieve all of the values in an array, you can specify the index of the value to retrieve instead of using the @ symbol. In the variable binding, the index number is appended in brackets ([ ]) to the binding column name. Indexes are zero-based. For example, the following variable binding retrieves the value at index 2 (the third value in the array) from a "projects" array: ?project ("projects[2]"). The next example uses the following JSON file:

{
   "field1" : "value1" ,
   "arrayfield" : [
        "arrayvalue1",
        "arrayvalue2"
   ]
}

To retrieve only the second value in the array, the following query appends the index value 1 to the array column name, arrayfield:

PREFIX s:   <http://cambridgesemantics.com/ontologies/DataToolkit#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT *
WHERE {
   SERVICE <http://cambridgesemantics.com/services/DataToolkit> {
       ?json a s:FileSource ;
       s:url "/mnt/data/json/array-index-2.json" ;
       ?field1 (xsd:string) ;
       ?arrayval ("arrayfield[1]" xsd:string) .
  }
}

The results of the query are shown below:

field1   | arrayval
---------+----------
value1   | arrayvalue2
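
The zero-based lookup can be sketched in Python as follows. This is a minimal illustration of how a binding path such as "arrayfield[1]" maps to a single array element. The resolve helper and its regular expression are assumptions made for this sketch and cover only the simple name and name[index] forms.

```python
import json
import re


def resolve(doc, path):
    """Resolve 'name' or 'name[i]' against a JSON object (hypothetical helper)."""
    match = re.fullmatch(r"(\w+)(?:\[(\d+)\])?", path)
    if match is None:
        raise ValueError(f"unsupported binding path: {path!r}")
    name, index = match.groups()
    value = doc[name]
    # Without an index, return the value as-is; with one, pick that element.
    return value if index is None else value[int(index)]


# The sample JSON file from above, inlined so the sketch is self-contained.
doc = json.loads('{"field1": "value1", "arrayfield": ["arrayvalue1", "arrayvalue2"]}')

field1 = resolve(doc, "field1")
arrayval = resolve(doc, "arrayfield[1]")  # index 1 is the second element
print(field1, arrayval)
```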

Returning Hierarchies as JSON Strings

When working with schema-less sources, you can also capture a tree of data as a JSON string. For example, the query snippet below targets an HTTP endpoint. In this case, the properties under the hourly class of data are unknown, so the query binds all of the data below hourly to the ?hourly variable by including empty parentheses. As a result, the GDI returns a JSON string representation of all of the properties and instance data under hourly:

WHERE
{
  SERVICE <http://cambridgesemantics.com/services/DataToolkit>
    {
      ?data a s:HttpSource;
        s:url "https://sampleEndpoint.com/forecast/" ;
        ?latitude (xsd:double) ;
        ?longitude (xsd:double) ;
        ?timezone (xsd:string) ;
        ?hourly () .
   }
}  

The results look like this:

 latitude  | longitude  | timezone        | hourly                                     
-----------+------------+-----------------+----------------------------
30.374563  | -97.975892 | America/Chicago | {"summary":"\"Humid and partly cloudy
throughout the day.\"","icon":"\"partly-cloudy-day\"","data":[{"time":"1595559600",
"summary":"\"Clear\"","icon":"\"clear-night\"","precipIntensity":"0",
"precipProbability":"0","temperature":"88.39","apparentTemperature":"91.72",
"dewPoint":"67.42","humidity":"0.5","pressure":"1011.7","windSpeed":"7.48",
"windGust":"16.71","windBearing":"109","cloudCover":"0.06","uvIndex":"0",
"visibility":"10","ozone":"285.2"},{"time":"1595563200","summary":"\"Clear\"",
"icon":"\"clear-night\"","precipIntensity":"2.0E-4","precipProbability":"0.01",
"precipType":"\"rain\"","temperature":"86.69","apparentTemperature":"90.1",
"dewPoint":"67.84","humidity":"0.54","pressure":"1012","windSpeed":"7.05",
"windGust":"17.56","windBearing":"110","cloudCover":"0.12","uvIndex":"0",
"visibility":"10","ozone":"284.9"},...

Similar to the example above, you can write a query that specifically captures some of the properties in a hierarchy and then returns the rest of the properties and their values as a JSON string representation. To do so, use "@" as the binding path. For example:

WHERE
{
  SERVICE <http://cambridgesemantics.com/services/DataToolkit>
    {
      ?data a s:HttpSource;
      s:url "https://api.darksky.net/forecast/bdbe3f638eb908c9b94919537dad5945/30.374563,-97.975892" ;
      ?latitude (xsd:double) ;
      ?longitude (xsd:double) ;
      ?timezone (xsd:string) ;
      ?hourly [
        s:selector "hourly.data" ;
        ?time (xsd:long) ;
        ?summary (xsd:string) ;
        ?hourly_data ("@") ;
      ] .
   }
}

Sample results are shown below:

 latitude  | longitude  | timezone        | time       | summary          | hourly_data  
-----------+------------+-----------------+------------+------------------+---------------
 30.374563 | -97.975892 | America/Chicago | 1595559600 | Clear            | {"time":"1595559600","summary":"\"Clear\"",
"icon":"\"clear-night\"","precipIntensity":"0","precipProbability":"0","temperature":"88.39",
"apparentTemperature":"91.72","dewPoint":"67.42","humidity":"0.5","pressure":"1011.7","windSpeed":"7.48",
"windGust":"16.71","windBearing":"109","cloudCover":"0.06","uvIndex":"0","visibility":"10","ozone":"285.2"}
                                       
 30.374563 | -97.975892 | America/Chicago | 1595563200 | Clear            | {"time":"1595563200","summary":"\"Clear\"",
"icon":"\"clear-night\"","precipIntensity":"2.0E-4","precipProbability":"0.01","precipType":"\"rain\"","temperature":"86.69",
"apparentTemperature":"90.1","dewPoint":"67.84","humidity":"0.54","pressure":"1012","windSpeed":"7.05","windGust":"17.56",
"windBearing":"110","cloudCover":"0.12","uvIndex":"0","visibility":"10","ozone":"284.9"}
          
 30.374563 | -97.975892 | America/Chicago | 1595566800 | Partly Cloudy    | {"time":"1595566800","summary":"\"Partly Cloudy\"",
"icon":"\"partly-cloudy-night\"","precipIntensity":"3.0E-4","precipProbability":"0.01",
"precipType":"\"rain\"","temperature":"85.63","apparentTemperature":"89.21",
"dewPoint":"68.33","humidity":"0.56","pressure":"1012.6","windSpeed":"6.48","windGust":"17.92","windBearing":"110",
"cloudCover":"0.34","uvIndex":"0","visibility":"10","ozone":"284.5"}
...
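
The pattern of binding a few known properties and keeping the whole element as a JSON string can be sketched in Python. This is an illustration only. The trimmed sample elements are made up for the sketch, and json.dumps stands in for the GDI's own serialization, which formats values differently (as the sample results above show).

```python
import json

# Hypothetical, trimmed stand-in for the elements of hourly.data.
hourly_data = [
    {"time": 1595559600, "summary": "Clear", "temperature": 88.39},
    {"time": 1595563200, "summary": "Clear", "temperature": 86.69},
]

rows = []
for element in hourly_data:
    rows.append({
        # Properties that are bound explicitly become their own columns.
        "time": element["time"],
        "summary": element["summary"],
        # "@" refers to the current element itself; here it is kept whole
        # as a JSON string so that no unlisted property is lost.
        "hourly_data": json.dumps(element),
    })

for row in rows:
    print(row["time"], row["summary"], row["hourly_data"])
```

Because the serialized element still contains every property, the unlisted values (such as temperature in this sketch) can be parsed out later if they turn out to be needed.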