SPARQL Query Templates
This topic provides templates that you can use as a starting point for writing SPARQL queries. The templates are based on the best practices described in SPARQL Best Practices.
- Basic Data Selection
- Graph Traversal Data Selection
- Text Cleanup with REGEX
- Data Aggregation
- Applying a Filter to Selected Data
- Creating or Deriving New Variables
Basic Data Selection
The most fundamental use case for a SPARQL query is selecting property values from a collection of instances. The following template and example query illustrate how to access a class in a model and return the properties on that class using their URIs.
Abstracted Query Template
Replace the placeholder prefix, class, property, and variable names to adapt the query.
PREFIX uriRoot: <http://example.com/rootOfUris#>

# select the variables that are populated in the WHERE clause
SELECT ?var1 ?var2
WHERE {
  ?instanceOfClass a uriRoot:ClassName ;
    uriRoot:varName1 ?var1 ;
    # use a prefix to abbreviate a property URI as shown above
    # or use the full URI as shown below
    <http://example.com/rootOfUris#varName2> ?var2 .
}
Example: Get the sample ID and anatomical location for each sample
PREFIX bm: <http://identifiers.csi.com/pharmakg/def/biomarker#>

SELECT ?sampleId ?anatomicalLocation
WHERE {
  ?sample a bm:Sample ;
    bm:sampleId ?sampleId ;
    <http://identifiers.csi.com/pharmakg/def/biomarker#fmi_anatomicalLocation> ?anatomicalLocation .
}
Graph Traversal Data Selection
The graph model makes it possible to combine data from different classes. The following template illustrates how to traverse between classes in the data model and access data from properties on multiple classes.
Abstracted Query Template
Replace the placeholder prefix, class, property, and variable names to adapt the query.
PREFIX uriRoot: <http://example.com/rootOfUris#>

# select the variables that are populated in the WHERE clause
SELECT ?var1 ?var2 ?varFromOtherClass
WHERE {
  ?instanceOfClass a uriRoot:ClassName ;
    uriRoot:varName1 ?var1 ;
    # use a prefix to abbreviate a property URI as shown above
    # or use the full URI as shown below
    <http://example.com/rootOfUris#varName2> ?var2 ;
    # getting data from other classes requires traversing per the model
    uriRoot:pointerToOtherClass ?instanceOfOtherClass .
  ?instanceOfOtherClass a uriRoot:OtherClassName ;
    uriRoot:varName3 ?varFromOtherClass .
}
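As an illustration of the traversal template, the following sketch extends the sample query from Basic Data Selection. Only bm:Sample and bm:sampleId come from that example. The bm:subject property, the bm:Subject class, and the bm:subjectId property are hypothetical names used to show the pattern; substitute the pointer property and target class defined in your own model.

```sparql
PREFIX bm: <http://identifiers.csi.com/pharmakg/def/biomarker#>

SELECT ?sampleId ?subjectId
WHERE {
  ?sample a bm:Sample ;
    bm:sampleId ?sampleId ;
    # hypothetical object property that points from Sample to Subject
    bm:subject ?subject .
  # hypothetical class and data property on the other side of the traversal
  ?subject a bm:Subject ;
    bm:subjectId ?subjectId .
}
```

Because ?subject appears in both groups of triple patterns, it acts as the join between the two classes.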
Text Cleanup with REGEX
Once data is onboarded to Anzo, it is common to encounter strings that include issues such as unintended characters, missing spaces, and inconsistent formatting. You can use regular expressions in a data layer query to manipulate those values so that they are consistent and readable in analytics against the graphmart.
The BIND clause in the query below removes every character that is neither alphanumeric nor white space, converts the remaining characters to upper case, and then trims white space from the start and end of the string. Replace the placeholder class and property names as needed:
PREFIX : <http://csi.com/>

DELETE {
  GRAPH ${targetGraph} {
    ?s ?pred ?old_val
  }
}
INSERT {
  GRAPH ${targetGraph} {
    ?s ?pred ?new_val
  }
}
${usingSources}
WHERE {
  ?s a :Class ;
    ?pred ?old_val .
  VALUES (?pred) { (:property) }
  # strip disallowed characters, convert to upper case, then trim the ends
  BIND(TRIM(UPPER(REPLACE(?old_val, "[^a-zA-Z0-9[:space:]]", ""))) AS ?new_val)
}
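Before running an update of this kind, it can help to preview which values would change. The following is a minimal sketch of such a preview, not part of the template above. It assumes the same placeholder :Class and :property names, and it uses only standard SPARQL 1.1 functions, so it omits TRIM and matches a literal space rather than a white space character class. Adjust the expression to match the one used in your update query.

```sparql
PREFIX : <http://csi.com/>

# Preview only: lists each original value next to its cleaned form
SELECT ?s ?old_val ?new_val
WHERE {
  ?s a :Class ;
    :property ?old_val .
  # remove everything except letters, digits, and spaces, then upper-case
  BIND(UPPER(REPLACE(STR(?old_val), "[^a-zA-Z0-9 ]", "")) AS ?new_val)
  # keep only the rows where the cleanup would change the value
  FILTER(STR(?old_val) != ?new_val)
}
LIMIT 100
```

The LIMIT value is arbitrary and only keeps the preview small.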
Data Aggregation
Grouping selected data around a central property yields a more complete representation or summary of the data available. The following template illustrates how to use one property as a pivot point for collecting all the values of another property.
Abstracted Query Template
Replace the placeholder prefix, class, property, and variable names to adapt the query.
PREFIX pref: <http://example.com/rootOfUris#>

SELECT
  # data can be aggregated to yield counts, concatenations of data, etc.
  ?instanceId
  (GROUP_CONCAT(DISTINCT ?instanceDetail) AS ?instanceDetails)
WHERE {
  # apply selection/filtering logic to narrow the aggregation
  # or get summaries of total data by applying only simple restrictions
  ?instance a pref:Class ;
    pref:instanceId ?instanceId ;
    pref:instanceDetail ?instanceDetail .
}
# all non-aggregated variables must be grouped in GROUP BY
GROUP BY ?instanceId
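As a concrete sketch of the aggregation pattern, the query below reuses the class and properties from the Basic Data Selection example and pivots on anatomical location. It assumes that the full property URI in that example can be abbreviated as bm:fmi_anatomicalLocation. The separator and the sort order are illustrative choices.

```sparql
PREFIX bm: <http://identifiers.csi.com/pharmakg/def/biomarker#>

SELECT ?anatomicalLocation
       (COUNT(DISTINCT ?sample) AS ?sampleCount)
       (GROUP_CONCAT(DISTINCT ?sampleId; SEPARATOR=", ") AS ?sampleIds)
WHERE {
  ?sample a bm:Sample ;
    bm:sampleId ?sampleId ;
    bm:fmi_anatomicalLocation ?anatomicalLocation .
}
# one result row per anatomical location
GROUP BY ?anatomicalLocation
ORDER BY DESC(?sampleCount)
```

Each aggregate expression is wrapped in parentheses together with its AS alias, which is the form that SPARQL 1.1 requires in a SELECT clause.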
Applying a Filter to Selected Data
Filtering query results lets you focus on specific aspects of the data. The following template illustrates how to restrict the result set by applying a filter to a variable.
Abstracted Query Template
Replace the placeholder prefix, class, property, and variable names to adapt the query.
PREFIX pref1: <http://example.com/rootOfUris1#>
PREFIX pref2: <http://example.com/rootOfUris2#>

SELECT ?varFromClass1 ?varFromClass2 ?filteredVar
WHERE {
  ?instance1 a pref1:Class1 ;
    pref1:varName1 ?varFromClass1 ;
    # the path on the model points from Class1 to Class2
    pref1:pointerToClass2 ?instance2 .
  ?instance2 a pref1:Class2 ;
    pref1:varName2 ?varFromClass2 .
  # models with different prefixes can still be joined
  ?instance3 a pref2:Class3 ;
    # the path on the model points from Class3 to Class2
    pref2:pointerToClass2 ?instance2 ;
    pref2:filteredVarName ?filteredVar .
  # filters use comparisons to scope the selected data
  # they can use existence checks or other boolean expressions as well
  FILTER(?filteredVar = 'COMPAREDDATA')
}
For optimal query performance, replace FILTER clauses with VALUES clauses or triple patterns where possible. See Replace FILTER with VALUES or Triple Patterns for more information.
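As a rough sketch of that rewrite, the equality filter in the template can be expressed with a VALUES clause that restricts ?filteredVar directly. This assumes that the stored values are plain strings that match 'COMPAREDDATA' exactly; VALUES matches RDF terms exactly, so a value with a different datatype or language tag would not match.

```sparql
PREFIX pref1: <http://example.com/rootOfUris1#>
PREFIX pref2: <http://example.com/rootOfUris2#>

SELECT ?varFromClass1 ?varFromClass2 ?filteredVar
WHERE {
  # VALUES restricts ?filteredVar up front instead of filtering afterwards
  VALUES ?filteredVar { 'COMPAREDDATA' }
  ?instance3 a pref2:Class3 ;
    pref2:filteredVarName ?filteredVar ;
    pref2:pointerToClass2 ?instance2 .
  ?instance2 a pref1:Class2 ;
    pref1:varName2 ?varFromClass2 .
  ?instance1 a pref1:Class1 ;
    pref1:varName1 ?varFromClass1 ;
    pref1:pointerToClass2 ?instance2 .
}
```

When only one value is compared, the literal can also be written directly in the triple pattern, for example pref2:filteredVarName 'COMPAREDDATA'.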
Creating or Deriving New Variables
Storing intermediate or derived data within a query enables a single query to answer more complex questions. The following template illustrates how to bind a derived value to a variable. That variable is then available for selection or further manipulation.
Abstracted Query Template
Replace the placeholder prefix, class, property, and variable names to adapt the query.
PREFIX pref1: <http://example.com/rootOfUris1#>
PREFIX pref2: <http://example.com/rootOfUris2#>
PREFIX pref3: <http://example.com/rootOfUris3#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?var1 ?filterVar ?var2AndVar3
WHERE {
  ?instance1 a pref1:Class1 ;
    pref1:varName1 ?var1 .
  ?filterInstance a pref2:MedicalHistory ;
    pref2:filterVarName ?filterVar ;
    # multiple traversals between classes may be necessary to link appropriate data
    pref2:pointerToIntermediateClass ?intermediateInstance .
  ?intermediateInstance a pref2:IntermediateClass ;
    pref2:pointerToClass1 ?instance1 .
  ?instance2 a pref3:Class2 ;
    # forward traversals tend to be more performant
    # it is still possible to identify a later class and do a backward traversal
    pref3:pointerToClass1 ?instance1 ;
    pref3:varName2 ?var2 .
  ?instance3 a pref3:Class3 ;
    pref3:pointerToClass2 ?instance2 ;
    pref3:varName3 ?var3 .
  # filters can be executed on various data types
  FILTER(?filterVar < "filterData"^^xsd:filterDataType)
  # binding allows population of new/derived variables
  BIND(CONCAT(?var2, "--", ?var3) AS ?var2AndVar3)
}
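To make the typed filter and the derived variable more concrete, the following sketch fills in the placeholders from the template for one possible case. It assumes that pref2:filterVarName holds xsd:dateTime values; the cutoff date and the ?yearOfFilterVar variable are illustrative and do not come from the template.

```sparql
PREFIX pref2: <http://example.com/rootOfUris2#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?filterVar ?yearOfFilterVar
WHERE {
  ?filterInstance a pref2:MedicalHistory ;
    pref2:filterVarName ?filterVar .
  # assumes ?filterVar is an xsd:dateTime; the cutoff value is arbitrary
  FILTER(?filterVar < "2020-01-01T00:00:00Z"^^xsd:dateTime)
  # derive a new variable from the filtered value
  BIND(YEAR(?filterVar) AS ?yearOfFilterVar)
}
```

The typed literal in the FILTER needs the xsd prefix to be declared, and the derived ?yearOfFilterVar can be selected or used in later expressions like any other variable.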