Configuring a Query Step
This topic provides guidance on configuring a Query Step that you can use for creating, cleaning, conforming, or transforming the data in a Data Layer. The sections below describe each of the tabs and configuration options that are available when you create or edit a Query Step.
Details
The Details tab includes options such as the name of the Step, the source data to act upon, and the Data Model to use with the query.
Title
The required name of the Step.
Description
An optional description of the Step.
Enabled
When creating a new Step, the Enabled option is selected by default, indicating that the Step is enabled and will run when the Data Layer is loaded or refreshed. If you want to disable the Step so that it is not processed, clear the Enabled checkbox.
Source
The Source is the source data that this Step should act upon. Steps can build upon the data generated by Steps in other Data Layers or can be self-contained, applying changes that relate only to the data defined in the Layer that contains this Step. You can select any number of the following options:
- Self: This option is selected by default and means that the Query runs against only the data that is generated in the Layer this Step belongs to.
- All Previous Layers Within Graphmart: This option means that the Query runs against the data that is generated by all of the successful Layers in the Graphmart that precede the Layer this Step is in. Any failed Layers are ignored.
- Previous Layer Within Graphmart: This option means that the Query runs against only the data that is generated by the one Layer that precedes the Layer this Step is in.
- Layer Name: The Source drop-down list also includes options for specific Layer names. You can choose a specific Layer to run the Query against that Layer only.
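The Source options above can be thought of as a rule for picking which Layers a Query reads from. The following Python sketch models that rule for illustration only. The `Layer` class and `resolve_sources` function are hypothetical and are not part of Anzo, and the sketch does not cover the case where the immediately preceding Layer has failed, which this topic does not describe.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Layer:
    """Hypothetical stand-in for a Data Layer in a Graphmart."""
    name: str
    succeeded: bool = True


def resolve_sources(layers: List[Layer], current: str, selections: Iterable[str]) -> List[str]:
    """Return the names of the Layers a Step would read from, in Graphmart order.

    layers     -- all Layers in the Graphmart, in load order
    current    -- name of the Layer that contains the Step
    selections -- the Source options that were selected (any number)
    """
    names = [layer.name for layer in layers]
    idx = names.index(current)
    selected = set()
    for option in selections:
        if option == "Self":
            selected.add(current)
        elif option == "All Previous Layers Within Graphmart":
            # Failed Layers are ignored for this option.
            selected.update(l.name for l in layers[:idx] if l.succeeded)
        elif option == "Previous Layer Within Graphmart":
            if idx > 0:
                selected.add(names[idx - 1])
        elif option in names:
            # Any other value is treated as a specific Layer name.
            selected.add(option)
    return [name for name in names if name in selected]


layers = [Layer("A"), Layer("B", succeeded=False), Layer("C"), Layer("D")]
print(resolve_sources(layers, "D", ["Self", "All Previous Layers Within Graphmart"]))
```

Because any number of options can be selected, the sketch returns the union of the Layers that each option picks.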
Data Models
This required field specifies the Data Model or Models that you want to create this query template against. The list displays all of the Models for all of the Datasets in the Dataset catalog. By default, the field is set to Exclude System Data. If you want to choose a system Model, click the toggle button on the right side of the field to change it to Include System Data. The Data Models drop-down list will then display the system Models in addition to the Dataset Models.
Pre-Run Generate Statistics
This option controls whether to initiate AnzoGraph's internal statistics gathering queries before running this Step. The statistics gathering helps ensure that the AnzoGraph query planner generates ideal query execution plans for queries that are run against the Graphmart.
Query
The Query tab contains the query that you want this Step to run.
The template includes the syntax for writing SPARQL INSERT and DELETE queries and includes the target and source graph parameters (${targetGraph} and ${usingSources}). Anzo replaces the parameters with the appropriate URIs when the Step runs. Edit the template as needed. You can click the Preview in Query Builder button to open the query in the Query Builder, where you can perform practice runs to see results without having to refresh the Graphmart or Data Layer.
See SPARQL Query Templates and Best Practices for guidance on writing SPARQL queries. For information about the SPARQL syntax for INSERT and DELETE queries, see SPARQL 1.1 Update Language in the W3C SPARQL 1.1 Update specification.
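To make the parameter replacement concrete, the following Python sketch fills in a small INSERT query that uses ${targetGraph} and ${usingSources}. It is an illustration, not Anzo's implementation. The `ex:` vocabulary, the graph URIs, and the `render_query` helper are hypothetical, and the sketch assumes that ${targetGraph} becomes a single graph URI and that ${usingSources} becomes one USING clause per source graph.

```python
from string import Template
from typing import List

# A small query in the shape of the Step template. The ex: properties are
# made up for this example.
QUERY_TEMPLATE = Template("""\
PREFIX ex: <http://example.org/>

INSERT {
  GRAPH ${targetGraph} {
    ?person ex:fullName ?fullName .
  }
}
${usingSources}
WHERE {
  ?person ex:firstName ?first ;
          ex:lastName ?last .
  BIND(CONCAT(?first, " ", ?last) AS ?fullName)
}
""")


def render_query(target_graph: str, source_graphs: List[str]) -> str:
    """Substitute hypothetical graph URIs for the two template parameters."""
    # Assumption: each source graph contributes one USING clause.
    using = "\n".join(f"USING <{graph}>" for graph in source_graphs)
    return QUERY_TEMPLATE.substitute(
        targetGraph=f"<{target_graph}>",
        usingSources=using,
    )


print(render_query(
    "http://example.org/layer/target",
    ["http://example.org/layer/a", "http://example.org/layer/b"],
))
```

In a real Step you leave the parameters in the query as written. The sketch only shows what the query could look like once the URIs are in place.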
Execution Condition
If you want this Step to be executed conditionally, based on the result of a specified Validation Condition, you can configure an Execution Condition on the Execution Condition tab that is available when creating or editing a Step. The image below shows the Execution Condition tab.
In order to set up an Execution Condition, the Graphmart needs to have at least one Validation Step that defines a Condition Variable. Condition Variables can be used across all Data Layers in the Graphmart. For guidance on configuring a Validation Step, see Configuring a Validation Step.
Enable Layer Based on Boolean Condition
This setting indicates whether to enable this Step only if the returned value from the Validation Condition is either true or false. You specify true or false in the Conditional Variable If Result field. If the Validation Condition fails, the Step is disabled.
Conditional Variable
This field specifies the variable that you want to base this Execution Condition on. The variable is the result of a Validation Step Query in the Graphmart.
Conditional Variable If Result
If you enabled the Enable Layer Based on Boolean Condition setting, select true or false from the drop-down list. The Step will be enabled only if the result of the Validation Step Query matches the value that you specified. If Enable Layer Based on Boolean Condition is disabled, leave this field blank.
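The decision described by these settings can be summarized in a few lines of Python. This is a sketch of the documented behavior when Enable Layer Based on Boolean Condition is selected. The function name `step_is_enabled` is hypothetical, and representing a failed Validation Condition as `None` is an assumption made for the example.

```python
from typing import Optional


def step_is_enabled(validation_result: Optional[bool], if_result: bool) -> bool:
    """Decide whether a Step with a boolean Execution Condition should run.

    validation_result -- value returned by the Validation Step Query,
                         or None if the Validation Condition failed (assumption)
    if_result         -- the value chosen in Conditional Variable If Result
    """
    if validation_result is None:
        # A failed Validation Condition disables the Step.
        return False
    # The Step runs only when the result matches the configured value.
    return validation_result == if_result


print(step_is_enabled(True, True))    # result matches the configured value
print(step_is_enabled(True, False))   # result does not match
print(step_is_enabled(None, True))    # Validation Condition failed
```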
Context
When you use the Graph Data Interface (GDI) for Data Virtualization, you may connect to Data Sources that require input of sensitive connection and authorization information such as keys, tokens, and user credentials. The Context tab gives you the option to configure a Context to store the sensitive information as key-value pairs. Queries can then reference the keys from the Context so that the sensitive details are abstracted from any requests that are sent to the Data Source.
Context Providers
Connections in Anzo implement the Context Provider interface. For example, File Store connections, Anzo Data Store connections, and Data Source connections provide contexts (in the form of JSON objects) that contain key-value pairs which define connection details such as URLs, database names, usernames and passwords, tokens, etc. These contexts are passed to the Data Source when a request is made against that source. To use one of the Anzo-generated Context Providers that was created for a pre-existing connection, select that provider from the drop-down list.
If you specify a Context Provider, the key-value pairs from the selected provider are not populated in the Context Key list on the screen. However, the keys are used automatically when a query is run against that provider.
Context Keys
Context Keys are user-defined key-value pairs that are not associated with a particular Context Provider. To add a key and define its value, click the Add Key button. Then specify the Key Name and Key Value in the Create Context Key dialog box. Click Create to add the key-value pair to the Context.
The image below, for example, creates URL, username, and password Context Keys.
The format that you use for referencing a Context Key in a query depends on the type of AnzoGraph plugin or extension that is being called by the query. Generally, Contexts are only used in Steps that contain Graph Data Interface (GDI) queries. When referencing Context Keys in GDI queries, use the following format:
{{@<context_key_name>}}
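As a quick check before saving a Step, it can help to confirm that every Context Key a query references is actually defined in the Context. The Python sketch below scans query text for the {{@<context_key_name>}} format and reports any keys that are missing. The helper functions, the key names url, username, and password, and the query fragment (including the `s:` property names) are illustrative assumptions rather than verified GDI syntax.

```python
import re
from typing import Dict, List

# Matches references of the form {{@<context_key_name>}}.
CONTEXT_KEY_REF = re.compile(r"\{\{@([^{}\s]+)\}\}")

# Illustrative fragment only; the property names are assumptions.
GDI_FRAGMENT = """
?data a s:DbSource ;
      s:url "{{@url}}" ;
      s:username "{{@username}}" ;
      s:password "{{@password}}" .
"""


def referenced_context_keys(query: str) -> List[str]:
    """Return the distinct Context Key names referenced in the query text."""
    return sorted(set(CONTEXT_KEY_REF.findall(query)))


def missing_context_keys(query: str, context: Dict[str, str]) -> List[str]:
    """Return referenced key names that have no entry in the given Context."""
    return [key for key in referenced_context_keys(query) if key not in context]


context = {"url": "jdbc:example://host/db", "username": "reader"}
print(referenced_context_keys(GDI_FRAGMENT))
print(missing_context_keys(GDI_FRAGMENT, context))
```

The sketch only inspects the query text. It does not resolve the references, which happens when the query runs.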