Introduction to the Graph Data Interface

The Graph Data Interface (GDI) is a flexible, configurable plugin that lets users access a variety of data sources through federated SPARQL queries. The type of query you write determines what happens to the source data: an INSERT query against the GDI service loads the data into AnzoGraph DB, whereas a CONSTRUCT query against the view or virtualized service creates a virtual graph that accesses the source only when it is needed, without ingesting the data into AnzoGraph DB.

The GDI has built-in, native support for various file formats, HTTP/REST endpoints, and common database types. Internally, the GDI API has a records-oriented view of data. This view enables the GDI to bridge graph operations to operations on data in other formats. Although the GDI views the source as rows in a table, it can ultimately convert the records to graph format so that the data can be incorporated into data layers to augment existing data.
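The record-to-graph conversion described above can be illustrated with a minimal Python sketch. It is not the GDI's internal implementation; the function name `records_to_triples`, the `http://example.org/` namespace, and the row-number subject scheme are all assumptions made for illustration. Each record becomes one subject, and each column becomes a predicate whose object is the cell value.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple

Triple = Tuple[str, str, Any]

# Hypothetical namespace for illustration only; not a GDI default.
BASE = "http://example.org/"


def records_to_triples(
    records: Iterable[Dict[str, Any]], type_name: str
) -> Iterator[Triple]:
    """Yield (subject, predicate, object) triples for each tabular record."""
    for index, record in enumerate(records, start=1):
        # Assumption: the row number is used to mint a subject identifier.
        subject = f"{BASE}{type_name}/{index}"
        # One type triple states what kind of thing the record represents.
        yield (subject, "rdf:type", f"{BASE}{type_name}")
        for column, value in record.items():
            if value is None:
                # A missing cell produces no triple rather than a null object.
                continue
            yield (subject, f"{BASE}{column}", value)


rows = [
    {"name": "Ada", "city": "London"},
    {"name": "Grace", "city": None},
]
triples = list(records_to_triples(rows, "Person"))
for triple in triples:
    print(triple)
```

Skipping empty cells reflects a common modeling choice when moving from tables to graphs: the absence of a value is expressed by the absence of a triple.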

When you query a source such as a database, the GDI service leverages that source to retrieve only the data that it needs for the query. Unlike a JDBC driver, the GDI service does not need to retrieve all values and then complete an often time-consuming step to filter the results.
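The difference between filtering at the source and filtering after retrieval can be sketched in Python with the standard-library `sqlite3` module. This is only an illustration of the general idea of pushing a filter down to the source; it does not show how the GDI translates SPARQL into source queries. The in-memory database stands in for a remote source, and the `orders` table and its columns are made up for the example.

```python
import sqlite3

# In-memory stand-in for a remote database; table and column names are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 120.0), (2, "US", 80.0), (3, "EU", 45.5), (4, "APAC", 300.0)],
)

# Approach A: retrieve every row, then filter on the client side.
all_rows = conn.execute(
    "SELECT id, region, total FROM orders ORDER BY id"
).fetchall()
client_filtered = [row for row in all_rows if row[1] == "EU"]

# Approach B: push the filter to the source so only matching rows come back.
pushed_down = conn.execute(
    "SELECT id, region, total FROM orders WHERE region = ? ORDER BY id",
    ("EU",),
).fetchall()

rows_transferred_a = len(all_rows)     # every row is retrieved
rows_transferred_b = len(pushed_down)  # only the rows the query needs

print(rows_transferred_a, rows_transferred_b)
conn.close()
```

Both approaches produce the same result set, but the second retrieves only the matching rows, which matters as the source table grows.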

Supported Data Sources

The table below lists the data sources, file systems, and applications that the GDI supports.

Source	Description
HTTP/REST Endpoints	The GDI natively supports reading or ingesting data from HTTP/REST endpoints.
Databases	Cambridge Semantics supplies JDBC drivers for the following databases: Databricks, H2, IBM DB2, Microsoft SQL Server, MariaDB, Oracle, PostgreSQL, SAP Sybase (jTDS), and Snowflake. For information about acquiring additional JDBC drivers for connecting to other databases, contact your Cambridge Semantics Customer Success manager.
File Formats	The following file types are supported: CSV and TSV, JSON and NDJSON, Parquet, SAS (SAS Transport XPT and SAS7BDAT formats), XML, and raw text format.
File Systems	The following types of file storage systems are supported: Amazon S3, FTP and FTPS, Google Cloud Storage, HDFS (Kerberized HDFS is not supported at this time), NFS, SFTP, and WebDAV.
Applications	Queries against Elasticsearch and Kafka applications are supported.

Data Source Connections and Authentication

When you connect to a data source, connection parameters such as keys, tokens, and user credentials are provided as part of the query that you run against that source. To avoid including sensitive information in each request, however, AnzoGraph DB provides the option to create and manage Query Contexts. A context specifies all of the connection details for a source. Queries simply reference the context, which keeps sensitive information out of the request. For more information about contexts, see Use a Query Context.
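The idea behind a context can be sketched conceptually in Python. This is not AnzoGraph DB syntax or its API: the `QueryContext` and `ContextRegistry` classes, the `sales_db` context name, and every parameter value are hypothetical and exist only to show how a request can name a context instead of carrying credentials.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class QueryContext:
    """Hypothetical container for one source's connection details."""
    name: str
    parameters: Dict[str, str] = field(default_factory=dict)


class ContextRegistry:
    """Hypothetical server-side store that maps context names to details."""

    def __init__(self) -> None:
        self._contexts: Dict[str, QueryContext] = {}

    def register(self, context: QueryContext) -> None:
        self._contexts[context.name] = context

    def resolve(self, request: Dict[str, str]) -> Dict[str, str]:
        """Merge the referenced context's parameters into a request."""
        context = self._contexts[request["context"]]
        resolved = {k: v for k, v in request.items() if k != "context"}
        resolved.update(context.parameters)
        return resolved


registry = ContextRegistry()
registry.register(
    QueryContext(
        name="sales_db",  # made-up context name
        parameters={
            "url": "jdbc:postgresql://db.example.org/sales",  # made-up URL
            "username": "reporting",
            "password": "example-secret",
        },
    )
)

# The request names the context; it carries no credentials itself.
request = {"context": "sales_db", "table": "orders"}
resolved = registry.resolve(request)
print(sorted(resolved))
```

Because the credentials live only in the registered context, they can be rotated in one place without editing any request that references the context by name.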