Glossary

This topic defines commonly used Graph Studio terms and phrases.

Artifacts
Blend
Data Layers
Dataset Template
Datasets Catalog
ELT
File-Based Linked Data Set
File Store
Frame Graph
Graph Data Interface
Graphmarts
Hi-Res Analytics

IRI
Journal or Volume
Linked Data Set
Model
Named Graph
NLP
OData
OSGi
Onboard
Registry
URI

Term	Definition
Artifacts	Artifacts are all of the objects that are created in Graph Studio during initial configuration and the data onboarding process. For example, when you connect to a database or file source, those connections are stored as artifacts, and when the data from a data source is ingested, the resulting schema, model, graphmart, and any generated datasets are also artifacts.
Blend	Using semantic models, separate data sets or new data sources can be blended into a knowledge graph. Graph Studio can combine and align any data set as well as apply data cleansing and/or transformation steps. Graph Studio delivers blending and access through Graphmarts, which give users the flexibility to combine and analyze any subset of data in Graph Studio.
Data Layers	Data layers enable users to enhance graphmarts dynamically. Users can create layers to load additional data sets, clean, conform, or transform data, infer new information, or export data to a file-based linked data set (FLDS).
Dataset Template	A dataset template defines an endpoint for writing data. It specifies the directory on the shared file store where Graph Studio can generate file-based linked data sets for unstructured pipelines (see File-Based Linked Data Set). It can also designate the directory for saving datasets created from graphmart exports. A dataset template also defines write properties such as the maximum file size and whether files should be compressed.
Datasets Catalog	Graph Studio’s Datasets catalog combines traditional technical, operational, and business metadata with a semantic layer to describe all aspects of enterprise data elements. The catalog enables Graph Studio’s unique use of semantics and graph models and is the system of record for data in Graph Studio. Graph Studio collects and generates metadata at every stage in the data discovery and integration process. Metadata in the catalog documents how data is converted during the onboarding process from its original format into a graph model. Subsequent data blending, transformation, and preparation steps are captured as additional metadata. Graph Studio also captures new metadata to describe all actions taken against data within Graph Studio. The metadata enables users to visualize their data, understand business contexts, identify connections, and blend and prepare data.
ELT	Structured and semi-structured data sources are onboarded using an extract, load, and transform (ELT) workflow, in which data is loaded before it is transformed, as opposed to a traditional extract, transform, and load (ETL) flow. Data Layers are Graph Studio's mechanism for flexibly transforming data in memory.
File-Based Linked Data Set	A File-Based Linked Data Set (FLDS) is a Named Graph that contains a collection of ontologies and the location of RDF data files that share common structure, purpose, meaning, or permissions. When the unstructured pipeline workflow is used to onboard data, Graph Studio creates a dataset in the Datasets catalog. The dataset in the catalog is registered in the Graph Studio system data source (see Journal or Volume) and includes metadata about the data, including a pointer to the data store location for the RDF files generated by the pipeline. The catalog dataset and the files on disk are an FLDS. An FLDS is also generated when an Export Step is included in a graphmart.
File Store	The Graph Studio platform components, Graph Lakehouse, Graph Studio Unstructured, and Elasticsearch share a file system for maintaining onboarded data and supporting files. A file store is that shared file storage system, such as NFS, HDFS, or cloud storage, which every server can access.
Frame Graph	Each ontology in Graph Studio has a corresponding frame ontology or frame graph. A frame graph is generated when a new ontology is added, and it is regenerated each time the ontology is modified. By generating a frame graph, Graph Studio can pre-process the ontology rather than waiting until runtime to do the calculations. During frame graph generation Graph Studio performs activities like finding all of the properties available to a class, identifying the superclasses of each class, and determining whether a property is required or multi-valued.
Graph Data Interface	The Graph Data Interface (GDI) (sometimes called the Data Toolkit) is a flexible Graph Lakehouse extension that enables users to access a variety of data sources via SPARQL queries. The GDI has built-in, native support for various file format types, HTTP/REST endpoints, and JDBC connections to common database sources. For more information about the GDI, see Introduction to the GDI.
Graphmarts	Graphmarts are collections of knowledge graphs that users can blend and enhance. Graphmarts can combine any subset of data in Graph Studio for analysis. For more information about graphmarts, see Graphmart Concepts.
Hi-Res Analytics	Hi-Res Analytics enable users to explore and ask questions across all of their data. Using model-guided dashboards, users can perform computations across multi-dimensional data. Hi-Res Analytics dashboards generate complex graph queries dynamically based on user input.
IRI	An Internationalized Resource Identifier (IRI) is similar to a URI but allows a greater range of characters. The terms URI and IRI are often used interchangeably.
Journal or Volume	A journal, also known as a volume, refers to data that is stored in Graph Studio's embedded graph store. The graph store is transactional and is used to persist metadata, which is written to disk in a .jnl file. The system volume (or system data source) is the default, required volume where Graph Studio stores ontologies as well as system configuration, dataset, catalog, registry, and access control metadata. Users can create secondary local volumes that are used for more compartmentalized data and can be created and deleted without affecting the core system.
Linked Data Set	A linked data set (LDS) is a fundamental concept. Graph Studio organizes all data, including system data, into linked data sets. An LDS is associated with a data model and can be searched, discovered, shared, and protected with access control. For example, graphmarts are organized in a linked data set or registry of graphmarts, the Activity Log is a linked data set, data source configurations exist in a linked data set, and so on.
Model	Graph Studio establishes the semantic layer by enabling users to convert diverse enterprise data models into graph data models and then enhance the data by adding new business definitions, names, and tags. Further insight is added when data from separate sources are linked, connecting shared business definitions across previously siloed sources. Graph Studio employs open World Wide Web Consortium (W3C) standards, including Web Ontology Language (OWL), RDF, and SPARQL to model, connect, and query interconnected graphs. For more information, see Model Concepts.
Named Graph	Graph Studio implements the RDF Named Graph abstraction. These are the atomic units of storage in Graph Studio. Each named graph can be access controlled, and each graph has a corresponding “metadata graph” that includes the access control information, the last modified date, and which user created and modified the associated named graph. For more information about named graph storage, see Graph Storage Concepts.
NLP	Graph Studio performs named-entity recognition (NER) using knowledge bases and can interface with natural language processing (NLP) tools. It serves as a platform that enables text analytics through interplay between best-of-breed NLP tools.
OData	Open Data Protocol (OData) facilitates the creation of interoperable RESTful APIs. The Graph Studio Data on Demand service provides OData-based feeds that can be used to query graphmart data from third-party business intelligence tools.
OSGi	OSGi (originally the Open Services Gateway initiative) is the open-standard architecture upon which Graph Studio is built. It is a Java framework for developing and deploying software programs and libraries. OSGi enables Cambridge Semantics to compartmentalize Graph Studio into "bundles" that can be deployed, activated, and removed independently without affecting other bundles in the system.
Onboard	When data is ingested from its source platform to Graph Studio, it is converted from its original format to a new format that describes the data as a graph data model. This format, Resource Description Framework (RDF), captures each data value and relationship. RDF data is loaded to Graph Studio’s in-memory graph engine, Graph Lakehouse, for transformation and analysis.
Registry	Graph Studio manages configurations in system-level registries. Each registry is a collection of application and system component configurations of the same type. Like data, registries are stored and managed with RDF named graphs according to ontologies. Technically, a registry is a Linked Data Set.
URI	A Uniform Resource Identifier (URI) is a globally unique identifier for a piece of information. A URL (Uniform Resource Locator) is a URI that specifies a location, such as a web address.
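
As an illustration of the Frame Graph entry above, the following minimal Python sketch shows the kind of pre-processing that entry describes: resolving every superclass of a class and collecting all properties available to it ahead of time, so that nothing has to be computed at runtime. It is not Graph Studio's implementation. The toy ontology, the dictionary layout, and the names SUBCLASS_OF, DECLARED_PROPERTIES, superclasses, and build_frames are all hypothetical and exist only for this example.

```python
from typing import Dict, List, Set

# Hypothetical toy ontology: each class mapped to its direct superclasses.
SUBCLASS_OF: Dict[str, List[str]] = {
    "Person": [],
    "Employee": ["Person"],
    "Manager": ["Employee"],
}

# Hypothetical declarations: properties declared directly on each class.
DECLARED_PROPERTIES: Dict[str, List[str]] = {
    "Person": ["name"],
    "Employee": ["employeeId"],
    "Manager": ["reports"],
}


def superclasses(cls: str) -> Set[str]:
    """Return every direct and inherited superclass of cls."""
    found: Set[str] = set()
    stack = list(SUBCLASS_OF.get(cls, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(SUBCLASS_OF.get(parent, []))
    return found


def build_frames() -> Dict[str, Dict[str, List[str]]]:
    """Precompute superclasses and available properties for each class."""
    frames: Dict[str, Dict[str, List[str]]] = {}
    for cls in SUBCLASS_OF:
        supers = superclasses(cls)
        # A class can use its own properties plus those of all superclasses.
        props = set(DECLARED_PROPERTIES.get(cls, []))
        for parent in supers:
            props.update(DECLARED_PROPERTIES.get(parent, []))
        frames[cls] = {
            "superclasses": sorted(supers),
            "properties": sorted(props),
        }
    return frames


# Built once (for example, whenever the ontology changes) and then reused.
FRAMES = build_frames()
print(FRAMES["Manager"])
```

In this sketch, looking up FRAMES["Manager"] returns the precomputed superclasses (Employee, Person) and the full set of inherited properties without walking the class hierarchy again, which is the benefit the Frame Graph definition attributes to generating the frame in advance.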