Load RDF Files with the IO Load Service
This topic provides instructions for loading locally- or remotely-stored RDF files to Graph Lakehouse using the IO Load service. See RDF Load File Requirements for details about the supported file types, encryption, storage systems, and the directory naming requirements.
For instructions on loading files with SPARQL LOAD queries, see Load Local RDF Files with SPARQL LOAD.
Load Service Query Syntax
The following syntax shows the structure of a load service query. The clauses, patterns, and placeholders are described in the table below.
PREFIX io: <http://cambridgesemantics.com/anzograph/io#>
INSERT {
  [ GRAPH <graph_name> { ]
    ?sub ?pred ?obj
  [ } ]
}
WHERE {
  { SELECT ?sub ?pred ?obj .
    WHERE {
      SERVICE io:load('<protocol://path_to_files[,protocol://path_to_files][,...]>'){} .
    }
  }
}
Option | Description |
---|---|
GRAPH <graph_name> | This clause is optional. When loading files such as Turtle or N-Triples files without graph specifications, include this clause to specify the graph to load the data into. If the graph does not exist, Graph Lakehouse automatically creates it and then loads the data into it. If you do not specify a graph, the data is loaded into the default graph. You can also include the GRAPH clause when loading quad files. If the quad files contain a mixture of quads and triples, Graph Lakehouse loads the triples into the specified graph, while quads are still loaded according to their own graph specifications. If you omit this option for quad files, any triples without graph specifications are loaded into the default graph. |
?sub ?pred ?obj | This triple pattern is required and the variable names must be ?sub ?pred ?obj . The WHERE clause requires a subquery that selects the same triple pattern. |
SERVICE io:load | This is the required call to the IO load service. If your query omits the PREFIX clause, include the full URI in the call: SERVICE <http://cambridgesemantics.com/anzograph/io#load> . |
protocol | The service call includes a URI that specifies the load protocol to use and the path to the load file or directory of files. The protocol that you specify depends on the type of file system that hosts the files. The examples in this topic use the s3, az, gs, nfs, https, and file protocols. |
path_to_files | After the protocol in the service call URI, specify server connection details, if necessary, and the path to the load file or directory of files. When loading a directory of files, make sure the directory name includes the same file type extension as the files in the directory (see Directory Name Requirements for more information). Graph Lakehouse loads all valid files in that directory as well as any subdirectories. Hidden files that are named with a leading period, such as |
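The following is a minimal sketch of the simplest form of the syntax above, with the optional GRAPH clause omitted so that triples without graph specifications are loaded into the default graph. The S3 directory path is illustrative only; substitute the location of your own load files.

```sparql
PREFIX io: <http://cambridgesemantics.com/anzograph/io#>

# No GRAPH clause: triples without graph specifications go to the default graph.
INSERT {
  ?sub ?pred ?obj
}
WHERE {
  { SELECT ?sub ?pred ?obj .
    WHERE {
      # Illustrative path; replace it with the location of your load files.
      SERVICE io:load('<s3://shared-data/load-files/emr.ttl.gz>'){} .
    }
  }
}
```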
Protocol and Path Examples
The following example URI loads a directory of compressed TTL files from Amazon S3:
<s3://shared-data/load-files/emr.ttl.gz>
The example below connects to an NFS that is not mounted and loads a single NT file:
<nfs://10.10.100.10/shared-data/load-files/rdf/sales-2022.nt>
This example loads a TTL file from a Google object store and another TTL file from a web server:
<gs://shared-data/load-files/emr-data.ttl/patients.ttl,https://10.30.103.3/emr/medications.ttl>
The following two examples load a directory of compressed TTL files from the Graph Lakehouse file system. The second example omits the file:// protocol since it is optional:
<file:///opt/data/airlines/airline-data.ttl.gz>
</opt/data/airlines/airline-data.ttl.gz>
Load Service Query Examples
The example query below loads a directory of compressed TTL files from an Azure blob store:
PREFIX io: <http://cambridgesemantics.com/anzograph/io#>
INSERT {
  GRAPH <http://anzograph.com/emr> {
    ?sub ?pred ?obj
  }
}
WHERE {
  { SELECT ?sub ?pred ?obj .
    WHERE {
      SERVICE io:load('<az://shared-data/load-files/emr.ttl.gz>'){} .
    }
  }
}
This query loads a directory of compressed N3 files from Amazon S3:
PREFIX io: <http://cambridgesemantics.com/anzograph/io#>
INSERT {
  GRAPH <http://anzograph.com/sales> {
    ?sub ?pred ?obj
  }
}
WHERE {
  { SELECT ?sub ?pred ?obj .
    WHERE {
      SERVICE io:load('<s3://shared-data/load-files/sales.ttl.n3>'){} .
    }
  }
}
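A single query can also name more than one source in the service call. As a sketch of how the comma-separated form fits into a complete query, the example below reuses the two-file URI from Protocol and Path Examples, which reads one TTL file from a Google object store and another from a web server. The target graph name is an illustrative choice, not a required value.

```sparql
PREFIX io: <http://cambridgesemantics.com/anzograph/io#>

INSERT {
  # Illustrative target graph; it is created automatically if it does not exist.
  GRAPH <http://anzograph.com/emr> {
    ?sub ?pred ?obj
  }
}
WHERE {
  { SELECT ?sub ?pred ?obj .
    WHERE {
      # Two sources, separated by a comma, inside one pair of angle brackets.
      SERVICE io:load('<gs://shared-data/load-files/emr-data.ttl/patients.ttl,https://10.30.103.3/emr/medications.ttl>'){} .
    }
  }
}
```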