LOAD

The SPARQL LOAD statement loads data into AnzoGraph DB from files that are in Turtle, N-Triple, N-Quad, or TriG format.

For information about load file directory requirements and load architecture, see RDF Load File Requirements. For information on the data types that AnzoGraph DB uses to store data, see Data Type Handling.

LOAD Syntax

Run the following statement to load data from Turtle, N-Triple, N-Quad, or TriG files.

LOAD [ SILENT ] [ WITH 'leader' | 'compute' | 'global' ] <URI> [...<URIn>] [ INTO GRAPH <graph_uri> ]

SILENT

Include this optional keyword if you want AnzoGraph DB to ignore "bad data" errors during the load. Data errors are problems such as incorrectly formatted dateTime values or strings that are tagged as double data types. The SILENT keyword does not silence syntax errors in the files. If a file is ill-formed, for example because it contains invalid characters in place of URIs, AnzoGraph DB cannot parse the data and the file must be corrected.

  • When SILENT is omitted, AnzoGraph DB aborts the load upon hitting a data or syntax error and reports the error to the client.
  • When SILENT is specified and AnzoGraph DB encounters an error with the data, it logs the error to a graph and proceeds with the load. By default, any errors are captured in the <load_errors> graph. After a load completes, you can query the graph to review errors.

    Even when SILENT is specified, the load is aborted if the files contain syntax errors, because AnzoGraph DB cannot parse the data. Correct the file or files and load them again.

    To customize the load error graph URI, you can change the load_errors_graph setting value in the system configuration file, <install_path>/config/settings.conf. See Changing System Settings for instructions.
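As a minimal sketch of the SILENT workflow described above, the following statements load a file while tolerating data errors and then inspect the error graph. The file path is reused from the examples in this topic, and the shape of the triples stored in <load_errors> is not described here, so the query simply returns whatever the graph contains. Run the two statements as separate requests.

```sparql
# Load a Turtle file and skip over "bad data" errors instead of aborting.
# Syntax errors in the file still abort the load.
LOAD SILENT <file:/shared-files/data/tickit.ttl>

# After the load completes, review any errors that were logged.
# <load_errors> is the default error graph; adjust the URI if the
# load_errors_graph setting was changed.
SELECT ?s ?p ?o
FROM <load_errors>
WHERE { ?s ?p ?o }
LIMIT 100
```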

leader

Include the optional WITH 'leader' clause when loading files that only the leader server can access. WITH 'leader' is the default value for the LOAD statement. When the WITH clause is omitted, the load proceeds as if WITH 'leader' were specified.

The "leader" keyword is case-sensitive. Type the term using lower case letters.

compute

Include the optional WITH 'compute' clause when all servers will load files from their local file systems. Use this option if you have arranged the load files so that each AnzoGraph DB server has a unique subset of files on its local file system.

The "compute" keyword is case-sensitive. Type the term using lower case letters.

global

Include the optional WITH 'global' clause when all servers will load a subset of the same files or directories on a mounted file system. Include this option when every AnzoGraph DB server in the cluster has visibility to the entire data set. AnzoGraph DB automatically divides file selection among the servers.

The "global" keyword is case-sensitive. Type the term using lower case letters.
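The three WITH options can be contrasted in a short sketch. The paths below are hypothetical placeholders chosen for illustration; substitute locations that match your own deployment. Each statement is an alternative and is run on its own.

```sparql
# File is visible only to the leader server (same as omitting WITH).
LOAD WITH 'leader' <file:/home/user/data/tickit.ttl>

# Each server reads its own unique subset of files from the same
# path on its local file system.
LOAD WITH 'compute' <dir:/local/data/tickit_part.ttl>

# Every server can see the whole data set on a mounted file system;
# AnzoGraph DB divides the files among the servers.
LOAD WITH 'global' <dir:/global/nfs/data/tickit_all.ttl.gz>
```

Note that the keyword inside the quotes is lower case in every statement, as required.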

URI

Required clause that specifies the absolute path to the load file or files. To load a single file, the scheme of the URI should be file:. To load a directory of files, the scheme of the URI should be dir:.

When loading a directory, make sure the directory name includes the same file type extension as the files in the directory. For example, a directory of TTL files is named name.ttl, a directory of TriG files is named name.trig, and a directory of NQ files is named name.nq. When you specify a directory, AnzoGraph DB loads all valid files in that directory as well as any subdirectories. AnzoGraph DB does not load any hidden files that are named with a leading period, such as .file.ttl.

If you specify more than one URI to load from, each URI must target the same file type, such as .ttl or .trig. Also, each URI must specify the same scheme, file: or dir:.

For example, the following URI loads a single file from a shared directory:

<file:/shared-files/data/tickit.ttl>

This example URI loads a directory of .ttl.gz files:

<dir:/global/nfs/vpc_nfs_server/data/tickit_all.ttl.gz>

This example specifies multiple directories of .ttl.gz files:

<dir:/global/nfs/data/tickit_all.ttl.gz> <dir:/global/nfs/data/movies.ttl.gz>
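To illustrate the rule that all URIs in one statement must share a scheme and a file type, here is a hedged sketch of a complete statement built from the two directories above. The WITH 'global' clause is an assumption based on the mounted /global/nfs path, and the commented-out counterexample uses a hypothetical movies.trig file.

```sparql
# Valid: both URIs use the dir: scheme and both target .ttl.gz files.
LOAD WITH 'global'
  <dir:/global/nfs/data/tickit_all.ttl.gz>
  <dir:/global/nfs/data/movies.ttl.gz>

# Not valid under the rules above: mixes the dir: and file: schemes
# and mixes Turtle and TriG file types.
# LOAD WITH 'global'
#   <dir:/global/nfs/data/tickit_all.ttl.gz>
#   <file:/global/nfs/data/movies.trig>
```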

INTO GRAPH <graph_uri>

When loading files such as Turtle or N-Triple files without graph specifications, include this optional clause to specify the graph to load data into. If the graph does not exist, the system automatically creates it and then loads the data into it. If you do not specify a graph, AnzoGraph DB loads data into the default graph.

You can also include the INTO GRAPH option when loading N-Quad files. If the N-Quad files contain a mixture of quads and triples, AnzoGraph DB loads the triples into the specified graph. Quads are still loaded according to their graph specification. If you omit this option for N-Quad files, any triples without graph specifications are loaded into the default graph.
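The INTO GRAPH behavior for the two cases can be sketched as follows. The graph URIs <tickit> and <unassigned> and the file mixed.nq are hypothetical names used only for illustration. Each statement is run on its own.

```sparql
# Turtle file with no graph specifications: every triple is loaded
# into <tickit>, which is created if it does not already exist.
LOAD <file:/shared-files/data/tickit.ttl> INTO GRAPH <tickit>

# N-Quad file with a mixture of quads and triples: quads go to the
# graphs named in the file, and plain triples go to <unassigned>.
LOAD <file:/shared-files/data/mixed.nq> INTO GRAPH <unassigned>
```

If the INTO GRAPH clause were dropped from the second statement, the plain triples would land in the default graph instead.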