Copy Graphs to Files

This topic provides instructions for using the COPY command to copy graphs from AnzoGraph DB to compressed or uncompressed files on disk. For example, after adding or updating data in a graph, you can copy the graph to a file so that you can load the updated graph into another AnzoGraph DB instance. You can also create a backup so that you can restore data to a previous state after upgrading or installing a new version of AnzoGraph DB.

By default, when AnzoGraph DB restarts, it automatically reloads the last saved state of the graph data from the <install_path>/persistence directory.

Copying graph data to a file or directory does not remove the copied data from AnzoGraph DB.

COPY Syntax

Copy data to files by running the following SPARQL query. Each option is described below.

COPY ALL | graph_uri_list TO <single_file_uri> | <directory_uri>

Arguments

ALL
Include the ALL keyword to copy all graphs to files instead of listing specific graphs. To copy only some of the graphs, specify a graph_uri_list.

graph_uri_list
To copy a single graph or a list of graphs, use the format below. Separate multiple graphs with a space.

<graph_URI> [ <graph2_URI> <graphN_URI> ... ]

single_file_uri
To copy a graph or graphs to a single file, specify a file location URI in the format below. When generating a single file on a cluster, the leader node writes the file.

<file:/path/filename.filetype[.gz]>

Where filetype is the file format to generate. Supported types are .ttl, .n3, .nt, .nq, .quads, and .trig. To compress the file, include the .gz suffix.

When copying from multiple graphs, specify a quad format such as .nq, .quads, or .trig to preserve the graph name information in the data.

directory_uri
To copy a graph or graphs to many smaller files, specify a directory location URI in the format below. When generating a directory of multiple files on a cluster, each node creates files that contain the data stored in its slices. Choose a directory location that is shared between the nodes in the cluster. Otherwise, you must retrieve the files from each node separately.

<dir:/path/dirname.filetype[.gz]>

Where filetype is the file format to generate. Supported types are .ttl, .n3, .nt, .nq, .quads, and .trig. To compress each of the files in the directory, include the .gz suffix.

When copying from multiple graphs, specify a quad format such as .nq, .quads, or .trig to preserve the graph name information in the data.

By default, AnzoGraph DB creates 5 MB .gz files in the specified directory. To use a different file size, add copy_file_size=<number_of_MB> to the settings file, settings.conf. For instructions on changing settings, see Change System Settings.
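
If you generate COPY statements from a script, the syntax rules above can be checked before a statement is sent to the database. The following Python sketch is illustrative only: build_copy_statement is a hypothetical helper, not part of AnzoGraph DB, and it assumes that COPY ALL should be held to the same quad-format rule as a multi-graph copy.

```python
# File types listed in the COPY syntax description.
SUPPORTED_TYPES = {"ttl", "n3", "nt", "nq", "quads", "trig"}
# Quad formats, which keep the graph name with each statement.
QUAD_TYPES = {"nq", "quads", "trig"}


def build_copy_statement(graphs, path, directory=False):
    """Compose a COPY statement string (hypothetical helper).

    graphs:    the string "ALL" or a non-empty list of graph URIs.
    path:      target path ending in .filetype or .filetype.gz.
    directory: True for a <dir:...> target, False for <file:...>.
    """
    # Drop an optional .gz suffix, then read the file type extension.
    stem = path[:-3] if path.endswith(".gz") else path
    filetype = stem.rsplit(".", 1)[-1] if "." in stem else ""
    if filetype not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported file type in {path!r}")

    copy_all = graphs == "ALL"
    if not copy_all and not graphs:
        raise ValueError("specify ALL or at least one graph URI")

    # Assumption: ALL is treated like a copy from multiple graphs.
    if (copy_all or len(graphs) > 1) and filetype not in QUAD_TYPES:
        raise ValueError("use .nq, .quads, or .trig for multiple graphs")

    source = "ALL" if copy_all else " ".join(f"<{g}>" for g in graphs)
    scheme = "dir" if directory else "file"
    return f"COPY {source} TO <{scheme}:{path}>"
```

The helper only builds and validates the statement text. Running the statement against AnzoGraph DB is left to whichever client you already use.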

COPY Examples

The example below copies data from the flights graph to a single flights.ttl.gz file on a shared file system.

COPY <http://anzograph.com/flights> TO <file:/mnt/shared/data/flights.ttl.gz>
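
To spot-check a compressed export without unpacking it, you can read the first few lines with Python's standard gzip module. This is a minimal sketch, and preview_gz is a hypothetical helper name. It assumes the file is UTF-8 text, which holds for the RDF text formats listed above.

```python
import gzip
from itertools import islice


def preview_gz(path, max_lines=5):
    """Return the first max_lines lines of a gzip-compressed text file."""
    # "rt" opens the compressed stream in text mode, decompressing on the fly.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in islice(handle, max_lines)]
```

For example, preview_gz("/mnt/shared/data/flights.ttl.gz") would return the first five lines of the file written by the COPY statement above.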

The example below copies data from two graphs, flights and airports, to a flight-data.trig.gz directory on a shared file system. Using .trig format ensures that the graph names are included in the files.

COPY <http://anzograph.com/flights> <http://anzograph.com/airports> TO <dir:/mnt/shared/data/flight-data.trig.gz>

The example below copies the data from all graphs to a directory on a shared file system.

COPY ALL TO <dir:/mnt/shared/data/allgraphs.trig.gz>
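
Because each node in a cluster writes its own files into the target directory, it can help to confirm afterwards how many files were written and how large they are in total. The sketch below is one way to do that from Python. summarize_export_dir is a hypothetical helper, and because this topic does not describe how the generated files are named, it counts every regular file directly inside the directory rather than matching a name pattern.

```python
from pathlib import Path


def summarize_export_dir(directory):
    """Return (file_count, total_bytes) for regular files in directory."""
    # Assumption: the export directory is flat, so subdirectories are skipped.
    files = [p for p in Path(directory).iterdir() if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)
```

For example, summarize_export_dir("/mnt/shared/data/allgraphs.trig.gz") reports on the directory created by the COPY ALL statement above. If the count is lower than expected on a cluster, check whether the directory is actually shared between the nodes.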