Unloading Data

This topic provides instructions for unloading data from the database to compressed Turtle files (.ttl.gz) on disk.

Note: AnzoGraph also supports SPARQL operations that insert, update, and delete (drop) graph data and triples that are already loaded. See Update Operations for more information on those operations.

COPY Syntax

Use the following syntax to copy graphs from AnzoGraph to one or more files. The notes below the syntax describe the options:

COPY graph1 [ graph2 graph3 ... ] TO <dir|file:/path/dirname.ttl.gz>

Where graph is the URI for each of the graphs that you want to unload.
In the URI for the file path, specify dir if you want AnzoGraph to copy the graph or graphs to several smaller files, or specify file if you want to copy the data into a single file. Specify a dirname that does not already exist; AnzoGraph creates the directory.
The directory name must end in .ttl.gz. Do not include a slash at the end of the directory name. For example, <dir:/tmp/rdf.ttl.gz> is valid, and <dir:/tmp/rdf.ttl.gz/> is invalid.
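
The naming rules above can be checked before a statement is sent to the database. The following Python sketch is illustrative only and is not part of AnzoGraph: build_copy_statement is a hypothetical helper that assembles a COPY statement as a string and rejects targets that break the rules listed above. It also assumes that a file: target should end in .ttl.gz, as in the examples in this topic.

```python
def build_copy_statement(graphs, target_path, single_file=False):
    """Return a COPY statement string for the given graph URIs.

    Hypothetical helper for illustration; it only builds text and does
    not connect to AnzoGraph.

    graphs: graph URIs without angle brackets, e.g. ["flights"].
    target_path: path that ends in .ttl.gz and has no trailing slash.
    single_file: True uses the file: scheme, False uses dir:.
    """
    graphs = list(graphs)
    if not graphs:
        raise ValueError("at least one graph URI is required")
    if target_path.endswith("/"):
        # <dir:/tmp/rdf.ttl.gz/> is invalid according to the rules above.
        raise ValueError("target must not end with a slash")
    if not target_path.endswith(".ttl.gz"):
        # Assumption: the same suffix rule is applied to file: targets.
        raise ValueError("target must end in .ttl.gz")
    scheme = "file" if single_file else "dir"
    graph_list = " ".join(f"<{g}>" for g in graphs)
    return f"COPY {graph_list} TO <{scheme}:{target_path}>"


print(build_copy_statement(["flights", "airports"],
                           "/home/user/flight-data.ttl.gz"))
```

The helper cannot tell whether the directory already exists on the server, so that rule still has to be checked separately.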

For example, the following command unloads data from two graphs named flights and airports to a flight-data.ttl.gz directory in the user's home directory.

COPY <flights> <airports> TO <dir:/home/user/flight-data.ttl.gz>

By default, AnzoGraph creates 5 MB .ttl.gz files in the specified directory. On a cluster, each node unloads a subset of the data, and you can retrieve the files from the same location on each node. To change the file size, add copy_file_size=number_of_MB to the settings file, settings.conf. For instructions on changing settings, see Changing System Settings.
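
After an unload to a directory, the compressed files can be read back with standard tools. The Python sketch below is one possible way to do this and is not an AnzoGraph utility: iter_unloaded_lines is a hypothetical helper that reads every file in a local unload directory with the standard gzip module. It assumes that each regular file in the directory is gzip-compressed, UTF-8 encoded Turtle, and it makes no assumption about the file names AnzoGraph chooses.

```python
import gzip
from pathlib import Path


def iter_unloaded_lines(unload_dir):
    """Yield decompressed text lines from every file in an unload directory.

    Hypothetical helper: assumes each regular file in unload_dir is a
    gzip-compressed, UTF-8 encoded Turtle file.
    """
    for path in sorted(Path(unload_dir).iterdir()):
        if not path.is_file():
            continue  # skip anything that is not a regular file
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            for line in handle:
                yield line.rstrip("\n")
```

For example, sum(1 for _ in iter_unloaded_lines("/home/user/flight-data.ttl.gz")) counts the lines unloaded on the local machine. On a cluster, the helper sees only the files on the node where it runs, so it would need to be run on each node.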

The example below unloads data from the flights graph to a single flights.ttl.gz file in the user's home directory.

COPY <flights> TO <file:/home/user/flights.ttl.gz>

On a cluster, the flights.ttl.gz file is created on the leader node.