Quickstart Using the Command Line Interface

After deploying AnzoGraph, you can quickly start loading data and running SPARQL queries with the AnzoGraph command line interface (CLI). This brief tutorial introduces the CLI and covers the following topics:

  1. Running the CLI
  2. Load and Query a Sample Data Set
  3. Load and Query Your Own Data

Running the CLI

The AnzoGraph command line client, azgi, uses the SPARQL protocol over HTTPS to interact with the database. The client is located in the azg/bin directory.

Running the following command displays the options that azgi supports:

./<install_path>/bin/azgi -help 

To get started, the following list describes the most frequently used azgi options:

  • azgi -c "command": Runs the command in the quotation marks. For example, the following command runs a SELECT query that returns every triple (?s ?p ?o) in a graph named graph:
    ./<install_path>/bin/azgi -c "select * from <graph> where {?s ?p ?o}"
  • azgi -f file.rq: Runs the query or queries in a file. For example, if a file named query.rq exists in the /tmp directory, the following command runs the query or queries in query.rq:
    ./<install_path>/bin/azgi -f /tmp/query.rq
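
As a minimal sketch of the -f option, the following shell snippet writes a query to a file and then runs that file. The AZGI and QUERY_FILE variables and their default values are assumptions for illustration, and the graph name graph is reused from the example above. The snippet skips the azgi call when the client is not found at the assumed path.

```shell
#!/bin/sh
# Assumption: AZGI points to the azgi client; adjust it to your install path.
AZGI="${AZGI:-./azg/bin/azgi}"
# Hypothetical location for the query file.
QUERY_FILE="${QUERY_FILE:-/tmp/query.rq}"

# Write a single SELECT query to the file.
cat > "$QUERY_FILE" <<'EOF'
select * from <graph> where {?s ?p ?o} limit 10
EOF

# Run the file with -f only if the client is available at that path.
if [ -x "$AZGI" ]; then
    "$AZGI" -f "$QUERY_FILE"
else
    echo "azgi not found at $AZGI; wrote $QUERY_FILE only" >&2
fi
```

Keeping queries in files makes them easier to review and rerun than long quoted strings passed with -c.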

This tutorial guides you through using the azgi -c option to load and query data. For more information about the command line interface, see Using the AnzoGraph CLI.

Load and Query a Sample Data Set

The AnzoGraph installation includes two sample data sets on the file system:

  • Tickit: This data set includes about 5.5 million triples and captures sales activity for the fictional Tickit website where people buy and sell tickets for sporting events, shows, and concerts. The data consists of person, venue, category, date, event, listing, and sales information.
  • TPCH: This data set includes about 111 million triples and is similar to the SQL TPC-H Decision Support data, which Cambridge Semantics converted to the RDF graph model. The data set and queries model a use case where vendors purchase parts from suppliers and sell them to customers.

The instructions below guide you through using the CLI to load the Tickit data set and run some sample queries.

  1. Run the following command to load the compressed Tickit Turtle file that is hosted on Amazon S3 into a graph named tickit:
    ./azg/bin/azgi -c "load <s3://csi-notebook-datasets/MovieTicketAnalysis/20190217/tickit.ttl.gz>
    into graph <tickit>"

    AnzoGraph loads the data from the file into memory, and the prompt returns when the load completes.

  2. Run the following command to count the number of triples in the tickit graph:
    ./azg/bin/azgi -c "select (count(*) as ?number) from <tickit> where {?s ?p ?o}"
    number
    ---------
    5525739
    1 rows
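
If you want to use the triple count in a script, one possible approach is to extract the numeric line from the azgi output. The sketch below assumes the output has the layout shown in step 2 (a header, a separator, the number, and a row count). The extract_count helper and the AZGI variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Assumption: AZGI points to the azgi client; adjust it to your install path.
AZGI="${AZGI:-./azg/bin/azgi}"

# Hypothetical helper: print the first line of input that consists only of
# digits. In the output shown in step 2, that line is the triple count.
extract_count() {
    awk '/^[[:space:]]*[0-9]+[[:space:]]*$/ { gsub(/[[:space:]]/, ""); print; exit }'
}

# Query the tickit graph only if the client is available at that path.
if [ -x "$AZGI" ]; then
    count=$("$AZGI" -c "select (count(*) as ?number) from <tickit> where {?s ?p ?o}" | extract_count)
    echo "tickit graph contains ${count:-an unknown number of} triples"
else
    echo "azgi not found at $AZGI" >&2
fi
```

Comparing the extracted value with the expected 5525739 is a quick way to confirm that the load completed in full.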

If you want to continue to work with the Tickit data set and run more complex queries or view explanations of the query syntax, see Working with the Tickit Data.

Load and Query Your Own Data

AnzoGraph supports loading data from Turtle (.ttl), N-Triple (.n3 or .nt), N-Quad (.nq), TriG (.trig), and CSV files. You can load files from cloud storage, such as Amazon S3 or Google Cloud Platform, from the local or a mounted file system on the AnzoGraph server, or from a web server.

This section provides guidance on quickly loading and analyzing some of your own data by copying a file to the local file system and using the AnzoGraph CLI to load it. Follow these steps to load a file.

  1. Copy a load file to the file system on the AnzoGraph host server.
  2. Run the following command to load the file's contents:
    ./azg/bin/azgi -c "load <file:/path_to_file/filename> into graph <graph_name>"

    AnzoGraph loads the data from the file into memory, and the prompt returns when the load completes.

  3. Run the following command to count the number of triples in the new graph. Replace graph_name with the graph name that you typed in the previous step.
    ./azg/bin/azgi -c "select (count(*) as ?number) from <graph_name> where {?s ?p ?o}"
  4. To explore the data further, for example by listing the predicates in the data set, run the following command. Replace graph_name with the graph name that you specified in step 2.
    ./azg/bin/azgi -c "select distinct ?p from <graph_name> where {?s ?p ?o} limit 100"
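
The steps above can be combined into a small script. This is a sketch under stated assumptions rather than a supported tool: the load_statement and load_and_inspect functions and the AZGI variable are hypothetical names, the script is assumed to run on the AnzoGraph host so that the file check is meaningful, and azgi is assumed to exit with a non-zero status when a command fails, which this tutorial does not confirm.

```shell
#!/bin/sh
# Assumption: AZGI points to the azgi client; adjust it to your install path.
AZGI="${AZGI:-./azg/bin/azgi}"

# Build the load statement from an absolute file path and a graph name,
# following the load <file:...> into graph <...> form used in step 2.
load_statement() {
    printf 'load <file:%s> into graph <%s>\n' "$1" "$2"
}

# Load a file, then count its triples and list up to 100 predicates.
# Assumption: azgi returns a non-zero exit status on failure, so the
# && chain stops at the first failing command.
load_and_inspect() {
    file=$1
    graph=$2
    if [ ! -f "$file" ]; then
        echo "load file not found: $file" >&2
        return 1
    fi
    "$AZGI" -c "$(load_statement "$file" "$graph")" &&
    "$AZGI" -c "select (count(*) as ?number) from <$graph> where {?s ?p ?o}" &&
    "$AZGI" -c "select distinct ?p from <$graph> where {?s ?p ?o} limit 100"
}

# Example call (hypothetical path and graph name):
# load_and_inspect /opt/data/mydata.ttl mygraph
```

Checking that the file exists before calling azgi catches a mistyped path early, before any load is attempted.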

For more detailed information about loading data and to review load file requirements and recommendations, see Loading Data from Files.
