Complete the Pre-Installation Configuration

Before deploying AnzoGraph, follow the instructions below to install the required software packages on each AnzoGraph host server. In addition to listing the required and optional software dependencies, this topic also includes important information about Linux proxy variables, ensuring that AnzoGraph is installed as the appropriate user, and recording the cluster IP addresses that are needed during the install process.

For information about host server hardware and firewall requirements, see AnzoGraph Requirements.

Install GNU Compiler Collection (GCC)

Every AnzoGraph server must have the latest version of the GCC tools for its operating system installed. On all servers in the cluster, run the following command to install GCC:

sudo yum install gcc

Specifically, AnzoGraph requires the glibc, glibc-devel, and gcc-c++ libraries. Typically, when you install GCC by running yum install gcc, those libraries are included as part of the package. In rare cases, depending on the host server configuration, installing GCC excludes certain libraries. If AnzoGraph fails to start and you receive a "Compilation failed" message, it may indicate that some of the required libraries are missing. To install the missing libraries, run the following command:

sudo yum install glibc glibc-devel gcc-c++
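Before reinstalling, it can help to see which of the three libraries is actually missing. The following is a minimal sketch, assuming an RPM-based host where rpm -q reports whether a package is installed. The function name missing_pkgs is hypothetical and is not part of AnzoGraph.

```shell
# Print the name of each package that rpm does not report as installed.
# missing_pkgs is a hypothetical helper, not an AnzoGraph command.
missing_pkgs() {
  for pkg in "$@"; do
    if ! rpm -q "$pkg" >/dev/null 2>&1; then
      echo "$pkg"
    fi
  done
  return 0
}

# Check the three libraries named above; no output means all are present.
missing_pkgs glibc glibc-devel gcc-c++
```

Any names that are printed can then be passed to sudo yum install.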

Install OpenJDK 11

AnzoGraph uses a Java client interface, called the Graph Data Interface (GDI), to access Data Sources when you profile a source, ingest Data Sources via automated Graphmarts, or blend data into a Graphmart via manually created queries. AnzoGraph also uses the GDI to communicate with Elasticsearch when Anzo Unstructured Graphmarts are activated. Java Development Kit version 11 is required for using the GDI. Follow the instructions below to install OpenJDK on all servers in the cluster.

  1. Run the following command to install OpenJDK 11:
    sudo yum install java-11-openjdk

    You do not need to set the JAVA_HOME variable at this point. AnzoGraph's system management daemon (azgmgrd) requires JAVA_HOME, but the variable is set as part of the post-installation configuration (see Complete the Post-Installation Configuration).

  2. If your organization uses Anzo Unstructured, test the connection between the AnzoGraph server and Elasticsearch. Make sure that Elasticsearch is running and then run the following telnet command:
    telnet <Elasticsearch_server_IP> <port>

    By default, the port range for Elasticsearch requests (http.port) is 9200-9300. If port 9200 is not available when Elasticsearch is started, Elasticsearch tries 9201 and so on until it finds an accessible port. Specify the HTTP request port that Elasticsearch is using.
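The two steps above can be spot-checked from the command line. The sketch below assumes bash and the coreutils timeout command are available on the host. The helper names java_major and port_open and the ES_HOST placeholder are hypothetical. The port check is an alternative for hosts where telnet is not installed, not a replacement for the documented command.

```shell
# Extract the major version from `java -version` output read on stdin.
# Note: Java 8 and earlier report "1.x", so this prints 1 for them.
java_major() {
  sed -n 's/.*version "\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Return 0 if a TCP connection to host ($1) and port ($2) succeeds
# within 3 seconds. Relies on bash's /dev/tcp redirection.
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Usage (ES_HOST is a placeholder; 9200 is the first port in the default range):
#   java -version 2>&1 | java_major
#   port_open "$ES_HOST" 9200 && echo "Elasticsearch port is reachable"
```

Because java -version writes to standard error, the usage line redirects it with 2>&1 before parsing.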

Review the Optional C++ Extension Dependencies

The AnzoGraph installation includes C++ packages that extend AnzoGraph's built-in analytics to offer advanced Data Science functions as well as Apache Arrow integration. In addition, the C++ extensions are used to perform Anzo's advanced Source, Dataset, and Graphmart data profile analytics. Installing the C++ extensions is optional but strongly recommended. If you choose to install the extensions, the additional C++ software packages and support libraries listed below must also be installed.

Instructions on installing the C++ dependencies after AnzoGraph is installed are provided in Complete the Post-Installation Configuration.

  • libarchive13
  • libarmadillo10
  • libboost_filesystem1_71_0
  • libboost_iostreams1_71_0
  • libboost_system1_71_0
  • libgrpc++1
  • libflatbuffers1
  • libhdfs3
  • libnfs13
  • libserd-0-0
  • libsmb2
  • shadow-utils

Unset Linux Proxy Variables

Make sure that the Linux environment variables http_proxy and https_proxy are not set on the servers. The Anzo gRPC protocol cannot make connections to the database when proxies are enabled.
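As a minimal sketch, the check can be done in the current shell session as follows. The function name clear_proxy_vars is hypothetical. Note that unset affects only the current session; if the variables are defined persistently (for example in a login profile script), they would need to be removed there as well.

```shell
# Report and unset the two proxy variables in the current shell session.
# clear_proxy_vars is a hypothetical helper name.
clear_proxy_vars() {
  if [ -n "${http_proxy:-}" ]; then
    echo "unsetting http_proxy"
  fi
  if [ -n "${https_proxy:-}" ]; then
    echo "unsetting https_proxy"
  fi
  unset http_proxy https_proxy
}

clear_proxy_vars

# Confirm that neither variable remains in the environment (prints nothing if clean).
env | grep -E '^https?_proxy=' || true
```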

Use the Anzo Service User Account when Installing AnzoGraph

Because AnzoGraph offers features such as user-defined extensions, it is not certified as secure software and should not be installed or run as the root user. In addition, because AnzoGraph accesses the data that Anzo writes to the shared File Store, it is important to install and run AnzoGraph with the same service account that runs Anzo. For more information, see Anzo Service Account Requirements.
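A quick guard before starting the installer might look like the sketch below. The helper name is_root_uid is hypothetical, and the check only confirms that the current user is not root; it does not verify that the account matches the one that runs Anzo.

```shell
# Return 0 when the given numeric user ID belongs to root.
# is_root_uid is a hypothetical helper name.
is_root_uid() {
  [ "$1" -eq 0 ]
}

if is_root_uid "$(id -u)"; then
  echo "Running as root: switch to the Anzo service account before installing." >&2
else
  echo "Running as $(id -un) (uid $(id -u))"
fi
```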

Note the IP Addresses of the Cluster Servers

If you are installing AnzoGraph in a clustered setup, note the IP address of each server in the cluster. The installation wizard prompts you to enter the IP addresses during the installation. In addition, choose one server to be the leader server.
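One way to avoid typos when the wizard asks for the addresses is to collect them ahead of time and sanity-check the format. The sketch below is illustrative only: the addresses are made up, the helper name is_ipv4 is hypothetical, and listing the chosen leader first is merely this sketch's convention.

```shell
# Return 0 if the argument is a dotted-quad IPv4 address with octets 0-255.
# Runs in a subshell (parentheses) so the IFS change stays local.
is_ipv4() (
  case "$1" in
    "" | *[!0-9.]*) exit 1 ;;
  esac
  IFS=.
  set -- $1
  [ "$#" -eq 4 ] || exit 1
  for octet in "$@"; do
    [ -n "$octet" ] && [ "$octet" -le 255 ] || exit 1
  done
  exit 0
)

# Hypothetical addresses; the leader is listed first by this sketch's convention.
cluster_ips="10.0.0.11 10.0.0.12 10.0.0.13"
for ip in $cluster_ips; do
  is_ipv4 "$ip" || echo "not a valid IPv4 address: $ip" >&2
done
```

On each server, hostname -I lists the addresses assigned to that host, which can be used to fill in the list.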

Once all of the prerequisites are in place, proceed to Install AnzoGraph.