Prepare the AnzoGraph Host Servers

Before deploying AnzoGraph, follow the instructions below to install the required software packages on each AnzoGraph host server. In addition to listing the software dependencies, this topic also includes important information about configuring user resource limits, ensuring that AnzoGraph is installed as the appropriate user, and recording the cluster IP addresses that are needed during the install process.

For information about host server requirements, see AnzoGraph Requirements.

Install GNU Compiler Collection (GCC)

AnzoGraph requires the latest version of the GCC tools for your operating system. Run the following command to install GCC:

sudo yum install gcc

Specifically, AnzoGraph requires the glibc, glibc-devel, and gcc-c++ libraries. Typically, when you install GCC by running yum install gcc, those libraries are included as part of the package. In rare cases, depending on the host server configuration, installing GCC excludes certain libraries. If AnzoGraph fails to start and you receive a "Compilation failed" message, it may indicate that some of the required libraries are missing. To install the missing libraries, run the following command:

sudo yum install glibc glibc-devel gcc-c++
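
Before starting AnzoGraph, it can help to confirm that the packages are actually present. The following is a minimal sketch, assuming an RPM-based host where rpm -q reports whether a package is installed. The function name missing_packages is illustrative and is not part of AnzoGraph.

```shell
# Print each named package that rpm does not report as installed.
missing_packages() {
  for pkg in "$@"; do
    rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
  done
}

missing=$(missing_packages gcc gcc-c++ glibc glibc-devel)
if [ -n "$missing" ]; then
  echo "Missing packages:" $missing
else
  echo "All required GCC packages are installed."
fi
```

Any package names that the script prints can be passed to sudo yum install. The same function accepts other package names, so it can also be used to check for bzip2.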

Install BZIP2

BZIP2 is required for unpacking the AnzoGraph tool set during installation. Run the following command to install bzip2:

sudo yum install bzip2

Install OpenJDK 11

AnzoGraph uses a Java client interface, called the Graph Data Interface (GDI), to access Data Sources when Data Source Profiling is performed or when data from remote endpoints is blended into Graphmarts. AnzoGraph also uses the Java client to communicate with Elasticsearch when Anzo Unstructured Graphmarts are deployed. Java Development Kit version 11 is required for using the Java client. Follow the instructions below to install OpenJDK on all servers in the cluster.

  1. Run the following command to install OpenJDK 11:
    sudo yum install java-11-openjdk

    Do not set the $JAVA_HOME variable to point to the JDK installation at this time. AnzoGraph's system management daemon requires JAVA_HOME, but the variable is set during the post-installation configuration (see Complete the Post-Installation Configuration). In addition, the Java plugin is deployed after AnzoGraph is installed.

  2. If your organization uses Anzo Unstructured, test the connection between the AnzoGraph server and Elasticsearch. Make sure that Elasticsearch is running and then run the following telnet command:
    telnet <Elasticsearch_server_IP> <port>

    By default, the port range for Elasticsearch requests (http.port) is 9200-9300. If port 9200 is not available when Elasticsearch is started, Elasticsearch tries 9201 and so on until it finds an accessible port. Specify the HTTP request port that Elasticsearch is using.
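
The two checks in the steps above can also be scripted, which may be convenient on hosts where telnet is not installed. This is a sketch only: the function names java_major_version and port_open are illustrative, the port probe relies on the /dev/tcp feature of bash and the timeout utility from coreutils, and the 3-second timeout is an arbitrary choice.

```shell
# Print the major version reported by "java -version" (for example, 11).
java_major_version() {
  java -version 2>&1 | head -n 1 | sed -n 's/.* version "\([0-9][0-9]*\).*/\1/p'
}

# Return 0 if a TCP connection to host $1, port $2 succeeds within 3 seconds.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

if [ "$(java_major_version)" = "11" ]; then
  echo "Java 11 is the default java on this host."
else
  echo "The default java on this host is not version 11."
fi

# Example (replace the placeholders with the values for your Elasticsearch host):
# port_open <Elasticsearch_server_IP> <port> && echo "Elasticsearch port is reachable"
```

The version check inspects only the java command found on the PATH. If several JDKs are installed, a different version may be the default even though OpenJDK 11 is present.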

For more information about the Graph Data Interface, see Blending Data from Remote Sources (Preview).

Configure User Resource Limits

Cambridge Semantics recommends that you tune the user resource limits (ulimits) for your Linux distribution to increase the limits for the following resources. Tune ulimits on all servers in the cluster.

  • Increase the open files (nofile) limit to at least 4096.
  • Increase the limit for the following resources to infinity:
    • address space (as)
    • CPU time (cpu)
    • file locks (locks)
    • file size (fsize)
    • max locked memory (memlock)
    • max user processes (nproc)

To view the current ulimits, run ulimit -a. To permanently change ulimits, modify the /etc/security/limits.conf file. For more information, see How to set ulimit values in the RHEL support documentation.
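
As an illustration of the recommended settings, the sketch below prints limits.conf entries for one account. The account name anzo is a placeholder for the account that runs AnzoGraph, and the function name limits_conf_entries is illustrative. The "-" type sets both the soft and the hard limit, and 4096 is the minimum recommended value for nofile.

```shell
# Print limits.conf entries for the given account ("-" sets both soft and hard limits).
limits_conf_entries() {
  user=$1
  # 4096 is the recommended minimum for open files; a higher value is also acceptable.
  echo "$user - nofile 4096"
  for item in as cpu locks fsize memlock nproc; do
    echo "$user - $item unlimited"
  done
}

# "anzo" is a placeholder for the account that runs AnzoGraph.
limits_conf_entries anzo
```

After reviewing the output, the entries could be appended with, for example, limits_conf_entries anzo | sudo tee -a /etc/security/limits.conf. The new limits apply to sessions that are started after the change, so log in again before checking them with ulimit -a.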

Typically, as part of post-installation configuration, systemd services are set up to start and stop the AnzoGraph processes. When systemd starts a process, however, it uses the limits that are defined in the systemd service rather than the limits in /etc/security/limits.conf. In addition to changing the ulimits in limits.conf, it is important to set the limits in the AnzoGraph system management service. The service file contents shown in Configure the AnzoGraph System Management Service include the recommended ulimit settings.
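
Because the limits of a process started by systemd can differ from the limits of a login shell, it can be useful to inspect the limits that a running process actually received. On Linux these are listed in /proc/<pid>/limits. The sketch below reads the soft limit for each resource named in this topic. The function name soft_limit is illustrative, and the example uses the PID of the current shell as a stand-in for the PID of an AnzoGraph process.

```shell
# Print the soft limit for a named row of /proc/<pid>/limits data read from stdin.
soft_limit() {
  awk -v name="$1" 'index($0, name) == 1 {
    rest = substr($0, length(name) + 1)   # drop the row label
    split(rest, field, " ")               # field[1] = soft limit, field[2] = hard limit
    print field[1]
  }'
}

# $$ is the PID of the current shell; substitute the PID of the AnzoGraph process.
pid=$$
if [ -r "/proc/$pid/limits" ]; then
  for row in "Max open files" "Max address space" "Max cpu time" "Max file locks" \
             "Max file size" "Max locked memory" "Max processes"; do
    echo "$row: $(soft_limit "$row" < "/proc/$pid/limits")"
  done
fi
```

The rows correspond, in order, to the nofile, as, cpu, locks, fsize, memlock, and nproc items in limits.conf.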

Unset Linux Proxy Variables

Make sure that the Linux environment variables http_proxy and https_proxy are not set on the servers. The Anzo gRPC protocol cannot make connections to the database when proxies are enabled.
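
One way to check for the variables is sketched below. It reports whichever of http_proxy and https_proxy is set to a non-empty value and unsets it for the current shell session. The function name proxy_vars_set is illustrative.

```shell
# Print the name of each proxy variable that is set to a non-empty value.
proxy_vars_set() {
  for name in http_proxy https_proxy; do
    eval "value=\${$name:-}"   # read the variable whose name is stored in $name
    if [ -n "$value" ]; then
      echo "$name"
    fi
  done
}

for name in $(proxy_vars_set); do
  echo "Unsetting $name"
  unset "$name"
done
```

Unsetting a variable this way affects only the current shell. If a variable is defined in a profile script or another startup file, remove it there as well so that it is not set again at the next login.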

Use the Anzo Service User Account when Installing AnzoGraph

Because AnzoGraph offers features such as user-defined extensions, it is not certified as secure software and should not be installed or run as the root user. In addition, because AnzoGraph accesses the data that Anzo writes to the shared file store, it is important to install and run AnzoGraph with the same service account that runs Anzo. For more information, see Anzo Service Account Requirements.
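
A quick pre-installation check of the current account might look like the following sketch. The account name anzo is a placeholder for your Anzo service account, and the function name check_install_user is illustrative. The optional second and third arguments override the user name and UID that id reports, which is mainly useful for testing the function.

```shell
# Return 0 only if the current user is the expected service account and is not root.
# The user name and UID default to the values reported by id.
check_install_user() {
  expected=$1
  current=${2:-$(id -un)}
  uid=${3:-$(id -u)}
  if [ "$uid" -eq 0 ]; then
    echo "Do not install or run AnzoGraph as root."
    return 1
  fi
  if [ "$current" != "$expected" ]; then
    echo "Current user is $current; expected $expected."
    return 1
  fi
  echo "Running as $current."
}

# "anzo" is a placeholder for the Anzo service account name.
check_install_user anzo || echo "Switch to the service account before installing."
```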

Note the IP Addresses of the Cluster Servers

If you are installing AnzoGraph in a clustered setup, note the IP address of each server in the cluster. The installation wizard prompts you to enter the IP addresses during the installation. In addition, choose one server to be the leader server.
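
To catch typing mistakes before the addresses are entered in the wizard, the recorded list can be given a basic format check. The sketch below assumes that the cluster uses IPv4 addresses, which this topic does not state. The addresses shown are placeholders from the range reserved for documentation, and the function name is_ipv4 is illustrative.

```shell
# Return 0 if $1 is a dotted-decimal IPv4 address (four fields, each 0-255).
is_ipv4() {
  echo "$1" | awk -F. '
    NF != 4 { exit 1 }
    { for (i = 1; i <= 4; i++) if ($i !~ /^[0-9]+$/ || $i > 255) exit 1 }
  '
}

# Placeholder addresses; list the leader server first so it is easy to identify.
for ip in 192.0.2.10 192.0.2.11 192.0.2.12; do
  if is_ipv4 "$ip"; then
    echo "ok: $ip"
  else
    echo "not a valid IPv4 address: $ip"
  fi
done
```

The check validates only the format of each address. It does not confirm that the servers are reachable.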

Once all of the prerequisites are in place, proceed to Install AnzoGraph for instructions on installing AnzoGraph.
