Installing AnzoGraph on a Single Server

This topic provides instructions for installing AnzoGraph on a single server. For information about server requirements, see AnzoGraph Requirements.

Important Because AnzoGraph offers features such as user-defined extensions, it is not certified as secure software and should not be installed or run as the root user. In addition, because AnzoGraph accesses the data that Anzo writes to the shared file store, install and run AnzoGraph with the same service account that runs Anzo. For more information, see Anzo Service User Account Requirements.

For instructions on setting up an AnzoGraph cluster, see Installing AnzoGraph on a Cluster.

  1. Complete the Pre-Installation Configuration
  2. Install AnzoGraph
  3. Complete the Post-Installation Configuration

Complete the Pre-Installation Configuration

Install the Required Software

Install GCC and BZIP2 (Required for all Deployments)

Make sure that the host server has the following software packages installed. These packages are required for all deployments:

  • GNU Compiler Collection (GCC): AnzoGraph requires the latest version of the GCC tools for your operating system. Run the following command to install GCC:
    sudo yum install gcc
    Note Specifically, AnzoGraph requires the glibc, glibc-devel, and gcc-c++ libraries. Typically, when you install GCC by running yum install gcc, those libraries are included as part of the package. In rare cases, depending on the host server configuration, installing GCC excludes certain libraries. If AnzoGraph fails to start and you receive a "Compilation failed" message, it may indicate that some of the required libraries are missing. To install the missing libraries, run the following command:
    sudo yum install glibc glibc-devel gcc-c++
  • bzip2: Required for unpacking the AnzoGraph tool set during installation. Run the following command to install bzip2:
    sudo yum install bzip2
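
As a quick pre-check, the following sketch reports which of the packages named above are already present. It assumes an RPM-based host (the same assumption the yum commands make). check_packages is a hypothetical helper name used for illustration and is not part of AnzoGraph.

```shell
# Hypothetical helper: report whether each named RPM package is installed.
# Returns 0 only if every package is present.
check_packages() {
  missing=0
  for pkg in "$@"; do
    if rpm -q "$pkg" >/dev/null 2>&1; then
      echo "ok       $pkg"
    else
      echo "MISSING  $pkg"
      missing=1
    fi
  done
  return "$missing"
}

# Check the packages that this topic lists as required for all deployments.
check_packages gcc gcc-c++ glibc glibc-devel bzip2 ||
  echo "Install the missing packages with: sudo yum install <package>"
```

Any package reported as MISSING can be installed with the yum commands shown above.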

Install OpenJDK 11 (Required for Anzo Unstructured and Data Toolkit Service Deployments)

AnzoGraph uses a Java client interface (datatoolkit-<version>.jar) to communicate with data sources when the Data Toolkit service is used to incorporate data from remote endpoints into graphmarts. AnzoGraph also uses the Java client to communicate with Elasticsearch when Anzo Unstructured graphmarts are deployed. Java Development Kit version 11 is required for using the Java client. Follow the instructions below to install OpenJDK.

  1. Run the following command to install OpenJDK 11:
    sudo yum install java-11-openjdk
    Note Do not set the $JAVA_HOME variable to use the JDK installation at this time. AnzoGraph's system management daemon requires JAVA_HOME, and it is set as part of the post-installation configuration. In addition, the Elasticsearch plugin is deployed after AnzoGraph is installed.
  2. If your organization uses Anzo Unstructured, test the connection between the AnzoGraph server and Elasticsearch. Make sure that Elasticsearch is running and then run the following telnet command:
    telnet <Elasticsearch_server_IP> <port>

    By default, the port range for Elasticsearch requests (http.port) is 9200-9300. If port 9200 is not available when Elasticsearch is started, Elasticsearch tries 9201 and so on until it finds an accessible port. Specify the HTTP request port that Elasticsearch is using.
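
If telnet is not installed on the AnzoGraph server, the same connectivity test can be sketched with bash's built-in /dev/tcp redirection. This is a minimal sketch, assuming bash and the coreutils timeout command are available. port_open and find_es_port are hypothetical helper names, and the IP address in the usage comment is a placeholder.

```shell
# Hypothetical helper: succeeds if a TCP connection to host $1, port $2
# opens within 3 seconds. Uses bash's /dev/tcp pseudo-device.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Hypothetical helper: print the first open port on host $1 in the default
# Elasticsearch http.port range (9200-9300). Returns 1 if none is open.
find_es_port() {
  port=9200
  while [ "$port" -le 9300 ]; do
    if port_open "$1" "$port"; then
      echo "$port"
      return 0
    fi
    port=$((port + 1))
  done
  return 1
}

# Usage (placeholder address):
#   find_es_port 10.0.0.5
```

An open port in this range is not proof that it is the Elasticsearch HTTP port, so confirm the result against the http.port value in the Elasticsearch configuration.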

Configure User Resource Limits

Cambridge Semantics recommends that you tune the user resource limits (ulimits) for your Linux distribution to increase the limits for the following resources:

  • Increase the open files limit to at least 4096.
  • Increase the limit for the following resources to unlimited:
    • cpu time
    • file locks
    • file size
    • max memory size
    • max user processes
    • virtual memory

To view the current ulimits, run ulimit -a. To permanently change ulimits, modify the /etc/security/limits.conf file. For more information, see How to set ulimit values in the RHEL support documentation.
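
The open files recommendation can be checked with a short script. This is a sketch only. nofile_ok is a hypothetical helper name, and the script should be run as the Anzo service user because limits apply per user.

```shell
# Hypothetical helper: succeeds when an open-files limit ($1) meets the
# recommended minimum of 4096. Accepts a number or the word "unlimited".
nofile_ok() {
  case "$1" in
    unlimited) return 0 ;;
    '' | *[!0-9]*) return 1 ;;
    *) [ "$1" -ge 4096 ] ;;
  esac
}

# Compare the current shell's open files limit against the recommendation.
current=$(ulimit -n)
if nofile_ok "$current"; then
  echo "open files limit is sufficient: $current"
else
  echo "open files limit is too low: $current (recommended: at least 4096)"
fi
```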

Note Also make sure that the Linux environment variables http_proxy and https_proxy are not set. The Anzo gRPC protocol cannot make connections to the database when proxies are enabled.
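
A minimal way to check for the proxy variables in the current shell is sketched below. proxy_vars_set is a hypothetical helper name. Note that unset only affects the current shell session, so if the variables are defined in a profile or environment file, they also need to be removed there.

```shell
# Hypothetical helper: succeeds if either proxy variable has a value.
proxy_vars_set() {
  [ -n "${http_proxy:-}" ] || [ -n "${https_proxy:-}" ]
}

if proxy_vars_set; then
  echo "proxy variables are set; clearing them for this shell session"
  unset http_proxy https_proxy
fi
```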

Install AnzoGraph

Follow the instructions below to install AnzoGraph.

Important Complete the steps below as the Anzo service user.
  1. If necessary, run the following command to become the Anzo service user:
    # su name

    Where name is the name of the service user. For example:

    # su anzo
  2. If necessary, run the following command to make the AnzoGraph installation script executable:
    chmod +x script_name
  3. Run the following command to start the installation wizard:
    ./script_name

    The script unpacks the JRE and then waits for input before starting the installation.

  4. Press Enter to proceed with the installation. The wizard displays the AnzoGraph license agreement.
  5. Review the license agreement. Press Enter to scroll through the terms. At the end of the agreement, type 1 to accept the terms or type 2 to disagree and stop the installation.
  6. The wizard prompts you to specify which components to install. Specify 1 (AnzoGraph) and press Enter.
  7. Specify the path and directory for the AnzoGraph installation. Press Enter to accept the default installation path or type an alternate path and then press Enter.
  8. At the server installation type prompt, accept the default option 1 (Standalone) and press Enter.
  9. Indicate whether this installation is for use with Anzo. Press Enter for Yes. Answering yes configures AnzoGraph to use the settings that are optimal for Anzo. Answering no configures the settings that are optimal for AnzoGraph standalone use.
  10. Set up the AnzoGraph admin user. Type a username to use for authentication. Anzo will use this username to connect to AnzoGraph. Then press Enter.
  11. Type a password for the Anzo username and press Enter. Note: Some special characters, such as $ and *, are treated as parameters in bash. When typing a password, avoid or escape special characters to remove their special meaning to the command line. For more information, see Quoting in the Bash Reference Manual.
  12. Configure any additional AnzoGraph settings. If Cambridge Semantics Support provided custom settings to use for your configuration, type the supplied values and then press Enter. Separate multiple settings with the new line escape sequence, \n. For example, the following entry sets two custom settings: truncate_clob=true\npersistence_directory=/data/.
    Note If you are installing AnzoGraph as the root user, add the following value to this prompt:
    enable_root_user=true
  13. The wizard extracts the AnzoGraph files and completes the installation. Proceed to Complete the Post-Installation Configuration below to complete the initial configuration and start the database.
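
For the additional settings prompt in step 12, the entry is a single line in which settings are separated by the two literal characters \ and n. The sketch below builds such a line from individual settings so that it can be pasted at the prompt. join_settings is a hypothetical helper name, and the two settings are the example values from step 12.

```shell
# Hypothetical helper: join settings with a literal backslash-n separator.
join_settings() {
  out=""
  for s in "$@"; do
    if [ -z "$out" ]; then
      out="$s"
    else
      out="$out\\n$s"
    fi
  done
  # %s prints the string as-is, so the \n sequence is not turned into a newline.
  printf '%s\n' "$out"
}

join_settings truncate_clob=true persistence_directory=/data/
```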

Complete the Post-Installation Configuration

Deploy the Data Toolkit Plugin (Anzo Unstructured and Data Toolkit Service Environments)

If your organization uses Anzo Unstructured or the Anzo Data Toolkit Service, copy the datatoolkit-1.0.0.jar plugin provided by Cambridge Semantics to the <install_path>/lib/udx directory on the AnzoGraph server.

Configure and Start the AnzoGraph Services (All Environments)

There are three processes involved in the initial AnzoGraph startup. Subsequent starts involve one or more of these processes, depending on the state of AnzoGraph and the server:

  1. The first process involves the configuration of the Linux kernel. The default kernel configuration for the following settings is not optimal for AnzoGraph:
    • transparent_hugepage: Transparent Huge Pages (THP) are enabled by default and can severely degrade AnzoGraph performance. THP should be disabled for AnzoGraph.
    • max_map_count: By default, the maximum number of memory map areas that a process can use is 65530. Since AnzoGraph is memory intensive, it may reach the maximum map count and be shut down by the operating system. AnzoGraph requires a value of 2097152.

    At startup, AnzoGraph checks these settings and returns a warning if the values are not suitable. You must either make the kernel changes or configure AnzoGraph to start with a non-optimal configuration. The AnzoGraph deployment includes a script (<install_path>/bin/azg_system_config) that makes the required kernel configuration changes. The script requires superuser privileges, and it must be run again each time the host server is rebooted because the kernel configuration reverts to the defaults.

  2. The second process involves the AnzoGraph system management daemon, azgmgrd. This very lightweight program manages AnzoGraph communication. Though azgmgrd is especially important for managing connections between servers in a cluster, it is also required for single server installations. It must be running to start the database, but it typically does not need to be restarted unless you are upgrading AnzoGraph or the host server is rebooted. It does not need to be stopped and started each time the database is restarted.
  3. The third process involves starting the database with the system manager.
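
The two kernel settings from the first process can be inspected without changing them. The following is a read-only sketch that uses the standard Linux locations for these values. thp_disabled and map_count_ok are hypothetical helper names, and treating 2097152 as a minimum rather than an exact value is an assumption.

```shell
# Hypothetical helper: the kernel marks the active THP mode with brackets,
# for example "always madvise [never]". Succeeds when THP is disabled.
thp_disabled() {
  case "$1" in
    *"[never]"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: succeeds when the map count ($1) is at least the
# value this topic names (assumption: higher values are also acceptable).
map_count_ok() {
  [ "$1" -ge 2097152 ] 2>/dev/null
}

thp_file=/sys/kernel/mm/transparent_hugepage/enabled
map_file=/proc/sys/vm/max_map_count

if [ -r "$thp_file" ]; then
  if thp_disabled "$(cat "$thp_file")"; then
    echo "transparent_hugepage: disabled"
  else
    echo "transparent_hugepage: enabled ($(cat "$thp_file"))"
  fi
fi

if [ -r "$map_file" ]; then
  if map_count_ok "$(cat "$map_file")"; then
    echo "max_map_count: $(cat "$map_file") (ok)"
  else
    echo "max_map_count: $(cat "$map_file") (AnzoGraph requires 2097152)"
  fi
fi
```

If either check reports an unsuitable value, the azg_system_config script described above is the supported way to apply the changes.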

To ensure that each step runs under the right account whenever the host server is rebooted (the root user makes the kernel changes, and the Anzo service account starts the system management daemon and the database), Cambridge Semantics recommends that you configure services to run the startup steps. This section provides instructions for configuring the three services.

Note Root user privileges are required to complete the tasks below.

Configure the Linux Kernel Configuration Service

Follow the instructions below to set up a service to apply the Linux kernel configuration changes any time the AnzoGraph host server is restarted.

Note If making the kernel changes is not possible, you can set the os_allow_alternate_vm_config value to true in the AnzoGraph settings file. This setting enables AnzoGraph to start with non-optimal Linux configurations. See Changing AnzoGraph Configuration Settings for instructions.
  1. Run the following command to copy the AnzoGraph system configuration script, azg_system_config, to the root directory:
    # cp /install_path/bin/azg_system_config /root/

    For example:

    # cp /opt/anzograph/bin/azg_system_config /root/
  2. Run the following command to remove "sudo" from the azg_system_config script:
    # sed -i 's/sudo//g' /root/azg_system_config
  3. Create a file called azg_system_config.service in the /usr/lib/systemd/system directory. For example:
    # vi /usr/lib/systemd/system/azg_system_config.service
  4. Add the following contents to azg_system_config.service:
    [Unit]
    Description=Configure Linux for AnzoGraph
    [Service]
    Type=oneshot
    ExecStart=/root/azg_system_config
    [Install]
    WantedBy=multi-user.target
    
  5. Save and close the file.
  6. Run the following commands to start and enable the new service:
    # systemctl start azg_system_config.service
    # systemctl enable azg_system_config.service

Configure the AnzoGraph System Management Service

Follow the instructions below to set up a service that starts the AnzoGraph system management daemon (azgmgrd) as the Anzo service user if the host server is restarted.

  1. Create a file called azgmgrd.service in the /usr/lib/systemd/system directory. For example:
    # vi /usr/lib/systemd/system/azgmgrd.service
  2. Add the following contents to azgmgrd.service:
    [Unit]
    Description=AnzoGraph communication service
    # depends on NetworkManager-wait-online.service enabled
    Wants=network-online.target
    After=network-online.target
    [Service]
    Type=forking
    # The PID file is optional but recommended so that systemd
    # can identify the main process of the daemon
    # PIDFile=/var/run/azgmgrd.pid
    WorkingDirectory=/install_path
    StandardOutput=syslog
    StandardError=syslog
    User=Anzo_service_user
    UMask=0022
    Environment=PATH=/sbin:/bin:/usr/sbin:/usr/bin:/install_path/bin:/install_path/tools/bin
    # Uncomment the following JAVA_HOME line for Anzo Unstructured and/or 
    # Data Toolkit Service environments 
    # Environment=JAVA_HOME=/usr/lib/jvm/jre-11
    ExecStart=/install_path/bin/azgmgrd /install_path/
    CPUAccounting=false
    MemoryAccounting=false
    [Install]
    WantedBy=multi-user.target

    For example:

    [Unit]
    Description=AnzoGraph communication service
    # depends on NetworkManager-wait-online.service enabled
    Wants=network-online.target
    After=network-online.target
    [Service]
    Type=forking
    # The PID file is optional but recommended so that systemd
    # can identify the main process of the daemon
    # PIDFile=/var/run/azgmgrd.pid
    WorkingDirectory=/opt/anzograph
    StandardOutput=syslog
    StandardError=syslog
    User=anzo
    UMask=0022
    Environment=PATH=/sbin:/bin:/usr/sbin:/usr/bin:/opt/anzograph/bin:/opt/anzograph/tools/bin
    # Uncomment the following JAVA_HOME line for Anzo Unstructured and/or 
    # Data Toolkit Service environments
    Environment=JAVA_HOME=/usr/lib/jvm/jre-11
    ExecStart=/opt/anzograph/bin/azgmgrd /opt/anzograph/
    CPUAccounting=false
    MemoryAccounting=false
    [Install]
    WantedBy=multi-user.target
  3. Save and close the file.
  4. Run the following commands to start and enable the new service:
    # systemctl start azgmgrd.service
    # systemctl enable azgmgrd.service

Configure the AnzoGraph Database Service

Follow the instructions below to set up a service that will start AnzoGraph as the Anzo service user. This service is configured to run after the system management daemon is started.

  1. Create a file called anzograph.service in the /usr/lib/systemd/system directory. For example:
    # vi /usr/lib/systemd/system/anzograph.service
  2. Add the following contents to anzograph.service:
    [Unit]
    Description=AnzoGraph database service
    After=azgmgrd.service
    Wants=azgmgrd.service
    [Service]
    Type=forking
    # The PID file is optional but recommended so that systemd
    # can identify the main process of the daemon
    # PIDFile=/var/run/anzograph.pid
    WorkingDirectory=/install_path
    StandardOutput=syslog
    StandardError=syslog
    RemainAfterExit=no
    Restart=on-failure
    RestartSec=60s
    User=Anzo_service_user
    UMask=0022
    Environment=PATH=/sbin:/bin:/usr/sbin:/usr/bin:/install_path/bin:/install_path/tools/bin
    ExecStart=/install_path/bin/azgctl -start
    [Install]
    WantedBy=multi-user.target

    For example:

    [Unit]
    Description=AnzoGraph database service
    After=azgmgrd.service
    Wants=azgmgrd.service
    [Service]
    Type=forking
    # The PID file is optional but recommended so that systemd
    # can identify the main process of the daemon
    # PIDFile=/var/run/anzograph.pid
    WorkingDirectory=/opt/anzograph
    StandardOutput=syslog
    StandardError=syslog
    RemainAfterExit=no
    Restart=on-failure
    RestartSec=60s
    User=anzo
    UMask=0022
    Environment=PATH=/sbin:/bin:/usr/sbin:/usr/bin:/opt/anzograph/bin:/opt/anzograph/tools/bin
    ExecStart=/opt/anzograph/bin/azgctl -start
    [Install]
    WantedBy=multi-user.target
  3. Save and close the file.
  4. Run the following commands to start and enable the new service:
    # systemctl start anzograph.service
    # systemctl enable anzograph.service
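
To confirm that the three services are registered and running, a status summary can be sketched as follows. unit_state is a hypothetical helper name. The sketch relies only on the standard systemctl is-enabled and is-active subcommands and reports "unknown" if systemctl is unavailable or does not recognize the unit.

```shell
# Hypothetical helper: print "<unit> enabled=<state> active=<state>".
unit_state() {
  if command -v systemctl >/dev/null 2>&1; then
    # Both subcommands print a state and exit non-zero for states such as
    # "disabled" or "inactive", so keep the printed state when there is one.
    enabled=$(systemctl is-enabled "$1" 2>/dev/null) || enabled=${enabled:-unknown}
    active=$(systemctl is-active "$1" 2>/dev/null) || active=${active:-unknown}
  else
    enabled=unknown
    active=unknown
  fi
  printf '%s enabled=%s active=%s\n' "$1" "$enabled" "$active"
}

# Units in the order they come up after a reboot.
for unit in azg_system_config.service azgmgrd.service anzograph.service; do
  unit_state "$unit"
done
```

Because azg_system_config.service is a oneshot unit without RemainAfterExit, it is expected to report active=inactive once the script has finished. The other two units should report active=active.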

Once the services are in place and enabled, AnzoGraph should be running. To stop and start the database from now on, run the following systemctl commands: sudo systemctl stop anzograph and sudo systemctl start anzograph. You do not need to stop and start azgmgrd.

For instructions on configuring the connection to AnzoGraph in the Anzo console, see Connecting to AnzoGraph.
