Post-Installation Configuration

This topic provides instructions for completing the required and optional post-installation configuration of AnzoGraph DB.

Installing the C++ Dependencies

The C++ dependencies must be installed on all servers in the cluster to support the C++ extensions that AnzoGraph DB offers, including the remote read (load) and write service, the Data Science functions, and the integration with Apache Arrow. The installer provides a .repo file to help you configure the Cambridge Semantics repository and install the required software packages with or without internet access.

The ability to write to the /etc/yum.repos.d directory requires root access permissions. See Adding, Enabling, and Disabling a YUM Repository in the Red Hat documentation for more information on defining and using yum repositories.

The following packages need to be installed on each host server in the cluster:

libarchive
libarmadillo12
libboost_filesystem1_80_0
libboost_iostreams1_80_0
libboost_system1_80_0
libflatbuffers2
libhdfs3
libnfs13
libserd-0-0
libsmb2
shadow-utils
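
Before installing, it can help to see which of the packages above are already present on a host. The following is a minimal sketch, assuming an RPM-based host where rpm -q is available. The function name missing_packages is illustrative and is not part of the AnzoGraph DB installer.

```shell
#!/usr/bin/env bash
# Print each package name that `rpm -q` reports as not installed.
# missing_packages is a hypothetical helper, not part of the installer.
missing_packages() {
  local pkg
  for pkg in "$@"; do
    # rpm -q exits with a non-zero status when the package is not installed.
    if ! rpm -q "$pkg" >/dev/null 2>&1; then
      echo "$pkg"
    fi
  done
}

# Example call, using package names from the list above:
# missing_packages libarchive libarmadillo12 libflatbuffers2 libhdfs3 libnfs13 libsmb2
```

An empty result suggests that all of the named packages are already installed on that host.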

Install Dependencies via Internet Access to the Cambridge Semantics Repository

Follow the steps below if the AnzoGraph DB servers have external internet access and you want to install the dependencies directly from the Cambridge Semantics repository.

  1. Copy the csi-obs-cambridgesemantics-udxcontrib.repo file from the <install_path>/pre-req/yum.repos.d directory to the /etc/yum.repos.d directory. For example, the following command copies the file from the default installation path to /etc/yum.repos.d:
    sudo cp /opt/cambridgesemantics/pre-req/yum.repos.d/csi-obs-cambridgesemantics-udxcontrib.repo /etc/yum.repos.d
  2. Next, run the following command to enable the repository and install the required packages:
    sudo dnf install --enablerepo=crb $(cat <install_path>/pre-req/rh9-anzograph-requirements.txt)

    For example, on a server where AnzoGraph DB is installed in the default location:

    sudo dnf install --enablerepo=crb $(cat /opt/cambridgesemantics/pre-req/rh9-anzograph-requirements.txt)
  3. Repeat these steps on all servers in the cluster.
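
To preview what step 2 would install before running it, a small helper can print the full dnf command without executing it. This is a sketch only: print_install_cmd is a hypothetical name, and it assumes the requirements file contains whitespace-separated package names, which is what the $(cat ...) form in step 2 implies.

```shell
#!/usr/bin/env bash
# Print the dnf command from step 2 with the package list expanded,
# without running it. print_install_cmd is a hypothetical helper.
# Argument: the AnzoGraph DB <install_path> (defaults to the default location).
print_install_cmd() {
  local install_path="${1:-/opt/cambridgesemantics}"
  local req_file="$install_path/pre-req/rh9-anzograph-requirements.txt"
  local pkgs
  if [ ! -r "$req_file" ]; then
    echo "requirements file not found: $req_file" >&2
    return 1
  fi
  # xargs with no command joins the whitespace-separated names on one line.
  pkgs="$(xargs < "$req_file")"
  echo "sudo dnf install --enablerepo=crb $pkgs"
}

# Example call for the default installation path:
# print_install_cmd /opt/cambridgesemantics
```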

Install Dependencies without Internet Access via the Repository Mirror (tarball)

Follow the steps below if the AnzoGraph DB servers do not have external internet access and you want to install the dependencies from the mirrored Cambridge Semantics repository. The steps below give instructions for copying the repository to each AnzoGraph DB host server and configuring the .repo file accordingly. You can also choose to set up the mirror on a remote server that each of the AnzoGraph DB servers can access.

  1. From a computer that does have internet access, download the dependency tarball, csi-obs-cambridgesemantics-udxcontrib.rocky9.tar.xz, from the following Cambridge Semantics Google Cloud Storage location: https://storage.googleapis.com/csi-anzograph/udx/csi-os-contrib/rocky9/2023-03/20230321945/csi-obs-cambridgesemantics-udxcontrib.rocky9.tar.xz.

    You can run the following cURL command to download the tarball. To also download the accompanying checksum file, repeat the command with .sha512 appended to the URL:

    curl -OL https://storage.googleapis.com/csi-anzograph/udx/csi-os-contrib/rocky9/2023-03/20230321945/csi-obs-cambridgesemantics-udxcontrib.rocky9.tar.xz
  2. Also from the computer that has internet access, download the repomd.xml.key from the following Cambridge Semantics Google Cloud Storage location: https://storage.googleapis.com/csi-rpmmd-pd/CambridgeSemantics:/UDXContrib/ubi-9/repodata/repomd.xml.key.

    You can run the following cURL command to download the file:

    curl -OL https://storage.googleapis.com/csi-rpmmd-pd/CambridgeSemantics:/UDXContrib/ubi-9/repodata/repomd.xml.key
    
  3. On each of the AnzoGraph DB servers, create a directory called /tmp/repo.
  4. Copy csi-obs-cambridgesemantics-udxcontrib.rocky9.tar.xz to the /tmp/repo directory on each server.
  5. Then run the following command to unpack the tarball in the /tmp/repo directory:
    tar -xvf csi-obs-cambridgesemantics-udxcontrib.rocky9.tar.xz

    The files are unpacked into subdirectories under /tmp/repo/dl/rocky9/csi-obs-cambridgesemantics-udxcontrib.

  6. Next, copy the repomd.xml.key file to the /tmp/repo/dl/rocky9/csi-obs-cambridgesemantics-udxcontrib directory on each of the AnzoGraph DB servers.
  7. Now, open the csi-obs-cambridgesemantics-udxcontrib.repo file in the <install_path>/pre-req/yum.repos.d directory. The contents of the file are shown below:
    [csi-obs-cambridgesemantics-udxcontrib]
    name=Contrib directory for CambridgeSemantics AnzoGraph UDX dependencies
    baseurl=https://storage.googleapis.com/csi-rpmmd-pd/CambridgeSemantics:/UDXContrib/ubi-9
    gpgkey=https://storage.googleapis.com/csi-rpmmd-pd/CambridgeSemantics:/UDXContrib/ubi-9/repodata/repomd.xml.key
    gpgcheck=1
    enabled=1
  8. Edit the csi-obs-cambridgesemantics-udxcontrib.repo file contents to replace the baseurl and gpgkey values so that they point to the repo files that you unpacked in the /tmp/repo directory. In addition, change the gpgcheck and enabled values from 1 to 0. The contents of the updated file are shown below:
    [csi-obs-cambridgesemantics-udxcontrib]
    name=Contrib directory for CambridgeSemantics AnzoGraph UDX dependencies
    baseurl=file:///tmp/repo/dl/rocky9/csi-obs-cambridgesemantics-udxcontrib
    gpgkey=file:///tmp/repo/dl/rocky9/csi-obs-cambridgesemantics-udxcontrib/repomd.xml.key
    gpgcheck=0
    enabled=0
  9. Save and close the file.
  10. Copy csi-obs-cambridgesemantics-udxcontrib.repo from <install_path>/pre-req/yum.repos.d to the /etc/yum.repos.d directory. For example, the following command copies the file from the default installation path to /etc/yum.repos.d:
    sudo cp /opt/cambridgesemantics/pre-req/yum.repos.d/csi-obs-cambridgesemantics-udxcontrib.repo /etc/yum.repos.d
  11. Next, run the following command to enable the repository and install the required packages:
    sudo dnf install --enablerepo=crb $(cat <install_path>/pre-req/rh9-anzograph-requirements.txt)

    For example, on a server where AnzoGraph DB is installed in the default location:

    sudo dnf install --enablerepo=crb $(cat /opt/cambridgesemantics/pre-req/rh9-anzograph-requirements.txt)

Repeat the steps above as needed to install the dependencies on all servers in the cluster.
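
The manual edit in steps 7 and 8 can also be scripted when many servers need the same change. The sketch below assumes the .repo file has the key=value layout shown in step 7. The function name point_repo_at_mirror is hypothetical, and the sed call keeps a .bak copy of the original file.

```shell
#!/usr/bin/env bash
# Rewrite a .repo file so that it points at the unpacked local mirror and
# sets gpgcheck and enabled to 0, as described in step 8.
# point_repo_at_mirror is a hypothetical helper.
# Arguments: <repo_file> <mirror_dir>
point_repo_at_mirror() {
  local repo_file="$1"
  local mirror_dir="$2"
  # -i.bak edits in place and keeps the original as <repo_file>.bak.
  sed -i.bak \
    -e "s|^baseurl=.*|baseurl=file://$mirror_dir|" \
    -e "s|^gpgkey=.*|gpgkey=file://$mirror_dir/repomd.xml.key|" \
    -e "s|^gpgcheck=.*|gpgcheck=0|" \
    -e "s|^enabled=.*|enabled=0|" \
    "$repo_file"
}

# Example call for the default installation path and the /tmp/repo mirror:
# point_repo_at_mirror \
#   /opt/cambridgesemantics/pre-req/yum.repos.d/csi-obs-cambridgesemantics-udxcontrib.repo \
#   /tmp/repo/dl/rocky9/csi-obs-cambridgesemantics-udxcontrib
```

Because the mirror directory is an absolute path, prefixing it with file:// yields the file:/// form shown in step 8.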

Optimizing the Linux Kernel Configuration for AnzoGraph DB

To streamline the configuration of the operating system for peak AnzoGraph DB performance, the installer includes a tuned profile that you can activate. Tuned is a daemon program that uses the udev device monitor to statically and dynamically tune operating system settings based on the specified profile. It is highly recommended that you activate the AnzoGraph DB tuned profile.

For an overview of the tuned daemon and more information on using the tuned service to improve the performance of specific workloads, see Tuned and Performance Tuning with Tuned and Tuned-ADM in the Red Hat Performance Tuning Guide.

Activating the Tuned Profile

The profile, called azg, is in the <install_path>/examples/tuned-profile directory and consists of two files: tuned.conf and additional-tuneables.sh. For details about the files, see Tuned Profile Reference below.

The ability to write to the /etc/tuned directory and activate the tuned profile requires root access permissions.

To activate the azg profile, follow the steps below. Complete these steps on all servers in the cluster:

  1. If you ran the installer with root (sudo) privileges, you can skip this step. The installer copied the tuned profile to the /etc/tuned directory, but it did not automatically activate the profile. If you ran the installer as a non-root user, copy the azg directory from <install_path>/examples/tuned-profile to the /etc/tuned directory. For example, the following command copies azg from the default installation path to /etc/tuned:
    sudo cp -r /opt/cambridgesemantics/examples/tuned-profile/azg /etc/tuned
  2. Next, make sure that tuned is installed. Tuned is installed by default with RHEL 9. If you are using Rocky Linux 9, you may need to install tuned. You can run the following command to install the program:
    sudo dnf install -y tuned
  3. Run the following command to activate the azg profile:
    sudo tuned-adm profile azg

The host servers are now configured to use the tuned profile that is optimal for AnzoGraph DB.
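
To confirm the result on each host, you can check the output of tuned-adm active, which prints a line of the form "Current active profile: <name>". The helper below is a minimal sketch that reads that output on standard input. The name reports_profile is hypothetical.

```shell
#!/usr/bin/env bash
# Read the output of `tuned-adm active` on stdin and succeed only if it
# names the given profile. reports_profile is a hypothetical helper.
reports_profile() {
  grep -q "^Current active profile: $1\$"
}

# Example call:
# tuned-adm active | reports_profile azg && echo "azg is active"
```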

To disable tuned profiles, you can run the following command:

sudo tuned-adm off

After running the command, no tuned profiles will be active.

Tuned Profile Reference

This section describes the tuned AnzoGraph DB profile files and the kernel configuration changes that they apply.

tuned.conf

The table below describes the Linux kernel configuration settings that are modified by tuned.conf.

| Setting | Description | AZG Profile Change |
|---|---|---|
| vm.dirty_ratio | Specifies the percentage of system memory that can be occupied by "dirty" data before flushing the cache to disk. Dirty data are pages in memory that have been updated and do not match what is stored on disk. | Reduces vm.dirty_ratio to 2% to increase the frequency with which the system cache is flushed. |
| vm.swappiness | Controls the tendency of the kernel to move processes out of physical memory and onto the swap disk. A value of 0 means the kernel avoids swapping processes out of physical memory for as long as possible. A value of 100 tells the kernel to aggressively swap processes out of physical memory to the swap disk. | Sets vm.swappiness to 30. |
| vm.max_map_count | Sets the limit on the maximum number of memory map areas a process can use. Since AnzoGraph DB is memory intensive, it may reach the default maximum map count of 65535 and be shut down by the operating system. | Increases vm.max_map_count to 2097152. |
| net.ipv4.tcp_rmem | Controls the size of the receive buffer for TCP connections. It sets the minimum, default, and maximum sizes of the buffer in bytes. | Sets tcp_rmem to "4096 87380 16777216". |
| net.ipv4.tcp_wmem | Controls the size of the send buffer for TCP connections. It sets the minimum, default, and maximum sizes of the buffer in bytes. | Sets tcp_wmem to "4096 16384 16777216". |
| net.ipv4.udp_mem | Controls the amount of memory that can be allocated for the kernel's UDP buffer. It sets the minimum, default, and maximum sizes of the buffer in bytes. | Sets udp_mem to "3145728 4194304 16777216". |
| transparent_hugepages | Controls whether Transparent Huge Pages (THP) is enabled or disabled system-wide. When THP is enabled system-wide, it can dramatically degrade AnzoGraph DB performance. | Disables THP by setting transparent_hugepages to never. |
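
After the profile is activated, the sysctl values in the table can be spot-checked on a host. The following is a minimal sketch that compares live values from sysctl -n with the values documented above. The function name check_sysctl is hypothetical and the script is not shipped with the profile.

```shell
#!/usr/bin/env bash
# Compare a live sysctl value with an expected value.
# check_sysctl is a hypothetical helper. Arguments: <key> <expected>
check_sysctl() {
  local key="$1" expected="$2" actual
  # Multi-value settings are tab-separated, so normalize whitespace first.
  actual="$(sysctl -n "$key" 2>/dev/null | tr -s '[:space:]' ' ' | sed 's/ *$//')"
  if [ "$actual" = "$expected" ]; then
    echo "OK       $key"
  else
    echo "MISMATCH $key (expected: $expected)"
    return 1
  fi
}

# Example calls, with expected values taken from the table above:
# check_sysctl vm.dirty_ratio 2
# check_sysctl vm.swappiness 30
# check_sysctl vm.max_map_count 2097152
# check_sysctl net.ipv4.tcp_rmem "4096 87380 16777216"
# check_sysctl net.ipv4.tcp_wmem "4096 16384 16777216"
```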

additional-tuneables.sh

The additional-tuneables.sh script is called by tuned.conf and configures the following Linux kernel configuration settings so that they are optimal for AnzoGraph DB.

| Setting | Description | AZG Profile Change |
|---|---|---|
| overcommit_memory | Controls whether obvious overcommits of the address space are allowed. | Sets overcommit_memory to 0 to ensure that very large overcommits are not allowed but some overcommits can be used to reduce swap usage. |
| overcommit_ratio | Controls the percentage of memory that is allowed to be used for overcommits. | Sets overcommit_ratio to 50%. |
| transparent_hugepage/defrag | Though the tuned profile disables Transparent Huge Pages (THP) system-wide, this setting controls whether huge pages can still be enabled on a per process basis (inside MADV_HUGEPAGE madvise regions). | Sets transparent_hugepage/defrag to madvise so that the kernel only assigns huge pages to individual process memory regions that are specified with the madvise() system call. |
| tcp_timestamps | Controls whether TCP timestamps are enabled or disabled. | Sets tcp_timestamps to 0, which disables TCP timestamps in order to reduce performance spikes related to timestamp generation. |
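
These settings can be read back from the kernel's /proc and /sys files. The sketch below assumes the standard Linux locations listed in its comments. The helper name active_value is hypothetical. For files such as transparent_hugepage/defrag, which list every option and mark the active one in square brackets, it prints only the bracketed value.

```shell
#!/usr/bin/env bash
# Print the active value stored in a kernel settings file.
# If the file marks the active option with brackets, for example
# "always defer [madvise] never", print only the bracketed option.
# active_value is a hypothetical helper. Argument: <file>
active_value() {
  local file="$1" content
  content="$(cat "$file")" || return 1
  case "$content" in
    *\[*\]*) echo "$content" | sed 's/.*\[\([^]]*\)].*/\1/' ;;
    *)       echo "$content" ;;
  esac
}

# Example calls, using the standard Linux paths for the settings above:
# active_value /proc/sys/vm/overcommit_memory
# active_value /proc/sys/vm/overcommit_ratio
# active_value /sys/kernel/mm/transparent_hugepage/defrag
# active_value /proc/sys/net/ipv4/tcp_timestamps
```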

Configuring the AnzoGraph DB Services and Starting the Database

When running the installer as root (sudo), the installer automatically creates AnzoGraph DB systemd services in the /etc/systemd/system directory for azgmgrd, anzograph, and jetty (if the frontend is installed). In addition, the installer asks if you want to automatically start the services at the end of the installation. If AnzoGraph DB is already running, see Get Started for next steps.

When running the installer as a non-root user, the installer does not automatically create systemd services in the /etc/systemd/system directory, but example service files are available in the <install_path>/examples/systemd-services directory for you to customize and enable manually. It is highly recommended that you implement the services because they are configured to tune user resource limits (ulimits) for the AnzoGraph DB process as well as set $JAVA_HOME to the JVM path.

The azgmgrd service must be enabled on all host servers. The anzograph and jetty services, however, should be enabled only in single-server environments and on the leader node of a cluster. Do not start the anzograph and jetty services on the compute/worker nodes in a cluster.

After tailoring the service files to your environment, follow these steps to enable and start the services:

  1. Copy the azgmgrd service file from the <install_path>/examples/systemd-services directory to the /etc/systemd/system directory on each server in the cluster.
  2. Then run the following commands on each server in the cluster to start the azgmgrd service and enable the service to start automatically each time the host server is started.
    sudo systemctl start azgmgrd.service
    sudo systemctl enable azgmgrd.service 
  3. For single-server environments and on the leader node in a cluster, copy the anzograph and jetty files from the <install_path>/examples/systemd-services directory to the /etc/systemd/system directory. Do not copy these service files to compute/worker nodes.
  4. Next, run the following commands to start the anzograph and jetty services and enable the services to start automatically each time the host server is started.
    sudo systemctl start anzograph.service
    sudo systemctl enable anzograph.service
    sudo systemctl start jetty.service
    sudo systemctl enable jetty.service 
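
Once the services are started, their state can be checked with systemctl is-active and systemctl is-enabled. The following is a minimal sketch. The function name service_status is hypothetical.

```shell
#!/usr/bin/env bash
# Report whether each given systemd unit is both active and enabled.
# Returns non-zero if any unit fails either check.
# service_status is a hypothetical helper.
service_status() {
  local unit rc=0
  for unit in "$@"; do
    if systemctl is-active --quiet "$unit" && systemctl is-enabled --quiet "$unit"; then
      echo "$unit: active and enabled"
    else
      echo "$unit: not active or not enabled"
      rc=1
    fi
  done
  return "$rc"
}

# Example calls:
# On every server:                           service_status azgmgrd.service
# On a single server or the leader node:     service_status anzograph.service jetty.service
```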

If you do not employ the systemd services, make sure that you manually set $JAVA_HOME to the JVM installation path. If you set up a cluster, set $JAVA_HOME on each server in the cluster.
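
If you set $JAVA_HOME manually, a quick check can catch a wrong path before the database is started. The sketch below only tests that the variable is set and that it contains an executable bin/java. The function name java_home_ok is hypothetical, and the path in the example comment is a placeholder, not a documented location.

```shell
#!/usr/bin/env bash
# Succeed if JAVA_HOME is set and contains an executable bin/java.
# java_home_ok is a hypothetical helper.
java_home_ok() {
  [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/java" ]
}

# Example use (replace the placeholder with the JVM installation path):
# export JAVA_HOME=/path/to/jvm
# java_home_ok || echo "JAVA_HOME is not set to a usable JVM" >&2
```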

For next steps, see Get Started.