Pre-Installation Requirements
This page describes the installation requirements and other important information to know before you install Graph Lakehouse. The list below summarizes the requirements and recommendations:
- Make sure that the host server operating system is RHEL or Rocky Linux 9.3+ and that the server has at least 16 GB RAM and 40 GB disk space available for Graph Lakehouse. For more information about the hardware, software, and firewall requirements, see Server and Cluster Requirements.
- Certain software packages are required to be installed before the Graph Lakehouse installation. The installer will not run until these prerequisites are installed. See Prerequisite Software for details and instructions.
- Additional dependencies must be installed to support Graph Lakehouse extensions like the remote read (load) and write service, the Data Science functions, and Apache Arrow integration. However, Altair recommends that you deploy these dependencies after Graph Lakehouse is installed because the installation includes a .repo file that can aid you in installing the packages. See Post-Installation C++ Dependencies for details.
- When the installer is run with elevated privileges (sudo mode), the installer can complete the Graph Lakehouse installation as well as the important post-installation configuration so that Graph Lakehouse is running and ready to use when the installation is finished. See Installation Modes and User Accounts for details about the installation modes and Graph Lakehouse users.
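The operating system, memory, and disk minimums in the list above can be checked before you start. The following is a minimal sketch rather than an official tool: the function names and the default target directory are illustrative assumptions, GB is treated as GiB, the 9.3 minimum is applied to both RHEL and Rocky Linux, and RAM is rounded up because /proc/meminfo reports slightly less than the installed amount.

```shell
#!/usr/bin/env bash
# Pre-flight sketch. Thresholds come from this page; names are hypothetical.
MIN_OS_VERSION="9.3"
MIN_RAM_GB=16
MIN_DISK_GB=40

# Succeeds when version $1 is greater than or equal to version $2.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Converts a size in KiB to whole GiB, rounding up.
kib_to_gib() {
    echo $(( ($1 + 1048575) / 1048576 ))
}

# Prints one line per requirement for the current host.
# $1 is the directory whose file system will hold the installation
# (an assumption; this page does not name an install path).
preflight_report() {
    local target_dir="${1:-/}"
    local os_id os_version ram_kib disk_kib
    os_id="$(. /etc/os-release && echo "$ID")"
    os_version="$(. /etc/os-release && echo "$VERSION_ID")"
    ram_kib="$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
    disk_kib="$(df -Pk "$target_dir" | awk 'NR==2 {print $4}')"

    if [ "$os_id" != "rhel" ] && [ "$os_id" != "rocky" ]; then
        echo "OS: $os_id is not RHEL or Rocky Linux"
    elif version_at_least "$os_version" "$MIN_OS_VERSION"; then
        echo "OS: ok ($os_id $os_version)"
    else
        echo "OS: $os_id $os_version is older than $MIN_OS_VERSION"
    fi

    if [ "$(kib_to_gib "$ram_kib")" -ge "$MIN_RAM_GB" ]; then
        echo "RAM: ok"
    else
        echo "RAM: less than ${MIN_RAM_GB} GB"
    fi

    # Free space is rounded down so a nearly full disk is not reported as ok.
    if [ $(( disk_kib / 1048576 )) -ge "$MIN_DISK_GB" ]; then
        echo "Disk: ok"
    else
        echo "Disk: less than ${MIN_DISK_GB} GB available under $target_dir"
    fi
}
```

After sourcing the script, call `preflight_report /path/to/install/location` on each server. The report is advisory only; Server and Cluster Requirements remains the authoritative list.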
Prerequisite Software
The following software must be installed on the host servers before Graph Lakehouse is installed.
- Install a Java 21 Virtual Environment
- Install the GNU C Devel Library
- Install the GNU Binutils Library
Install a Java 21 Virtual Environment
All Graph Lakehouse servers are required to include a Java 21 virtual environment. OpenJDK 21 and GraalVM 21 are supported. Install the JVM on all servers in the cluster. For example, you can run the following command to install OpenJDK 21:
sudo dnf install java-21-openjdk
You do not need to set the $JAVA_HOME variable to use the Java installation. Graph Lakehouse's system management daemon (azgmgrd) requires $JAVA_HOME, but the variable is set when the services are configured as part of the installation (see Configuring the Graph Lakehouse Services and Starting the Database).
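To confirm that a server has a Java 21 runtime on its PATH, you can inspect the banner printed by `java -version`. This is a minimal sketch; the function names are illustrative and are not part of Graph Lakehouse.

```shell
#!/usr/bin/env bash
# Extracts the major version from the first line of `java -version` output,
# for example: openjdk version "21.0.2" 2024-01-16
java_major_version() {
    printf '%s\n' "$1" | sed -n 's/.*version "\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeeds when the java binary found on PATH reports major version 21.
check_java_21() {
    local banner
    # `java -version` writes to stderr, so redirect it into the capture.
    banner="$(java -version 2>&1 | head -n 1)"
    [ "$(java_major_version "$banner")" = "21" ]
}
```

For example, `check_java_21 && echo "Java 21 found" || echo "Java 21 not found on PATH"`. The check only looks at the `java` command on PATH, so a JVM installed elsewhere is not detected.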
Install the GNU C Devel Library
All Graph Lakehouse servers are required to include the latest version of the GNU C glibc-devel library for your operating system. On all servers in the cluster, run the following command to install glibc-devel:
sudo dnf install glibc-devel
Install the GNU Binutils Library
All Graph Lakehouse servers are required to include the latest version of the GNU binutils library for your operating system. On all servers in the cluster, run the following command to install binutils:
sudo dnf install binutils
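Because the installer will not run until the prerequisites are present, it can help to confirm on each server which packages are still missing. The sketch below queries the RPM database with `rpm -q`. The `missing_packages` function and the `PKG_QUERY` variable are illustrative assumptions, not part of Graph Lakehouse.

```shell
#!/usr/bin/env bash
# PKG_QUERY is a hypothetical hook so the check can be pointed at another
# query command; it defaults to `rpm -q`.
PKG_QUERY="${PKG_QUERY:-rpm -q}"

# Prints the name of each package in "$@" that is not installed.
missing_packages() {
    local pkg
    for pkg in "$@"; do
        # `rpm -q` exits non-zero when the package is not installed.
        $PKG_QUERY "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}
```

Running `missing_packages glibc-devel binutils` prints nothing when both libraries are installed. Java is left out of the package list because a GraalVM installation might not come from the java-21-openjdk package.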
Post-Installation C++ Dependencies
Additional libraries are required to be installed on all servers in the cluster to support the C++ extensions that Graph Lakehouse offers, including the remote read (load) and write service, the Data Science functions, and the integration with Apache Arrow. Though you can install the C++ dependencies before you install Graph Lakehouse, if you wait until after the installation you can use the included csi-obs-cambridgesemantics-udxcontrib.repo file to enable the Cambridge Semantics repository and install the C++ dependencies with or without internet access. For more information, see Installing the C++ Dependencies in the post-installation instructions.
Installation Modes and User Accounts
You can run the installer in one of two modes: root (sudo) or non-root (current user). This section describes both modes and the user account and file ownership implications of each.
Mode | Description |
---|---|
Sudo Mode | Running the installer in sudo mode is the preferred method of installation. In sudo mode, the installer prompts you to enter the Graph Lakehouse service user name. Systemd units for the system management daemon (azgmgrd) and database (anzograph) processes are created in /etc/systemd/system. The units start Graph Lakehouse as the specified user, and file system permissions for the anzograph directory and any files that Graph Lakehouse writes are based on the same user. The services also configure the appropriate resource limits (ulimits) for Graph Lakehouse and set $JAVA_HOME for your Java or GraalVM installation. |
Non-Root Mode | When the installer is run as a non-root user, it does not create users, and file system permissions are based on the user account that performs the installation. Example systemd units, in the <install_path>/examples directory, are provided as a template for you to configure and enable manually. For more information, see Configuring the Graph Lakehouse Services and Starting the Database in the post-installation instructions. |
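After a sudo mode installation, one way to confirm that the systemd units were created is to list the unit files in /etc/systemd/system. This page does not give the exact unit file names, so the sketch below matches on the process names (azgmgrd and anzograph) instead of assuming full file names. The function name is illustrative.

```shell
#!/usr/bin/env bash
# Lists regular files in a unit directory whose names mention the system
# management daemon (azgmgrd) or the database (anzograph) process.
find_graph_units() {
    local unit_dir="${1:-/etc/systemd/system}"
    find "$unit_dir" -maxdepth 1 -type f \
        \( -name '*azgmgrd*' -o -name '*anzograph*' \) | sort
}
```

Calling `find_graph_units` with no argument checks /etc/systemd/system. After a non-root installation the list is expected to be empty until you configure and enable the example units manually.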
Once the prerequisites are in place, proceed to Install Graph Lakehouse for instructions on installing the software.