Enabling Persistence (Preview)

By default, Graph Studio manages the data in Graph Lakehouse by automatically reloading Graphmart data into memory when Graph Lakehouse is restarted. You also have the option to enable persistence on the Graph Lakehouse instance. When persistence is enabled, Graph Lakehouse saves the data in memory to disk after every transaction. Each time Graph Lakehouse is restarted, the persisted data is automatically loaded back into memory.

Once the persisted data is loaded, Graph Studio does not automatically reload the active Graphmarts. Instead, it checks whether the last-updated timestamp in Graph Lakehouse matches the last-updated value in Graph Studio. If the timestamps match, Graph Studio does not initiate a reload. If they do not match, Graph Studio reloads the active Graphmarts so that the data in memory is updated to the latest version.
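
The reload decision described above can be pictured with a small sketch. This is an illustration only, not Graph Studio code: the function name needs_reload and the timestamp representation are hypothetical, and treating a missing timestamp as a mismatch is an assumption made for the example.

```python
from typing import Optional


def needs_reload(lakehouse_last_updated: Optional[str],
                 studio_last_updated: Optional[str]) -> bool:
    """Return True when the active Graphmarts should be reloaded.

    Hypothetical helper: mirrors the documented rule that a reload is
    skipped only when the two last-updated values match.
    """
    # Assumption for this sketch: a missing value counts as a mismatch.
    if lakehouse_last_updated is None or studio_last_updated is None:
        return True
    # Matching timestamps mean the persisted data is current, so no reload.
    return lakehouse_last_updated != studio_last_updated
```

For example, needs_reload("2024-05-01T10:00:00Z", "2024-05-01T10:00:00Z") returns False, while any difference between the two values returns True.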

The Graph Lakehouse persistence feature is available as a Preview release. This means that the implementation was completed recently, has not yet been thoroughly tested with Graph Studio, and could be unstable. The feature is available for trial use, but Altair recommends that you do not rely on Preview features in production environments.

This topic lists important information to consider before enabling persistence and provides instructions for enabling persistence in the Graph Lakehouse configuration file.

Important Considerations

Before enabling persistence, note the following:

  • In general, each Graph Lakehouse server needs access to about twice as much disk space as RAM on the server. By default, Graph Lakehouse saves data to the <install_path>/persistence directory on the local file system. You can also configure Graph Lakehouse to save data to a mounted file system. For more information, see Relocating AnzoGraph Directories.
  • Persisted data is unique to each Graph Lakehouse version and cannot be re-used after an upgrade. If you upgrade Graph Lakehouse and persistence is enabled, the database will not start until it is reinitialized to remove the persisted data. See Reinitialize the Database for instructions.
  • When persistence is enabled, transactional workloads that perform many concurrent write operations may experience degraded performance because of the overhead of writing the data from each transaction to disk.
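
As a rough way to apply the disk sizing guideline above, the following sketch compares free disk space against twice the server RAM. The function names are hypothetical, the factor of 2 comes from the "about twice as much" guideline, and comparing against free space (rather than total capacity) is a conservative assumption made for this example.

```python
import shutil


def required_disk_bytes(ram_bytes: int, factor: float = 2.0) -> int:
    """Disk space suggested by the guideline: about `factor` times the RAM."""
    return int(ram_bytes * factor)


def has_enough_disk(ram_bytes: int, free_bytes: int, factor: float = 2.0) -> bool:
    """True when the free space meets or exceeds the suggested amount."""
    return free_bytes >= required_disk_bytes(ram_bytes, factor)


def free_bytes_at(path: str) -> int:
    """Free bytes on the file system that contains `path`."""
    return shutil.disk_usage(path).free
```

A server with 256 GiB of RAM would need roughly 512 GiB under this guideline. To check a real server, pass the persistence directory (by default, <install_path>/persistence) to free_bytes_at and the server's RAM size to has_enough_disk.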

Enabling Persistence

Follow the steps below to enable the Graph Lakehouse save-to-disk option.

  1. Stop the database. See Stop the Database (Leave the System Management Daemon Running) for instructions.
  2. On the leader node, open the Graph Lakehouse settings file, settings.conf, in a text editor. The file is in the <install_path>/config directory.
  3. In settings.conf, find the following line:
    enable_persistence=false
  4. Change the enable_persistence value to true:
    enable_persistence=true
  5. Save and close settings.conf.
  6. Restart the database to apply the configuration change. See Start the Database (the Daemon is Running) for instructions.
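
Steps 3 and 4 above can also be scripted. The following is a minimal sketch, assuming that settings.conf contains a single uncommented enable_persistence=<value> line as shown in step 3. The function name is hypothetical, and the database still needs to be stopped before the change and restarted afterward, as the steps describe.

```python
import re

# Matches an uncommented "enable_persistence=<value>" line; whitespace
# around "=" is tolerated (assumption: one key=value setting per line).
_PERSISTENCE_LINE = re.compile(
    r"^([ \t]*enable_persistence[ \t]*=[ \t]*)\S+[ \t]*$",
    re.MULTILINE,
)


def enable_persistence(conf_text: str) -> str:
    """Return the settings text with enable_persistence set to true."""
    new_text, count = _PERSISTENCE_LINE.subn(r"\g<1>true", conf_text)
    if count == 0:
        # Fail loudly instead of silently leaving the file unchanged.
        raise ValueError("enable_persistence line not found")
    return new_text
```

To use the sketch, read the contents of <install_path>/config/settings.conf on the leader node, pass the text to enable_persistence, and write the result back to the same file. Keeping a backup copy of the original file first is a sensible precaution.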

After each transaction, Graph Lakehouse saves the data in memory to disk in the location specified in the persistence_directory setting. Each time Graph Lakehouse is restarted, the persisted data is automatically loaded back into memory.

To avoid unnecessary reloads, make sure that the Use AnzoGraph persistence if available option is enabled on the Graph Lakehouse connection in Graph Studio. See Connecting to Graph Lakehouse for more information.