Advanced Graph Lakehouse Connection Settings

This topic describes the connection settings that are available on the Advanced tab when you add a static Graph Lakehouse connection or on the Configuration tab when you edit an existing connection.

Setting Description
Instance URI Defines the URI for this Graph Lakehouse instance. When this setting is empty, Graph Studio automatically assigns an instance URI. If you specify a custom URI, make sure that the URI is valid and unique.
Trust All TLS Certificates Indicates whether Graph Studio should trust the Graph Lakehouse certificates for this connection. Altair recommends that you accept the default value of enabled.
AnzoGraph Concurrent Queries Specifies the maximum number of queries that Graph Studio can send to Graph Lakehouse concurrently. The default value is 10 queries. Altair recommends that you accept the default value. If you want to increase the number of concurrent queries, Altair recommends that you choose a value between 10 and 20.
AnzoGraph Connection Timeout Controls how often (in seconds) Graph Studio checks the status of the connection to this Graph Lakehouse instance. The connection is tested every N seconds, where N is the value of this setting. The default value is 60. If the test fails, Graph Studio re-tests the connection every 15 seconds for 2 minutes to rule out a brief network glitch. If the connection continues to fail after 2 minutes, the status is changed to "Offline." If the connection is re-established within the 2-minute window, Graph Studio determines whether the connection came back automatically or whether Graph Lakehouse was restarted.
Use AnzoGraph Persistence if Available Controls how Graph Studio manages graphmart data if persistence is enabled for this data source and Graph Lakehouse is restarted.

The Use AnzoGraph Persistence if Available setting is enabled by default, but persistence is disabled for Graph Lakehouse by default. For information about how Graph Studio manages the data when persistence is enabled and for instructions on enabling persistence, see Enabling Persistence (Preview).
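The re-test window described for the AnzoGraph Connection Timeout setting can be sketched as a small simulation. This is a minimal illustration rather than Graph Studio's actual implementation: the function name, the probe callable, and the "Online" status label are hypothetical, and the sketch assumes that the final re-test happens exactly at the 2-minute mark.

```python
from typing import Callable, Tuple

RETRY_INTERVAL_S = 15   # re-test interval after a failed check (from the text)
RETRY_WINDOW_S = 120    # 2-minute window before the status becomes "Offline"


def resolve_failed_check(probe: Callable[[int], bool]) -> Tuple[str, int]:
    """Simulate the re-test window that follows one failed connection check.

    `probe(elapsed)` is a hypothetical callable that reports whether the
    connection works `elapsed` seconds after the first failure. No real
    waiting happens here; elapsed time is only counted.
    """
    elapsed = 0
    while elapsed < RETRY_WINDOW_S:
        elapsed += RETRY_INTERVAL_S
        if probe(elapsed):
            # Connection came back inside the window ("Online" is an
            # illustrative label, not a documented status string).
            return ("Online", elapsed)
    # Every re-test in the 2-minute window failed.
    return ("Offline", RETRY_WINDOW_S)


# A glitch that clears after 45 seconds is absorbed by the window.
print(resolve_failed_check(lambda elapsed: elapsed >= 45))
# A connection that never recovers is marked offline after 2 minutes.
print(resolve_failed_check(lambda elapsed: False))
```

The sketch covers only the failure path. In normal operation the check would repeat every N seconds, where N is the value of the setting (60 by default).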

Force Reload of Graphmart Data During AnzoGraph Activation or Reconnection This option is enabled by default and means that Graph Studio forces a reload of active graphmarts when one of the following actions occurs:
  1. Graph Studio restarts and reconnects to Graph Lakehouse.
  2. Graph Studio restarts and a user manually re-enables this data source by selecting Enable and reload AnzoGraph Datasource from the menu on the Graph Lakehouse administration screen.

When this option is disabled and Graph Lakehouse persistence is also disabled, graphmarts must be reloaded by clicking the Reset and Reload all Graphmarts button on the Graph Lakehouse screen after the connection is re-established following a Graph Lakehouse restart.

If Graph Lakehouse persistence is enabled and Force reload of Graphmart data... is disabled, Graph Studio may force a reload if the last updated timestamp in Graph Lakehouse does not match the last updated value in Graph Studio.
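One way to read the interaction between the force-reload option and persistence is as a small decision function. The sketch below is an interpretation of the paragraphs above, not the product's logic: the function name, parameter names, and returned labels are hypothetical, and the final "no reload" branch is an assumption, because the text does not say what happens when the timestamps match.

```python
def reload_action(force_reload: bool,
                  persistence_enabled: bool,
                  timestamps_match: bool) -> str:
    """Return an illustrative label for what happens to graphmart data
    after the connection to Graph Lakehouse is re-established."""
    if force_reload:
        # Option enabled (the default): active graphmarts are reloaded.
        return "automatic reload"
    if not persistence_enabled:
        # Both options disabled: a user must click
        # "Reset and Reload all Graphmarts".
        return "manual reload required"
    if not timestamps_match:
        # Persistence on, force reload off, last-updated values differ:
        # the text says a reload "may" be forced.
        return "reload may be forced"
    # Assumption: with matching timestamps the persisted data is kept.
    return "no reload"


print(reload_action(force_reload=False,
                    persistence_enabled=True,
                    timestamps_match=False))
```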

Keep AnzoGraph Datasource Enabled on Anzo Startup This option is enabled by default and means that Graph Studio leaves the Graph Lakehouse data source online in a "Ready to use" state when Graph Studio is restarted, provided that the data source was online at the time of the restart. When this option is disabled, Graph Studio disables this data source when Graph Studio is restarted. When Graph Studio comes back online, the data source must be manually enabled by selecting Enable and reload AnzoGraph Datasource from the menu on the Graph Lakehouse administration screen.

Port The port to use for communication between Graph Lakehouse and Graph Studio. The default value is 5700, the Graph Studio protocol (gRPC) port for secure communication. Do not change the value unless instructed by Cambridge Semantics Support.
AnzoGraph Management Port The SSL system management port for Graph Lakehouse. It is the port that Graph Studio uses to connect to the system manager and, in a cluster, the AnzoGraph system managers use to communicate to each other across the cluster. The default value is 5600. Do not change the value unless instructed by Cambridge Semantics Support.
Callback HostName The Graph Studio instance to call when Graph Lakehouse makes service callbacks. If you have multiple Graph Studio servers and one or more of them are not routable by the Graph Lakehouse server, the Callback HostName is the Graph Studio host that Graph Lakehouse can target when making service calls.
Readonly Replica This option is for use if you have multiple Graph Studio servers and only one of those servers loads graphmarts to Graph Lakehouse. When Readonly Replica is selected, Graph Studio treats this Graph Lakehouse instance as a read-only source so that Graph Studio can view the data in Graph Lakehouse but cannot change it.
Vacuum Controls whether Graph Studio initiates a Graph Lakehouse vacuum process after each load, reload, or refresh operation. The vacuum process improves data organization in memory, deduplicates data, and reclaims memory after data is deleted. Completing a vacuum after update operations is extremely important for maintaining overall query performance and memory allocation accuracy. Do not disable vacuum unless you are instructed to do so by Cambridge Semantics Support.
Gather Statistics on Load Controls whether Graph Studio initiates Graph Lakehouse's internal statistics gathering queries after loading data. Gathering statistics helps the query planner generate ideal query execution plans when queries are run. When this option is enabled, the Graph Lakehouse statistics queries are run immediately after a graphmart is loaded. Enabling the option increases graphmart load time but reduces execution time for the first analytic queries, such as when a Hi-Res Analytics Dashboard is created. When this option is disabled (the checkbox is clear), Graph Lakehouse automatically performs statistics gathering when the first queries are run, increasing the execution time for the initial queries.

Altair recommends that you leave Gather Statistics on Load enabled so that Graph Lakehouse gathers statistics at the end of a load rather than during query execution. Since loads take longer than queries, adding more time to the load is less noticeable than waiting for statistics to be generated during initial query execution.

Use Priority Queue Query Manager Controls whether Graph Studio provides a view of the queries that are in the queue waiting to be run. The queued queries are displayed in the System Query Audit log.

Enabling or disabling this option after saving the initial configuration requires a restart of Graph Studio.

Enable Detailed Query Timing When the Priority Queue Query Manager is enabled, this option controls whether Graph Studio obtains detailed timing statistics for every Graph Lakehouse query. If this option is enabled, Graph Studio sends additional statistics gathering queries to Graph Lakehouse for each user query. The extra query timing details, such as query compilation time, compilation statistics, and a query summary, are displayed in the System Query Audit log. For more information about this setting, see AnzoGraph Detailed Query Timing.

Enabling detailed query timing increases the Graph Lakehouse workload and may decrease overall query performance.

Max Allowed Duration for System Operations Sets a limit on the amount of time that Graph Studio waits for Graph Lakehouse to complete queries related to system operations, such as queries for CPU and memory usage statistics. The default value is 2 minutes. If Graph Studio is waiting on system information from Graph Lakehouse and Graph Lakehouse does not respond within the specified time, Graph Studio cancels the request.
Max Allowed Duration for Queries Sets a limit on the amount of time that Graph Studio waits for Graph Lakehouse to complete a user query (such as dashboard, data layer, or Query Builder queries). By default, Graph Studio waits indefinitely. To set a maximum duration, specify the amount of time in any combination of days, hours, and minutes. For example, specifying 1d sets the maximum duration to one day, specifying 10h sets the maximum duration to 10 hours, and specifying 1d12h30m sets the maximum duration to 1 day, 12 hours, and 30 minutes. If Max Allowed Duration for Queries is set and a query does not complete in the specified time, Graph Studio cancels the request regardless of whether Graph Lakehouse has returned partial results.
Use Minimal Number of SPARQL Rewriters Before Graph Studio sends SPARQL queries to Graph Lakehouse, it applies a set of rewrites intended to optimize query execution. This setting controls whether Graph Studio performs the full set of rewrites or only the minimal required modifications. When this setting is disabled (the default value), Graph Studio performs the full set of rewrites. When this setting is enabled, Graph Studio performs only a minimal set of rewrites. Do not enable this setting unless you are instructed to do so by Cambridge Semantics Support.
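To illustrate the duration format used by the Max Allowed Duration for Queries setting (values such as 1d, 10h, or 1d12h30m), the following sketch converts such a string to a time span. It is a hypothetical helper, not part of Graph Studio, and it assumes that the units appear in the order days, hours, minutes, each at most once. The product may accept other forms.

```python
import re
from datetime import timedelta

# Optional days, hours, and minutes, in that order (an assumption).
_DURATION_RE = re.compile(r"(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?")


def parse_duration(text: str) -> timedelta:
    """Parse a value such as '1d', '10h', or '1d12h30m' into a timedelta."""
    match = _DURATION_RE.fullmatch(text.strip())
    # Reject strings that do not fit the pattern or contain no unit at all.
    if match is None or not any(match.groups()):
        raise ValueError(f"not a valid duration: {text!r}")
    days, hours, minutes = (int(g) if g else 0 for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes)


for value in ("1d", "10h", "1d12h30m"):
    print(value, "->", parse_duration(value))
```

An empty string raises ValueError in this sketch, although in the product an empty setting corresponds to the default of waiting indefinitely.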