Connecting to AnzoGraph
This topic provides instructions for configuring the connection to AnzoGraph. For information about managing AnzoGraph servers, see AnzoGraph Server Administration.
Do not connect multiple Anzo instances to the same AnzoGraph instance. Since AnzoGraph is stateless and Anzo manages all of the data, connecting more than one Anzo instance to the same AnzoGraph instance causes severe data management conflicts that result in unexpected behavior. This type of configuration is not supported.
- In the Administration application, expand the Connections menu and click AnzoGraph. Anzo opens the AnzoGraph connection overview screen, which lists any existing connections.
- On the AnzoGraph screen, click Add AnzoGraph to add a connection. Anzo displays the Create AnzoGraph dialog box.
- On the Basic tab, type a name for the engine in the Title field.
- In the optional Description field, type a description for the graph query engine. If you leave this field blank, Anzo creates a description when you save the configuration.
- In the Host field, type the AnzoGraph server host name or IP address. If you have a cluster, type the name or IP address of the leader server.
- In the AnzoGraph User field, type the username that was created when AnzoGraph was installed.
- Type the password for the AnzoGraph user in the AnzoGraph Password and Confirm Password fields.
- If this AnzoGraph instance will host data from unstructured pipelines, click the Elasticsearch Configuration drop-down list and select the Elasticsearch instance to associate with this AnzoGraph connection. For information about configuring an Elasticsearch connection, see Connecting to Elasticsearch.
- Click Test Connection to check if Anzo can connect to AnzoGraph. If the connection fails, make sure that AnzoGraph is running and that you typed the correct username and password.
- Optional: Click the Advanced tab and configure any of the advanced settings.
The list below describes each of the advanced settings:
- Instance URI: The URI for this AnzoGraph instance. Anzo automatically assigns an instance URI. If you specify a custom URI, make sure that the URI is valid and unique.
- Trust All TLS Certificates: Indicates whether Anzo should trust the AnzoGraph certificates for this connection. Cambridge Semantics recommends that you accept the default value of enabled.
- AnzoGraph Concurrent Queries: The maximum number of queries that Anzo can send to AnzoGraph concurrently. The default value is 10 queries. Cambridge Semantics recommends that you accept the default value. If you want to increase the number of concurrent queries, Cambridge Semantics recommends that you choose a value between 10 and 20.
- AnzoGraph Connection Timeout (seconds): This setting controls how often, in seconds, Anzo tests the connection to this AnzoGraph instance. The default value is 60. If a test fails, Anzo re-tests the connection every 15 seconds for 2 minutes to rule out a brief network glitch. If the connection is still failing after 2 minutes, Anzo changes the status to "Offline." If the connection is re-established within the 2-minute window, Anzo determines whether the connection came back automatically or whether AnzoGraph was restarted.
- Use AnzoGraph Persistence if Available: This setting controls how Anzo manages graphmart data if persistence is enabled for this data source and AnzoGraph is restarted.
The Use AnzoGraph Persistence if Available setting is enabled by default but persistence is disabled for AnzoGraph by default. For information about how Anzo manages the data when persistence is enabled and for instructions on enabling persistence, see Using AnzoGraph Persistence (Preview).
- Force Reload of Graphmart Data During Anzo Startup or when Datasource Enabled: This option is enabled by default. When it is enabled, Anzo forces a reload of active graphmarts when one of the following actions occurs: (1) Anzo restarts and reconnects to AnzoGraph, or (2) Anzo restarts and a user manually re-enables this data source by selecting Enable and reload AnzoGraph Datasource from the menu on the AnzoGraph administration screen. When this option is disabled and AnzoGraph persistence is also disabled, you must reload graphmarts by clicking the Reset and Reload all Graphmarts button on the AnzoGraph screen after the connection is re-established following an AnzoGraph restart.
If AnzoGraph persistence is enabled and Force reload of Graphmart data... is disabled, Anzo may force a reload if the last updated timestamp in AnzoGraph does not match the last updated value in Anzo.
- Keep AnzoGraph Datasource Enabled on Anzo Startup: This option is enabled by default. When it is enabled, Anzo leaves the AnzoGraph data source online in a "Ready to use" state when Anzo is restarted, provided that the data source was online at the time of the restart. When this option is disabled, Anzo disables this data source when Anzo is restarted. When Anzo comes back online, you must manually enable the data source by selecting Enable and reload AnzoGraph Datasource from the menu on the AnzoGraph administration screen.
- Port: The port to use for communication between AnzoGraph and Anzo. The default value is 5700, the Anzo protocol (gRPC) port for secure communication. Do not change the value unless instructed by Cambridge Semantics Support.
- AnzoGraph Management Port: The SSL system management port for AnzoGraph. The default value is 5600. Do not change the value unless instructed by Cambridge Semantics Support.
- Callback Hostname: The Anzo server that AnzoGraph targets when it makes service callbacks. If you have multiple Anzo servers and one or more of them are not routable from the AnzoGraph server, specify an Anzo host that AnzoGraph can reach when making service calls.
- Readonly Replica: This option is for use if you have multiple Anzo servers and only one of those servers loads graphmarts to AnzoGraph. When this option is selected, Anzo treats this AnzoGraph instance as a read-only source, so this Anzo server can view the data in AnzoGraph but cannot change it.
- Vacuum: This option controls whether Anzo initiates an AnzoGraph vacuum process after each data load. The vacuum process improves data organization in memory, deduplicates data, and reclaims memory after data is deleted. Completing a vacuum after update operations is extremely important for maintaining overall query performance and memory allocation accuracy.
Do not disable vacuum unless you are instructed to do so by Cambridge Semantics Support.
- Gather Statistics on Load: This option controls whether Anzo initiates AnzoGraph's internal statistics gathering queries after loading data. Gathering statistics helps the query planner generate efficient query execution plans. When this option is enabled, the AnzoGraph statistics queries run immediately after a graphmart is loaded. This increases graphmart load time but reduces execution time for the first analytic queries, such as when a Hi-Res Analytic dashboard is created. When this option is disabled (the checkbox is clear), AnzoGraph gathers statistics automatically when the first queries are run, which increases the execution time for those initial queries.
Cambridge Semantics recommends that you leave Gather Statistics on Load enabled so that AnzoGraph gathers statistics at the end of a load rather than during query execution. Since loads take longer than queries, adding more time to the load is less noticeable than waiting for statistics to be generated during initial query execution.
- Use Priority Queue Query Manager: This option controls whether Anzo provides a view of the queries that are in the queue waiting to be run. The queued queries are displayed in the System Query Audit log.
Enabling or disabling this option after saving the initial configuration requires a restart of Anzo.
- Enable Detailed Query Timing: When the Priority Queue Query Manager is enabled, this option controls whether Anzo obtains detailed timing statistics for every AnzoGraph query. If this option is enabled, Anzo sends additional statistics gathering queries to AnzoGraph for each user query. The extra query timing details, such as query compilation time, compilation statistics, and a query summary, are displayed in the System Query Audit log. For more information about this setting, see AnzoGraph Detailed Query Timing Reference.
Enabling detailed query timing increases the AnzoGraph workload and may decrease overall query performance.
- Max Allowed Duration for System Operations (Minutes): This option limits the number of minutes that Anzo waits for AnzoGraph to complete queries related to system operations, such as queries for CPU and memory usage statistics. The default value is 2 minutes. If Anzo is waiting on system information from AnzoGraph and AnzoGraph does not respond within the specified time, Anzo cancels the request.
- Max Allowed Duration for Queries: This option limits the amount of time that Anzo waits for AnzoGraph to complete a user query (such as dashboard, data layer, or Query Builder queries). By default, Anzo waits indefinitely. To set a maximum duration, specify the amount of time in any combination of days, hours, and minutes. For example, specifying 1d sets the maximum duration to one day, specifying 10h sets it to 10 hours, and specifying 1d12h30m sets it to 1 day, 12 hours, and 30 minutes. If Max Allowed Duration for Queries is set and a query does not complete in the specified time, Anzo cancels the request regardless of whether AnzoGraph has returned partial results.
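To illustrate the duration format used by Max Allowed Duration for Queries, the following Python sketch converts values such as 1d, 10h, and 1d12h30m into a time span. It is not Anzo's own parser: the function name parse_duration is hypothetical, and the sketch assumes that the units appear in day, hour, minute order with no spaces, which matches the examples given for the setting but is not confirmed as a rule.

```python
import re
from datetime import timedelta

# Hypothetical helper for illustration only; not part of Anzo or AnzoGraph.
# Assumption: units appear in d, h, m order with no spaces (as in "1d12h30m").
_DURATION = re.compile(r"(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?")


def parse_duration(text: str) -> timedelta:
    """Convert a value such as '1d', '10h', or '1d12h30m' to a timedelta."""
    value = text.strip()
    match = _DURATION.fullmatch(value)
    # The pattern also matches the empty string, so reject that explicitly.
    if not value or match is None:
        raise ValueError(f"not a d/h/m duration: {text!r}")
    days, hours, minutes = (int(group) if group else 0 for group in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes)


if __name__ == "__main__":
    for sample in ("1d", "10h", "1d12h30m"):
        print(sample, "->", parse_duration(sample))
```

A check like this can be useful for validating a value before typing it into the field, for example in a script that prepares configuration notes for several environments.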
- Click Save to save the configuration. Anzo connects to AnzoGraph and opens the Graphmarts tab.
To change configuration details, click the Configuration tab and adjust values as needed. The right side of the screen shows connection status as well as memory usage details, overall data statistics, and graphmart details. For information about loading data to AnzoGraph, see Creating a New Graphmart.
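The connection status shown on this screen follows the re-test behavior described for the AnzoGraph Connection Timeout setting: after a failed test, Anzo re-tests every 15 seconds for 2 minutes before marking the connection "Offline." The Python sketch below models only that timing with simulated time. It does not reflect Anzo's implementation: the function name, the probe callback, and the "Recovered" label are hypothetical, and whether a re-test happens at exactly the 2-minute mark is an assumption.

```python
from typing import Callable

RETRY_INTERVAL_S = 15   # from the setting description: re-test every 15 seconds
RETRY_WINDOW_S = 120    # from the setting description: for 2 minutes


def status_after_failed_test(probe: Callable[[int], bool]) -> str:
    """Model the re-test window that follows one failed connection test.

    probe(elapsed) is a hypothetical callback that returns True if a
    connection test would succeed `elapsed` seconds after the first failure.
    Time is simulated, so nothing here sleeps or opens a network connection.
    """
    # Assumption: re-tests occur at 15, 30, ..., 120 seconds after the failure.
    for elapsed in range(RETRY_INTERVAL_S, RETRY_WINDOW_S + 1, RETRY_INTERVAL_S):
        if probe(elapsed):
            return "Recovered"  # hypothetical label for a re-established connection
    return "Offline"  # status named in the setting description


if __name__ == "__main__":
    # A 40-second outage is picked up by the re-test at 45 seconds.
    print(status_after_failed_test(lambda elapsed: elapsed >= 45))
    # An outage that outlasts the window ends in the Offline status.
    print(status_after_failed_test(lambda elapsed: False))
```

In this model, a brief outage is absorbed by the re-tests, while an outage that lasts longer than the 2-minute window results in the "Offline" status.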