Managing the Automatic Restart Feature

AnzoGraph can be configured so that the system manager automatically restarts the database and evaluates the queries that were running if AnzoGraph shuts down unexpectedly. This topic describes the process that occurs when AnzoGraph automatically restarts and provides information about the configuration settings that control the functionality as well as administrative information for managing the evaluated queries.

Automated Restart Procedure

The steps below describe what occurs during the automatic restart process after AnzoGraph has crashed:

  1. The system manager restarts the database in safe mode. In safe mode, AnzoGraph is locked to users and returns the following message if a user runs a query: "AnzoGraph is running in safe-mode. Cannot execute query." In addition, running azgctl -status to check the status of the database returns the message "AnzoGraph is running in safe-mode." If persistence is enabled, the data that was in memory at the time of the crash is reloaded into memory.
  2. While in safe mode, AnzoGraph runs any queries that were inflight at the time of the crash. By executing the queries that were running, AnzoGraph tries to determine if the crash was directly caused by one of the inflight queries.
  3. Depending on the outcome of running the inflight queries, AnzoGraph does the following:
    • If all inflight queries run to completion in safe mode, they are all added to the warned_list. In addition, each query is copied to a file named <query_ID>.txt in the <install_path>/internal/auto_restart/<timestamp>/warned_list directory.

      When all inflight queries complete successfully, that means it is unlikely that any one of the queries on its own is the culprit for the crash. However, all of the queries are added to the warned list because it is possible that the combination of queries run concurrently could have caused the crash.

    • If any of the inflight queries fail or crash the database in safe mode, those queries are added to the denied_list. In addition, each query is copied to a file named <query_ID>.txt in the <install_path>/internal/auto_restart/<timestamp>/denied_list directory.

      If an inflight query fails, none of the inflight queries are added to the warned list. Instead, the failed queries are added to the denied list.

    • If AnzoGraph runs a query in safe mode and cannot determine whether it belongs on the denied or warned list, the query is copied to a file named <query_ID>.txt in the <install_path>/internal/auto_restart/<timestamp>/unanalyzed_list directory.
    • Metadata about the warned_list, denied_list, and unanalyzed_list queries is captured in the stc_blocklist system table.

    The auto_restart_directory setting in the system configuration file, <install_path>/config/settings.conf, controls the location of the auto_restart directories listed above. For more information about the setting, see the Automated Restart System Settings section below.

  4. After the inflight queries have been run, AnzoGraph restarts the database, loads the persisted data back into memory, and returns the system to normal operation.
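The outcomes in step 3 can be illustrated with a minimal Python sketch. It is not AnzoGraph code: the function names classify_inflight and saved_query_path and the outcome labels "completed", "failed", and "unknown" are hypothetical, and the directory layout simply follows the paths listed above. The text does not say what happens to completed queries when another query fails or cannot be analyzed, so the sketch leaves them off every list.

```python
from pathlib import PurePosixPath


def classify_inflight(outcomes):
    """Return {query_id: list_name} for a set of safe-mode outcomes.

    outcomes maps a query ID to "completed", "failed", or "unknown"
    (hypothetical labels used only for this illustration).
    """
    all_completed = all(o == "completed" for o in outcomes.values())
    lists = {}
    for query_id, outcome in outcomes.items():
        if outcome == "failed":
            # Queries that fail or crash the database in safe mode are denied.
            lists[query_id] = "denied_list"
        elif outcome == "unknown":
            # Queries that could not be classified are kept for later review.
            lists[query_id] = "unanalyzed_list"
        elif all_completed:
            # Every query completed: all of them go on the warned list.
            lists[query_id] = "warned_list"
        # Otherwise the completed query is not placed on any list (assumption).
    return lists


def saved_query_path(base_dir, timestamp, list_name, query_id):
    """Build the path of the saved query text, following the layout above."""
    return (PurePosixPath(base_dir) / "auto_restart" / timestamp
            / list_name / f"{query_id}.txt")
```

For example, classify_inflight({3587: "failed", 3592: "completed"}) places only query 3587 on the denied list, and saved_query_path("/opt/anzograph/internal", "<timestamp>", "denied_list", 3587) gives the location of its saved text under an assumed /opt/anzograph install path.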

To help prevent a repeat of the circumstance that caused the database to crash, any queries that were added to the denied list are blocked from being executed when the system returns to normal operation. When a user runs a query, AnzoGraph compares that query with the denied list. If the query is on the list, the query is terminated and AnzoGraph returns an "Attempting to execute a denied-listed query" error message. Queries on the warned list are not blocked. By default, a denied list query cannot be run unless it is removed from the denied list. This behavior is controlled by the ignore_deniedlist_queries setting. For more information about the setting, see the Automated Restart System Settings section below. For information about removing queries from the denied list, see Removing a Query from the Block List below.
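The denied list check described above can be modeled with a short Python sketch. This is an illustration, not AnzoGraph's implementation: admit_query and DeniedQueryError are hypothetical names, and comparing queries by exact text is an assumption, since the text does not say how AnzoGraph matches an incoming query against the list.

```python
class DeniedQueryError(Exception):
    """Raised when an incoming query matches a denied list entry."""


def admit_query(query_text, denied_queries, ignore_deniedlist_queries=False):
    """Return the query if it may run; raise DeniedQueryError if it is blocked.

    denied_queries is a set of query strings. Exact-text matching is an
    assumption made for this sketch.
    """
    if ignore_deniedlist_queries:
        # With the setting enabled, queries are not compared with the list.
        return query_text
    if query_text in denied_queries:
        raise DeniedQueryError("Attempting to execute a denied-listed query")
    # Queries on the warned list are not checked, so they always pass.
    return query_text
```

With the default ignore_deniedlist_queries=False, a query found in denied_queries raises the error. Passing True lets the same query through.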

Automated Restart System Settings

The automatic restart feature is controlled by the following four settings in <install_path>/config/settings.conf:

  • auto_restart_max_attempts: This setting specifies the number of times the system manager attempts to restart the database after a crash. The default value is 5, which means the system manager will attempt to restart the database a maximum of 5 times. Setting auto_restart_max_attempts to 0 disables the automatic restart feature.
  • auto_restart_time: This setting specifies the number of seconds to spend attempting to restart the database. If all attempts fail and this time limit is reached, the system manager stops trying to restart the database. The default value is 600, which means that the system manager will attempt to restart the database for a maximum of 600 seconds (10 minutes).
  • auto_restart_directory: This setting specifies the base location of the auto_restart directory, which contains the denied_list, warned_list, and unanalyzed_list directories. The default value is <install_path>/internal.
  • ignore_deniedlist_queries: This setting controls whether denied list queries are blocked from running or are allowed to be run when the database is returned to normal operation. The default value is false, which means denied list queries are not ignored and are therefore blocked from running. If ignore_deniedlist_queries is true, incoming queries are not compared with the denied list and are run.

Changing the auto_restart_max_attempts, auto_restart_time, or auto_restart_directory values requires a restart of the system management daemon, azgmgrd, as well as the database. See Starting and Stopping AnzoGraph for instructions.
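The interaction between auto_restart_max_attempts and auto_restart_time can be pictured with the following Python sketch. It is a simplified model under stated assumptions, not the system manager's actual logic: restart_with_limits and start_database are hypothetical names, and the sketch assumes that restart attempts stop as soon as either limit is reached.

```python
import time


def restart_with_limits(start_database, max_attempts=5, time_limit=600,
                        clock=time.monotonic):
    """Try to restart until success or until a limit is reached.

    start_database is a hypothetical callable that returns True on success.
    Returns (started, attempts_made).
    """
    if max_attempts == 0:
        # A value of 0 disables automatic restart entirely.
        return (False, 0)
    deadline = clock() + time_limit
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        if start_database():
            return (True, attempts)
        if clock() >= deadline:
            # Time budget used up (assumed to end the retries).
            break
    return (False, attempts)
```

With the defaults, a database that never starts is attempted 5 times, provided those attempts finish within 600 seconds.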

Removing a Query from the Block List

AnzoGraph stores metadata about the denied and warned list queries in the stc_blocklist system table. To remove a query from either list, you remove the entry from the stc_blocklist table by running the REMOVE_FROM_BLOCKLIST command.

REMOVE_FROM_BLOCKLIST '<list_name>' <query_ID>

Where <list_name> is the name of the list that the query is on and <query_ID> is the ID number for the query. To retrieve the list name and query ID values, run the following query to return the stc_blocklist contents:

SELECT * WHERE { TABLE 'stc_blocklist'} ORDER BY ?blocklist

For example:

/opt/anzograph/bin/azgi -c "select * where {table 'stc_blocklist'} order by ?blocklist" 
query | blocklist   | updated             | query_text                | part
------+-------------+---------------------+---------------------------+------
3587  | denied_list | 2020-08-25 14:29:27 | select * from <http://an..|    0
3592  | denied_list | 2020-08-25 14:29:32 | select * where {?s ?p ?o} |    0
3612  | warned_list | 2020-08-25 14:32:15 | select * from <http://an..|    0

In the results, the <list_name> is the value in the blocklist column, and <query_ID> is the value in the query column. Running the following command removes the first entry from the stc_blocklist table, which removes that query from the denied list.

REMOVE_FROM_BLOCKLIST 'denied_list' 3587
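
When several entries need to be removed, the commands can be generated from the query output. The Python sketch below is one possible approach, assuming the output has the pipe-delimited layout shown in the example above. The names parse_blocklist and removal_commands are hypothetical helpers, and the sketch only builds the command strings and does not run them.

```python
SAMPLE_OUTPUT = """\
query | blocklist   | updated             | query_text                | part
------+-------------+---------------------+---------------------------+------
3587  | denied_list | 2020-08-25 14:29:27 | select * from <http://an..|    0
3592  | denied_list | 2020-08-25 14:29:32 | select * where {?s ?p ?o} |    0
3612  | warned_list | 2020-08-25 14:32:15 | select * from <http://an..|    0
"""


def parse_blocklist(output):
    """Extract (list_name, query_id) pairs from the tabular output."""
    entries = []
    for line in output.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Data rows start with a numeric query ID; the header and the
        # separator line do not, so they are skipped.
        if len(cells) >= 2 and cells[0].isdigit():
            entries.append((cells[1], int(cells[0])))
    return entries


def removal_commands(entries, list_name="denied_list"):
    """Build one REMOVE_FROM_BLOCKLIST command per entry on list_name."""
    return [
        f"REMOVE_FROM_BLOCKLIST '{name}' {query_id}"
        for name, query_id in entries
        if name == list_name
    ]


if __name__ == "__main__":
    for command in removal_commands(parse_blocklist(SAMPLE_OUTPUT)):
        print(command)
```

Review the generated commands before running them, since removing a query from the denied list allows it to be executed again.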