Limiting the Number of Anzo Unstructured Status Journals

To control the disk space used by Anzo Unstructured pipelines, you can configure the Anzo Unstructured Distributed service to cap the number of status journals that are preserved on disk. When the specified limit is reached and a pipeline generates a new journal, the oldest journal is deleted.

Journals are removed based on their timestamps alone. The pipeline they are associated with is not a factor in determining the journals to delete.
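
The deletion rule described above can be illustrated with a short Python sketch. This is not the Anzo implementation. The Journal type, its fields, and the prune_journals function are hypothetical names introduced only to show how a single global limit, applied by timestamp alone, behaves across pipelines. The sketch also assumes that any negative limit means unlimited, whereas the documented default is -1.

```python
from typing import List, NamedTuple, Tuple


class Journal(NamedTuple):
    """Hypothetical stand-in for a status journal on disk."""
    timestamp: float  # creation time, e.g. seconds since the epoch
    pipeline: str     # owning pipeline (ignored by the pruning rule)
    name: str         # illustrative journal identifier


def prune_journals(
    journals: List[Journal], limit: int
) -> Tuple[List[Journal], List[Journal]]:
    """Return (kept, deleted) after applying one global journal limit."""
    # Assumption: a negative limit (such as the default -1) means unlimited.
    if limit < 0:
        return list(journals), []
    # Order by timestamp only; the owning pipeline plays no role.
    ordered = sorted(journals, key=lambda j: j.timestamp)
    excess = max(0, len(ordered) - limit)
    return ordered[excess:], ordered[:excess]


journals = [
    Journal(100.0, "pipelineA", "a1"),
    Journal(200.0, "pipelineB", "b1"),
    Journal(300.0, "pipelineA", "a2"),
    Journal(400.0, "pipelineA", "a3"),
]

# With a limit of 2, the two oldest journals are deleted, so pipelineB
# ends up with no journals on disk even though pipelineA keeps two.
kept, deleted = prune_journals(journals, 2)
print([j.name for j in deleted])
```

Because the limit is global, a busy pipeline can cause the journals of a less active pipeline to be removed. Keep this in mind when choosing a value.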

Follow the instructions below to configure the Anzo Unstructured Distributed service to limit the number of status journals on disk.

  1. In the Administration application, expand the Servers menu and click Advanced Configuration. Click I understand and accept the risk.
  2. Search for the Anzo Unstructured Distributed bundle and view its details.
  3. Click the Services tab and expand Anzo Unstructured Distributed.
  4. Edit the com.cambridgesemantics.anzo.unstructured.distributed.defaultNumStatusJournalGlobalLimit property to specify the maximum number of status journals to keep on disk. The default value is -1, which means that the number of journals is unlimited.
  5. After changing the value, click the checkmark icon for that property to save the change.
  6. Restart Anzo to apply the configuration change.