Anzo Requirements

This page provides important guidelines to follow when choosing the hardware and software for Anzo host servers.

Hardware Requirements

The following guidelines apply to individual Anzo servers within production and development environments. Your Cambridge Semantics Customer Success manager can help you identify an overall Anzo and AnzoGraph deployment configuration that is appropriate for your solution and use cases.

Production Environments

Component Minimum Recommended Description
RAM 64 GB 128+ GB The Anzo system data source is a disk-based graph store (called a Journal or Volume). When the system source is queried, Anzo swaps the data from disk to memory on demand. Choosing a host server with more RAM increases the performance of system queries because the OS can store the journal data in its file cache, avoiding the need for Anzo to swap data from disk to memory. In addition, RAM is required to hold intermediate results for join queries.
Disk Space: Anzo Install Path 100 GB 500+ GB The Anzo server installation disk needs to have enough space to store the Anzo system data source, Anzo log files, any plugins, and the Anzo client. In addition, if the local Sparkler compiler and Spark ETL engine are used on the Anzo server, consider that the disk size also needs to be sufficient for hosting all of the job-related .jar files.
Disk Space: Shared File System 500 GB 1+ TB The shared file system stores all of the RDF data and ETL files that are shared between Anzo and all AnzoGraph, Anzo Unstructured, Spark, and Elasticsearch servers. For more information, see File Storage Requirements below.
vCPU 16 32 Once you provision sufficient RAM, performance depends on CPU capabilities. Keep in mind that you are provisioning for both a production database and a busy application server. More cores and a higher clock speed can make a dramatic difference in performance when there are many concurrent Anzo users.
Architecture 64-bit 64-bit Anzo is supported only on 64-bit architecture.

Development Environments

Component Minimum Recommended Description
RAM 32 GB 64+ GB These RAM guidelines assume that the development environment is intended to host smaller data volumes than the production environment and support one or two Anzo users at a time. For development environments with large data volumes and multiple concurrent users, increase the RAM amount.
Disk Space: Anzo Install Path 100 GB 500+ GB The Anzo server installation disk needs to have enough space to store the Anzo system data source, Anzo log files, any plugins, and the Anzo client. In addition, if the local Sparkler compiler and Spark ETL engine are used on the Anzo server, consider that the disk size also needs to be sufficient for hosting all of the job-related .jar files.
Disk Space: Shared File System 500 GB 1+ TB Typically the development environment mounts the same shared file system as the production environment.
vCPU 8 16 Like the RAM guidelines, these vCPU guidelines assume that the development environment is intended to host smaller data volumes than the production environment and support one or two Anzo users at a time. For development environments with large data volumes and multiple concurrent users, increase the number of vCPUs.
Architecture 64-bit 64-bit Anzo is supported only on 64-bit architecture.
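
The minimums in the two tables above can be turned into a quick pre-installation check. The following is a minimal sketch, not an official Anzo tool: the profile names, the function names, and the 64-bit heuristic are assumptions made for illustration, and the measurement helper relies on Linux-specific `os.sysconf` values.

```python
import os
import platform
import shutil

GIB = 1024 ** 3

# Minimum values copied from the tables above; the profile names are illustrative.
MINIMUMS = {
    "production": {"ram_gb": 64, "install_disk_gb": 100, "shared_disk_gb": 500, "vcpu": 16},
    "development": {"ram_gb": 32, "install_disk_gb": 100, "shared_disk_gb": 500, "vcpu": 8},
}


def find_shortfalls(measured, profile):
    """Return one message per measured value that is below the profile minimum."""
    shortfalls = []
    for key, minimum in MINIMUMS[profile].items():
        value = measured.get(key)
        if value is None or value < minimum:
            shortfalls.append(f"{key}: measured {value}, minimum {minimum}")
    if not measured.get("is_64bit", False):
        shortfalls.append("architecture: 64-bit required")
    return shortfalls


def measure_host(install_path, shared_path):
    """Collect host values (Linux only: physical RAM is read through os.sysconf)."""
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return {
        "ram_gb": ram_bytes / GIB,
        "install_disk_gb": shutil.disk_usage(install_path).total / GIB,
        "shared_disk_gb": shutil.disk_usage(shared_path).total / GIB,
        "vcpu": os.cpu_count() or 0,
        # Heuristic: machine names such as x86_64 and aarch64 contain "64".
        "is_64bit": "64" in platform.machine(),
    }


# Example call (both paths must exist on the host; the share path is hypothetical):
# print(find_shortfalls(measure_host("/opt/Anzo", "/opt/anzoshare"), "production"))
```

The operating system usually reports slightly less physical memory than the nominal size of the host, so a server sold as 64 GB may be flagged by a strict comparison. Treat the output as a prompt for review rather than a pass or fail verdict.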

Software Requirements

This section lists the software requirements for Anzo servers and client workstations. It also includes important service account information and lists the supported single sign-on providers.

Cambridge Semantics strongly recommends that you do not run any other software, including anti-virus software, on the same server as Anzo in a production environment. In a development environment, additional software can be run with the expectation of lower Anzo performance.

Component Minimum Recommended Guidelines
Operating System (Anzo Server) RHEL/CentOS 6 RHEL/CentOS 7.9 Cambridge Semantics recommends that you tune the ulimits for your Linux distribution to increase the limits for certain resources. See Configure User Resource Limits for more information.
Microsoft Excel (Client Workstation) Excel 2003 Excel 2007+ The Anzo for Office data integration mapping tool plugin requires Microsoft Excel.
Web Browser
(Client Workstation)
Firefox 62+
Chrome 74+
Safari 12+
Chromium-Based
Chrome 90+ Use the latest versions of web browsers, especially if you are using a Chromium-based browser, as some older versions will not work with the Anzo user interface components.
Enterprise-Level Anzo Service User Account N/A N/A It is important to work with your IT organization to create an Anzo service user account at the enterprise level. The service user account needs to be associated with a central directory server (LDAP) so that it is available across Anzo environments and is managed in accordance with the permissions policies of your company. For more information, see Anzo Service Account Requirements below.
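
Before tuning the ulimits mentioned in the Operating System row, it can help to see what the service account currently has. The sketch below only reads the limits and does not change them. Which resources matter for Anzo is not stated on this page, so the choice of open files and processes here is an assumption; see Configure User Resource Limits for the values to apply.

```python
import resource


def current_limits():
    """Return the (soft, hard) limits for open files and processes."""
    return {
        "open_files": resource.getrlimit(resource.RLIMIT_NOFILE),
        "processes": resource.getrlimit(resource.RLIMIT_NPROC),
    }


def describe(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


# Print the limits that apply to the user running this script.
for name, (soft, hard) in current_limits().items():
    print(f"{name}: soft={describe(soft)} hard={describe(hard)}")
```

Run the script as the Anzo service user, because limits are applied per user session and the values for another account can differ.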

Anzo Service Account Requirements

For consistent and appropriate access management across current and future Anzo environments, it is important for the IT organization to create an enterprise-level, LDAP-managed Anzo service user account. The service account should be used when installing and running Anzo and all of the components in the platform, such as AnzoGraph, Spark, Elasticsearch, and Anzo Unstructured clusters. The service account should not have root user privileges but does need the following access:

  • The account must have read and write permissions for the Anzo component installation directories. The default Anzo server installation directory is /opt/Anzo.
  • The account must have read and write access to the shared file store, such as the NFS mount location, where all Anzo components will read and write files during the data onboarding processes. For more information about the shared file system requirements, see Deploying the Shared File System.

    Set the Anzo account User ID (UID) and Group ID (GID) to 1000. For integration between Anzo applications, it is important that the owner of files that are written to the shared file store is UID 1000, especially if you are considering Kubernetes-based deployments of Anzo applications.

  • The account must have a home directory on the Anzo host server.
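
The account requirements above can be verified with a short script run as the service user. This is a minimal sketch under stated assumptions: the function names are illustrative, the check treats UID 0 as the only indicator of root privileges, and it does not inspect sudo rules or LDAP membership.

```python
import os
import pwd

EXPECTED_ID = 1000  # UID and GID recommended above


def check_account(uid, gid, home_exists, access_by_path):
    """Return a list of problems; an empty list means every check passed.

    access_by_path maps a directory to True when the account can read and write it.
    """
    problems = []
    if uid == 0:
        problems.append("account has root privileges (UID 0)")
    if uid != EXPECTED_ID:
        problems.append(f"UID is {uid}, expected {EXPECTED_ID}")
    if gid != EXPECTED_ID:
        problems.append(f"GID is {gid}, expected {EXPECTED_ID}")
    if not home_exists:
        problems.append("home directory is missing")
    for path, ok in access_by_path.items():
        if not ok:
            problems.append(f"no read/write access to {path}")
    return problems


def inspect_current_user(paths):
    """Gather the facts for the user running this script (Unix only)."""
    entry = pwd.getpwuid(os.getuid())
    access = {p: os.access(p, os.R_OK | os.W_OK) for p in paths}
    return check_account(entry.pw_uid, entry.pw_gid, os.path.isdir(entry.pw_dir), access)


# Example call; /opt/Anzo is the default install directory, the share path is hypothetical:
# print(inspect_current_user(["/opt/Anzo", "/opt/anzoshare"]))
```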

Supported Single Sign-On Providers

Anzo supports the following single sign-on (SSO) protocols:

  • Basic SSO
  • Facebook OAuth
  • JSON Web Tokens (JWT)
  • Kerberos
  • OpenID Connect (OIDC)
  • Security Assertion Markup Language (SAML)
  • Spring Security OAuth2

For information about configuring SSO access, see Connecting to an SSO Provider.

Firewall Requirements

The table below lists the TCP ports to open on the Anzo host.

Port Description Access Needed...
61616 Anzo port used by the software development kit (SDK) and command line interface (CLI)
  • Between Anzo and users.
61617 Anzo SSL port used by the SDK and CLI
  • Between Anzo and users.
8022 Anzo SSH service port
  • Between Anzo and users.
8945 Anzo Administration service port
  • Between Anzo and users.
8946 Anzo Administration service SSL port
  • Between Anzo and users.
80 Application HTTP port
  • Between Anzo and users.
443 Application HTTPS port
  • Between Anzo and users.
3389 LDAP port
  • Between Anzo and the LDAP server.
9393 (optional) Java Management Extensions (JMX) port. Enable this port if you want to connect to Anzo from a JMX client.
  • Between Anzo and the JMX client.
9394 (optional) JMX SSL port. Enable this port if you want to make a secure connection to Anzo from a JMX client.
  • Between Anzo and the JMX client.
5700 The Anzo protocol (gRPC) port for secure communication between AnzoGraph and Anzo. For more information about the communication between Anzo and AnzoGraph, see Firewall Requirements in AnzoGraph Server Requirements.
  • Between Anzo and the AnzoGraph leader server.
5600 AnzoGraph's SSL system management port
  • Between Anzo and the AnzoGraph leader server.
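
Once the firewall rules are in place and Anzo is running, the ports in the table can be probed from a user workstation. The following is a minimal sketch, assuming a plain TCP connection attempt is an acceptable test. The function names and the host name in the example are hypothetical. A successful connection shows only that something is listening and reachable, not that the service behind the port is Anzo.

```python
import socket

# Ports from the table above that are needed between Anzo and users.
# The optional JMX ports (9393, 9394) are omitted.
ANZO_USER_PORTS = [61616, 61617, 8022, 8945, 8946, 80, 443]


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def closed_ports(host, ports, timeout=2.0):
    """Return the subset of ports that could not be reached on the host."""
    return [port for port in ports if not is_port_open(host, port, timeout)]


# Example call (the host name is hypothetical):
# print(closed_ports("anzo.example.com", ANZO_USER_PORTS))
```

Ports 5600 and 5700 are needed between Anzo and the AnzoGraph leader server, so probe them from the Anzo host against the AnzoGraph leader rather than from a workstation.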

File Storage Requirements

Anzo needs to have read and write access to a file storage system that can be shared between Anzo and all AnzoGraph, Anzo Unstructured, ETL Engine, and Elasticsearch servers. The supported storage systems are NFS, Hadoop Distributed File System (HDFS), File Transfer Protocol (FTP or FTPS) systems, Google Cloud Platform (GCP) storage, and Amazon Simple Storage Service (S3). In almost all cases, organizations create an NFS share and mount it on all of the servers in the Anzo environment. Mounted network file systems offer the best support and performance for reading and writing files.

For details and guidance on choosing the file system, see Deploying the Shared File System.
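
When NFS is used, it is worth confirming on each server that the shared directory is actually backed by the NFS mount and not by local disk. The sketch below parses the Linux mount table to find which file system holds a given path. The function names and the example path are hypothetical, and the parser does not decode escaped characters (such as `\040` for a space) in mount points.

```python
def find_mount(mounts_text, path):
    """Return (mount_point, fs_type) for the longest mount point containing path."""
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fs_type)
    return best


def is_on_nfs(mounts_text, path):
    """Return True when the path lives on a file system of type nfs or nfs4."""
    mount = find_mount(mounts_text, path)
    return mount is not None and mount[1] in ("nfs", "nfs4")


# Example call on a Linux host (the share path is hypothetical):
# with open("/proc/mounts") as handle:
#     print(is_on_nfs(handle.read(), "/opt/anzoshare"))
```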

Standalone Spark Server Requirements

Anzo includes an embedded Spark ETL engine to integrate data from various sources. Depending on your server configuration, the embedded engine might not be sufficient for ingesting very large amounts of data. To support ingestion of large data sets, you can install standalone ingestion servers. The table below lists the recommended configuration for standalone Spark servers.

Component Recommendation
Available RAM 100+ GB
Disk Space 200+ GB
vCPU 16+