Anzo Requirements

This page provides important guidelines to follow when choosing the hardware and software for servers that host Anzo.

For information about Anzo Unstructured architecture requirements, see Anzo Unstructured Requirements and Recommendations.

Hardware Requirements

Cambridge Semantics provides above-average production hardware specifications as a guideline. These specifications are similar to what Cambridge Semantics currently provisions as its standard hosted environment. Larger production data sets that run interactive queries may require significantly more powerful hardware and more RAM. Keep in mind that you are installing both a high-performance graph database server and a fully featured application server. Provision production server hardware accordingly to avoid performance issues.

The table below provides a summary of the recommended hardware for production servers and the minimum requirements for test servers.

Component | Minimum | Recommended | Guidelines
Available RAM | 8 GB | 32 GB or more | Anzo needs enough RAM to map the database files into memory and run Anzo processes. If you run large queries with joins across datasets, significant RAM is needed to hold intermediate results in memory. Note: The Java VisualVM (jvisualvm) program included with Anzo enables you to determine whether a server is memory-bound.
Disk space and type | 10 GB (Anzo Server), 100 GB (Data) | 100 GB (Anzo Server), 1+ TB (Data) | See File Storage Requirements below.
CPU | 4 cores, 2.2 GHz | 8 cores, 3 GHz+ | Once you provision sufficient RAM and a high-performing I/O subsystem, performance depends on raw CPU capabilities. Keep in mind that you are provisioning for both a production database and a busy application server. Always use multi-core CPUs. More cores and a higher clock speed can make a dramatic difference in the performance of interactive queries.
Architecture | 64-bit | 64-bit | Cambridge Semantics supports only the 64-bit versions of the server for production use.
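
The following is a minimal sketch, not an official Anzo tool, of how you might compare a Linux host with the minimum test-server values in the table above. It assumes that /proc/meminfo, nproc, and getconf are available, as they typically are on RHEL/CentOS. The function names report and kb_to_gb are hypothetical helpers introduced for this example. Disk space is not checked.

```shell
# Sketch: compare a Linux host with the minimum test-server values.
# Assumes /proc/meminfo, nproc, and getconf exist (typical on RHEL/CentOS).

MIN_RAM_GB=8   # minimum RAM from the hardware table
MIN_CORES=4    # minimum core count from the hardware table

# report LABEL ACTUAL REQUIRED -> prints OK or LOW for one numeric check
report() {
  if [ "$2" -ge "$3" ]; then
    echo "OK: $1 $2 (minimum $3)"
  else
    echo "LOW: $1 $2 (minimum $3)"
  fi
}

# kb_to_gb KB -> whole gigabytes, rounded to the nearest GB, because the
# kernel reports slightly less memory than is physically installed
kb_to_gb() {
  echo $(( ($1 + 524288) / 1048576 ))
}

mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null)
cores=$(nproc 2>/dev/null)
bits=$(getconf LONG_BIT 2>/dev/null)

# Missing values fall back to 0 so that the check reports LOW instead of failing
report "RAM (GB)" "$(kb_to_gb "${mem_kb:-0}")" "$MIN_RAM_GB"
report "CPU cores" "${cores:-0}" "$MIN_CORES"
report "Architecture bits" "${bits:-0}" 64
```

To check a production candidate instead, set MIN_RAM_GB and MIN_CORES to the recommended values (32 and 8).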

Software Requirements

This section lists the software requirements for Anzo servers as well as user resource tuning recommendations and supported single sign-on providers.

Component | Minimum | Recommended | Guidelines
Operating System | RHEL/CentOS 6, Windows 2008 | RHEL/CentOS 7, Windows 10 | See Tuning User Resource Limitations (ulimits) below for information about setting ulimits on UNIX and Linux operating systems.
Microsoft Excel | Excel 2003 | Excel 2007+ |
Web Browser | Firefox 62+, Chrome 74+, Safari 12+ | Chrome |

Tuning User Resource Limitations (ulimits)

Cambridge Semantics recommends that you tune the user resource limits (ulimits) on your Linux distribution to raise the limits for certain resources. The recommendations are listed below:

  • Increase the open files limit to at least 4096.
  • Increase the limit for the following resources to unlimited:
    • cpu time
    • file locks
    • file size
    • max memory size
    • max user processes
    • virtual memory

To view the current ulimits, run ulimit -a. For example, the default ulimits for a CentOS 7.5 operating system are shown below:

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 79607
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

To change the value for a resource in the current shell session, run the following command:

ulimit -option new_value

For example, the following command raises the open files limit to the recommended minimum of 4096:

ulimit -n 4096
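
As an illustration only, the sketch below compares the current shell's limits with the recommendations in this section. It assumes a bash shell, where the flag letters match the ulimit -a output shown above. The helper names limit_ok and check_limit are hypothetical and not part of Anzo.

```shell
# Sketch: compare the current shell's limits with the recommended values.
# Flag letters follow the bash `ulimit -a` listing.

# limit_ok CURRENT REQUIRED -> status 0 if CURRENT satisfies REQUIRED.
# REQUIRED is a number (treated as a floor) or the word "unlimited".
limit_ok() {
  if [ "$1" = "unlimited" ]; then
    return 0
  fi
  if [ "$2" = "unlimited" ]; then
    return 1
  fi
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # not a number, so it cannot be compared
  esac
  [ "$1" -ge "$2" ]
}

# check_limit FLAG NAME REQUIRED -> prints one status line for a resource
check_limit() {
  if ! current=$(ulimit "-$1" 2>/dev/null); then
    echo "SKIP: $2 (this shell does not report -$1)"
    return 0
  fi
  if limit_ok "$current" "$3"; then
    echo "OK: $2 = $current"
  else
    echo "LOW: $2 = $current (recommended: $3)"
  fi
}

check_limit n "open files" 4096
check_limit t "cpu time" unlimited
check_limit x "file locks" unlimited
check_limit f "file size" unlimited
check_limit m "max memory size" unlimited
check_limit u "max user processes" unlimited
check_limit v "virtual memory" unlimited
```

Run the script as the user account that starts Anzo, because limits are applied per user and per session.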

Supported Single Sign-On Providers

Anzo supports the following single sign-on (SSO) protocols:

  • Basic SSO
  • Facebook OAuth
  • JSON Web Tokens (JWT)
  • Kerberos
  • OpenID Connect (OIDC)
  • Security Assertion Markup Language (SAML)
  • Spring Security OAuth2

Firewall Requirements

The table below lists the TCP ports to open on the Anzo host.

Port | Description | Access Needed
61616 | Anzo port used by the software development kit (SDK) and command line interface (CLI) | Between Anzo and users
61617 | Anzo SSL port used by the SDK and CLI | Between Anzo and users
8022 | Anzo SSH service port | Between Anzo and users
8945 | Anzo Administration service port | Between Anzo and users
8946 | Anzo Administration service SSL port | Between Anzo and users
80 | Application HTTP port | Between Anzo and users
443 | Application HTTPS port | Between Anzo and users
389 | LDAP port | Between Anzo and the LDAP server
9393 (optional) | Java Management Extensions (JMX) port. Enable this port if you want to connect to Anzo from a JMX client. | Between Anzo and the JMX client
9394 (optional) | JMX SSL port. Enable this port if you want to make a secure connection to Anzo from a JMX client. | Between Anzo and the JMX client
5700 | The Anzo protocol (gRPC) port for secure communication between AnzoGraph and Anzo. For more information about the communication between Anzo and AnzoGraph, see Firewall Requirements in AnzoGraph Server Requirements. | Between Anzo and the AnzoGraph leader server
5600 | AnzoGraph's SSL system management port | Between Anzo and the AnzoGraph leader server
8100 | Port used when Anzo loads many statements in parallel, such as when loading a large data model from Anzo to AnzoGraph | From Anzo to each of the servers in the AnzoGraph cluster
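
As a hedged sketch, the script below prints firewalld commands for the user-facing ports in the table above, without running them. It assumes that firewalld manages the host firewall, as is common on RHEL/CentOS 7. The function name print_open_port_commands is hypothetical. The script deliberately omits the LDAP and AnzoGraph ports, because the correct rules for those depend on where the LDAP and AnzoGraph servers sit in your network.

```shell
# Sketch: print (do not run) firewalld commands for the user-facing Anzo ports.
# Assumes firewalld is the host firewall; remove any ports you do not use.

USER_PORTS="61616 61617 8022 8945 8946 80 443"
OPTIONAL_JMX_PORTS="9393 9394"   # printed as comments; enable only if needed

# print_open_port_commands PORT... -> one firewall-cmd line per TCP port
print_open_port_commands() {
  for port in "$@"; do
    echo "firewall-cmd --permanent --add-port=${port}/tcp"
  done
}

# Word splitting on the unquoted variable is intentional: one argument per port
print_open_port_commands $USER_PORTS

for port in $OPTIONAL_JMX_PORTS; do
  echo "# optional JMX: firewall-cmd --permanent --add-port=${port}/tcp"
done

# Permanent rules take effect only after a reload
echo "firewall-cmd --reload"
```

Review the printed commands, then run them as root on the Anzo host.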

File Storage Requirements

Anzo supports reading from and writing to storage systems such as a mounted NFS share, Hadoop Distributed File System (HDFS), File Transfer Protocol (FTP or FTPS) servers, Google Cloud Platform (GCP) storage, and Amazon Simple Storage Service (S3).

Set up a storage system that is accessible by both Anzo and AnzoGraph. Depending on your infrastructure and use case, you might need to have enough storage space available for storing source data files, RDF load files, ETL job files, and other supporting files.

For more information about connecting to file storage, see Connecting to a File Store.

Standalone Ingestion Server Requirements

Anzo includes an embedded Spark ETL engine to integrate data from various sources. Depending on your server configuration, the embedded engine might not be sufficient for ingesting very large amounts of data. To support ingestion of large data sets, you can install standalone ingestion servers. The table below lists the recommended configuration for standalone data ingestion servers.

Component | Recommendation
Available RAM | 100+ GB
Disk Space | 200+ GB
CPU | 16+ cores