Platform Shared File Storage Requirements

The Anzo Server and all other platform host servers need read and write access to a shared file storage system. Though users can connect to and import files from various types of long-term storage systems, such as the Hadoop Distributed File System (HDFS), File Transfer Protocol (FTP/S) systems, Google Cloud Platform (GCP) storage, Azure Cloud Storage, and Amazon Simple Storage Service (S3), those systems may lack POSIX support and typically offer slower file transfer performance.

For the best read and write performance and seamless interoperability between Anzo components, deploy a Network File System (NFS) and mount it at the same location on all of the host servers in the platform.
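Because every component reads and writes through the same path, it is worth verifying the mount on each host before installing anything. The following is a minimal sketch that assumes a hypothetical mount point of /opt/anzoshare; it confirms that the path is an actual mount point (not a same-named local directory) and that files can be written and read back.

    import os
    import sys
    import uuid

    # Hypothetical mount point: substitute the path where the NFS is
    # mounted on your platform hosts.
    MOUNT_POINT = "/opt/anzoshare"

    def check_shared_mount(path):
        """Verify that the shared file store is mounted and writable."""
        # Confirm the path is a real mount point, not a local directory
        # that happens to have the same name.
        if not os.path.ismount(path):
            sys.exit(path + " is not a mounted file system on this host")

        # Round-trip a uniquely named probe file to confirm read/write access.
        probe = os.path.join(path, ".mount-probe-" + uuid.uuid4().hex)
        try:
            with open(probe, "w") as f:
                f.write("ok")
            with open(probe) as f:
                assert f.read() == "ok"
        finally:
            if os.path.exists(probe):
                os.remove(probe)
        print(path + " is mounted and writable")

    if __name__ == "__main__":
        check_shared_mount(MOUNT_POINT)

Run the check as the platform service user on each host server so that it also exercises that account's permissions on the share.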

If you plan to set up Kubernetes (K8s) integration for dynamic deployments of components, an NFS is required. Other file and object stores are not supported for K8s deployments.

Below are guidelines to follow when creating the NFS.

NFS Guidelines

Follow these key recommendations when creating the NFS for the Anzo platform:

  • Use NFS Version 4 or later.
  • Provision SSD disk types for the best performance.
  • To ensure appropriate file ownership and smooth integration between components, create the NFS using the same service user account that runs the other platform components (a quick ownership check is sketched after this list). For more information, see Platform Service User Account Requirements.
  • When determining the size of the NFS, consider your workload and use cases. Make sure there is enough storage space for source data files, exported RDF files, Elasticsearch indexes, and any other files that you plan to store on the NFS.

    Cloud-based NFS servers often perform better when resources are over-provisioned. When using a cloud-based VM for the NFS, it may be beneficial to provision more CPU, disk space, and RAM than the workload strictly requires.
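As a quick sanity check for the service account guideline above, the sketch below compares the owner of the mount point against the expected service user. The mount point /opt/anzoshare and the account name anzo are hypothetical; substitute the values for your environment (see Platform Service User Account Requirements).

    import os
    import pwd
    import sys

    # Hypothetical values: substitute your actual mount point and the
    # platform service user account.
    MOUNT_POINT = "/opt/anzoshare"
    SERVICE_USER = "anzo"

    def check_ownership(path, expected_user):
        """Confirm the shared mount is owned by the platform service account."""
        uid = os.stat(path).st_uid
        try:
            owner = pwd.getpwuid(uid).pw_name
        except KeyError:
            # An unmapped UID often indicates an NFSv4 ID-mapping problem.
            sys.exit(path + " is owned by unmapped UID " + str(uid))
        if owner != expected_user:
            sys.exit(path + " is owned by '" + owner +
                     "', expected '" + expected_user + "'")
        print(path + " is owned by '" + expected_user + "' as expected")

    if __name__ == "__main__":
        check_ownership(MOUNT_POINT, SERVICE_USER)

An unmapped UID is worth catching explicitly because NFSv4 maps owners by name rather than numeric ID, so mismatched ID mapping between hosts can surface as files owned by an unknown user.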