Sizing Guidelines for Disk-Based Storage
For fast performance and scalability, AnzoGraph DB stores all data in memory. Data is persisted to disk as a backup and so that graphs are automatically reloaded into memory when AnzoGraph DB is restarted. Queries do not access the data on disk, however, because all of the data is cached in memory, and accessing data in memory is much faster than retrieving it from disk.
When deploying large memory-optimized servers is not feasible, however, AnzoGraph DB can be configured to operate as a disk-based graph database. In this configuration (called "Paged Data"), data is loaded into AnzoGraph DB, converted to AnzoGraph DB's internal storage format, and persisted to disk without being retained in memory. Data is then paged into memory from disk as requested for analytic operations. For details about database operations in paged data mode, see Enabling Paged Data Mode (Preview).
The table below lists the disk and memory sizing requirements and guidelines to follow if you are considering enabling disk-based storage. For software and firewall requirements, see Server and Cluster Requirements.
Hardware Requirements
Component | Recommendation | Guidelines |
---|---|---|
RAM | 100+ GB | |
Disk Size | 500+ GB | The disk size should be at least 4X the size of the data at rest. For example, loading 1 TB of data requires a 4 TB disk to support paging operations. |
Disk Type | SSD | The speed of the disk that hosts the persisted data affects query performance. For the best performance, store the persistence directory on a fast disk, such as an SSD. You can relocate the default persistence directory from the AnzoGraph DB file system to a separate location. See Relocating AnzoGraph DB Directories for more information. Cambridge Semantics recommends that you do not store persisted data on an NFS-mounted disk because of the network overhead that it introduces. |
CPU | 32 | A greater number of multi-core CPUs with high clock speeds can make a dramatic difference in the performance of paged data queries. Intel processors are preferred, but AnzoGraph DB is supported on newer AMD EPYC processors. Older AMD processors are not supported. |
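The disk and CPU guidelines in the table can be turned into a quick pre-deployment check. The following Python sketch is illustrative only and is not part of AnzoGraph DB: the function names `required_disk_gb` and `check_host` are hypothetical, the 500 GB recommendation is assumed to act as a floor on top of the 4X rule, and the CPU value of 32 is assumed to be comparable to the logical CPU count that the operating system reports. RAM is not checked because the Python standard library has no portable way to read it.

```python
import os
import shutil

# Values taken from the Hardware Requirements table.
DISK_MULTIPLIER = 4      # disk should be at least 4X the data at rest
MIN_DISK_GB = 500        # "500+ GB" recommendation (assumed to be a floor)
RECOMMENDED_CPUS = 32    # assumed to mean logical CPUs as seen by the OS


def required_disk_gb(data_at_rest_gb: float) -> float:
    """Return the disk size suggested by the 4X guideline, in GB."""
    return max(MIN_DISK_GB, DISK_MULTIPLIER * data_at_rest_gb)


def check_host(persistence_dir: str, data_at_rest_gb: float) -> dict:
    """Compare the disk hosting persistence_dir and the CPU count
    against the guidelines. Sizes use decimal gigabytes (1e9 bytes)."""
    disk_total_gb = shutil.disk_usage(persistence_dir).total / 1e9
    cpus = os.cpu_count() or 0  # cpu_count() may return None
    needed_gb = required_disk_gb(data_at_rest_gb)
    return {
        "disk_total_gb": disk_total_gb,
        "disk_needed_gb": needed_gb,
        "disk_ok": disk_total_gb >= needed_gb,
        "cpus": cpus,
        "cpus_ok": cpus >= RECOMMENDED_CPUS,
    }


if __name__ == "__main__":
    # 1 TB (1000 GB) of data at rest, as in the table's example.
    print(required_disk_gb(1000))
    # "." stands in for the persistence directory location.
    print(check_host(".", 1000))
```

For the table's example of 1 TB of data at rest, `required_disk_gb(1000)` returns 4000, which matches the 4 TB disk that the guideline calls for.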
Ultimately, queries run significantly slower when data is stored on disk than when it is stored in memory. If fast performance is a requirement, store the data in memory and do not configure AnzoGraph DB for paged data operations. For more information, see Enabling Paged Data Mode (Preview).