AnzoGraph DB 2.1 Releases

To view the release notes for an AnzoGraph DB 2.1 release, select the version from the list below. The release notes for each version describe the product changes from the previous version.

AnzoGraph DB Version 2.1.8

This section describes the improvements and issues that were fixed in AnzoGraph DB Version 2.1.8.

CVE Found in Geospatial Extension Dependency

A Common Vulnerabilities and Exposures (CVE) issue was found in FasterXML Jackson Databind, which is used in the AnzoGraph DB Geospatial extension. The library did not properly secure entity expansion, which left it vulnerable to XML external entity (XXE) attacks. The highest threat from the vulnerability was to data integrity. AnzoGraph DB Version 2.1.8 uses FasterXML Jackson Databind version 2.11.0, which resolves the CVE.

AnzoGraph DB Version 2.1.7

This section describes the improvements and issues that were fixed in AnzoGraph DB Version 2.1.7.

Support Copy of Property Graphs (RDF*)

In previous versions, if a user copied a labeled property graph to a file using the COPY command, the resulting file included all of the triples but excluded all of the edge properties. Version 2.1.7 resolves the issue so that properties are included when property graphs are copied to files. For more information about copying graphs to files, see Copying Graphs to Files.

Memory Exhausted Error when Parsing Large Query

When an extremely large query (775+ KB) was run, the parser failed to parse the query and returned a "Memory Exhausted" error. Version 2.1.7 resolves the issue by increasing the stack size for the parser.

Initial Queries Failed on 16+ Node Clusters

When a cluster with 16 or more nodes was deployed, all of the initial internal queries failed and the cluster became unusable. The problem was due to the number of socket descriptors that were acquired for the HTTP endpoint. Version 2.1.7 resolves the issue by changing the method used to acquire socket descriptors.

AnzoGraph DB Version 2.1.6

This section describes the improvements and issues that were fixed in AnzoGraph DB Version 2.1.6.

Query Console Returned "No Records" for DESCRIBE Queries

Version 2.1.6 resolves an issue that prevented results of DESCRIBE queries from being displayed in the Query Console.

Stability Improvements for Low Memory Conditions

Version 2.1.6 includes enhancements that improve the stability of AnzoGraph DB when it is operating in low memory conditions and reduce the likelihood that the AnzoGraph DB process will be terminated by the Linux out-of-memory (OOM) killer.

Improved Memory Usage Reporting

In previous versions, the memory statistic that was sent to the client for memory usage reporting was the RSS value instead of the internal AnzoGraph DB usage value. This resulted in the Admin Console displaying a significantly higher memory usage value than what AnzoGraph DB was actually using. Version 2.1.6 corrects the issue for more accurate memory usage reporting.

Failed to Load Files in HDFS Subdirectories

In previous versions, when files were loaded to AnzoGraph DB from HDFS, only the files from the parent directory specified in the LOAD command were loaded. Files in child directories under the parent were ignored. Version 2.1.6 resolves the issue so that loads from HDFS include the files in subdirectories under the parent directory.

AnzoGraph DB Version 2.1.5

This section describes the improvements and issues that were fixed in AnzoGraph DB Version 2.1.5.

Memory Allocator Failed to Release Temporary Memory

As queries were executed against data in AnzoGraph DB, the memory allocator was accumulating memory but could fail to release all of it back to the operating system. This resulted in memory usage steadily increasing even though new data was not added. In some cases, AnzoGraph DB retained all of the available memory and was shut down by the operating system. Version 2.1.5 resolves the issue to ensure that the allocator releases memory when it is finished using it.

AnzoGraph DB Version 2.1.4

This section describes the improvements and issues that were fixed in AnzoGraph DB Version 2.1.4.

Inconsistent Results for DELETE Followed by INSERT

When a DELETE query was immediately followed by an INSERT query and the INSERT contained some of the same triples as the DELETE, the number of triples that were inserted could be inconsistent across runs. Version 2.1.4 fixes the issue by changing the way AnzoGraph DB handles the intermediate results produced by the WHERE clause in update queries. Previously the WHERE clause results were streamed. In Version 2.1.4, the results are materialized before further processing.

Materializing the intermediate results increases the amount of memory that is used for performing updates. Depending on the size of the result set produced by the WHERE clause, you may notice that INSERT and DELETE queries temporarily use more memory than they did before.
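
The difference between streaming and materializing can be illustrated with a toy sketch. This is not AnzoGraph DB's implementation: the store is a plain Python set of tuples, and update_materialized, match, and make_new are hypothetical names introduced only for illustration. The point is that the WHERE results are collected in full before any triple is deleted or inserted, at the cost of holding the whole result list in memory.

```python
def update_materialized(store, match, make_new):
    """Apply a DELETE/INSERT pair after fully evaluating the WHERE pattern."""
    # Materialize: evaluate the pattern against a stable view of the store first.
    matches = [t for t in store if match(t)]
    for t in matches:
        store.discard(t)          # DELETE template
    for t in matches:
        store.add(make_new(t))    # INSERT template
    return len(matches)

# Hypothetical data: increment every "age" value by one.
store = {("alice", "age", 30), ("bob", "age", 41), ("carol", "name", "Carol")}
n = update_materialized(
    store,
    match=lambda t: t[1] == "age",
    make_new=lambda t: (t[0], "age", t[2] + 1),
)
print(n, sorted(store, key=str))
```

Because matches is built before the store is modified, the outcome does not depend on the order in which triples are visited. The list itself is the extra memory that the note above refers to.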

Inconsistent Results Across Environments

Version 2.1.4 resolves an issue where inconsistent results were returned when the same query was run on different clusters. One cluster returned a larger result set than the other. The issue occurred because on the environment where OWL statistics were enabled, a variable was incorrectly tagged as unique when it was not. Since it was considered unique, the DISTINCT operation was skipped in the COUNT calculation and too many results were returned.
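
The effect of skipping the DISTINCT step can be seen with a very small example. The bindings below are made up for illustration and do not come from an AnzoGraph DB dataset.

```python
# Hypothetical bindings for one variable; "p2" appears twice.
bindings = ["p1", "p2", "p2", "p3"]

count_all = len(bindings)            # counts every binding, duplicates included
count_distinct = len(set(bindings))  # counts each distinct value once
print(count_all, count_distinct)     # 4 3
```

If a variable is wrongly assumed to be unique, the first number is reported where the second was requested.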

Slower than Expected INSERT Performance

When comparing performance between a single-server setup and a cluster with the same total number of CPUs, an INSERT query ran much slower on the single server. The slower performance was due to mutex contention. Version 2.1.4 resolves the issue by reducing the time spent waiting on mutexes.

AnzoGraph DB Version 2.1.3

This section describes the improvements and issues that were fixed in AnzoGraph DB Version 2.1.3.

Enable Extensions Directory by Default

In Version 2.1.3, AnzoGraph DB is configured by default to scan and load to the database any extensions in the <install_path>/lib/udx directory. In previous versions, users needed to edit the configuration file (<install_path>/config/settings.conf) to set the extensions_dir value and enable extensions. If you want to load extensions from a different directory, edit the extensions_dir value in settings.conf. For more information, see Changing System Settings.

Query with Property Path Returned Incorrect Results

In certain circumstances, a property path query could return extra results or fail with an error. The problem occurred because it was possible for the planning stage of the query to distribute the incorrect column to the execution engine. Version 2.1.3 corrects the query plan for property path queries.

Crash after Out of Memory Error

When AnzoGraph DB canceled a query because there was insufficient memory available, there was a circumstance where AnzoGraph DB could try to free the same memory resource twice. The double-free of resources resulted in a crash. Version 2.1.3 resolves the issue by ensuring that AnzoGraph DB does not try to release the same resource more than once.

AnzoGraph DB Version 2.1.2

This section describes the improvements and issues that were fixed in AnzoGraph DB Version 2.1.2.

ASK Returns False if Graph does not Exist

In previous versions, ASK queries resulted in a "No such graph or view" or "Bad request" error if the query referenced a graph that did not exist. In Version 2.1.2, AnzoGraph DB returns "false" for ASK queries that reference non-existent graphs.

Option to Return Empty Result when Graph does not Exist

In previous versions, AnzoGraph DB returned a "No such graph or view" error and aborted the query if a query referenced a graph that did not exist. The system could not be configured to return an empty result instead of an error. In Version 2.1.2, referencing a non-existent graph still produces an error by default, but users have the option to configure AnzoGraph DB to return an empty result instead. To configure the system to return empty results instead of an error when a referenced graph does not exist, follow the instructions below to set the enable_unbound_variables value to true:

  1. Open <install_path>/config/settings.conf in a text editor.
  2. Locate the following line in the file:
    #enable_unbound_variables=false
  3. Uncomment the line and change the false value to true.
    enable_unbound_variables=true
  4. Save and close the file, and then restart AnzoGraph DB to apply the configuration change. For more information about changing settings, see Changing System Settings.
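
For scripted deployments, the manual edit in the steps above could be sketched as follows. This is a minimal sketch that assumes one key=value pair per line with # as the comment marker, as in the lines shown in the steps. The set_setting helper is hypothetical and is not part of AnzoGraph DB. It only transforms text, so writing the file back and restarting the database are still required.

```python
import re

def set_setting(conf_text, key, value):
    """Return conf_text with `key` uncommented and set to `value`."""
    # Match the key on its own line, with or without a leading "#".
    pattern = re.compile(
        r"^[ \t]*#?[ \t]*" + re.escape(key) + r"[ \t]*=.*$", re.MULTILINE
    )
    new_line = f"{key}={value}"
    if pattern.search(conf_text):
        # A function is used as the replacement so the value is inserted verbatim.
        return pattern.sub(lambda m: new_line, conf_text, count=1)
    # Key not present at all: append it on a line of its own.
    sep = "" if conf_text.endswith("\n") or not conf_text else "\n"
    return conf_text + sep + new_line + "\n"

before = "#sysmgr_port=5600\n#enable_unbound_variables=false\n"
after = set_setting(before, "enable_unbound_variables", "true")
print(after)
```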

In addition to allowing queries that reference non-existent graphs to succeed, setting enable_unbound_variables to true also configures AnzoGraph DB to ignore unbound variables elsewhere in queries. For example, by default (when enable_unbound_variables=false), if a query includes a variable in the SELECT list that is not referenced in a WHERE clause pattern, the query is aborted and returns a "Named variable not in contained WHERE clause" error. When enable_unbound_variables=true, AnzoGraph DB will not warn the user about unbound variables. Instead, the results will be empty for the unbound variable. For example:

SELECT ?unbound ?person ?name
FROM <http://csi.com/people>
WHERE {?person <http://csi.com/people#firstname> ?name}
LIMIT 5
 unbound | person      | name
---------+-------------+---------
         | person35632 | Ross
         | person20216 | Quin
         | person35859 | Kellie
         | person2551  | Maris
         | person24963 | Madonna
5 rows

Added Support for RDF Graph Store HTTP Protocol

Version 2.1.2 adds support for the SPARQL 1.1 Graph Store HTTP Protocol via the new rdf-graph-store endpoint. The AnzoGraph DB front end also supports the new graph store protocol via the data endpoint. The graph store protocol supports GET, POST, PUT, and DELETE HTTP methods. For more information, see Accessing the SPARQL and RDF Endpoints.
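
As an illustration of what a Graph Store request looks like, the sketch below builds (but does not send) a request using only the Python standard library. The graph is identified with the graph query parameter, as defined by the W3C protocol. The base URL is a placeholder assumption, so substitute the host, port, and path of your own deployment. The graph_store_request helper is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder base URL (an assumption); replace with your deployment's endpoint.
GRAPH_STORE = "http://example.com/rdf-graph-store"

def graph_store_request(method, graph_iri, body=None, content_type=None):
    """Build, without sending, a Graph Store HTTP Protocol request."""
    # The target graph is passed as a percent-encoded query parameter.
    url = GRAPH_STORE + "?" + urlencode({"graph": graph_iri})
    headers = {}
    if content_type:
        headers["Content-Type"] = content_type
    if method == "GET":
        headers["Accept"] = "text/turtle"
    return Request(url, data=body, headers=headers, method=method)

req = graph_store_request(
    "POST",
    "http://csi.com/people",
    body=b'<http://csi.com/people#p1> <http://csi.com/people#firstname> "Ross" .',
    content_type="text/turtle",
)
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would send the request. That step is left out here because the endpoint address is not known.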

Known Issue

While AnzoGraph DB now supports the Graph Store HTTP Protocol, it is a known issue that not all return codes are in accordance with the W3C recommendation. The following exceptions will be addressed in a later release:

  • If a GET request includes an invalid Accept value, the specification states that "406 Not Acceptable" should be returned. In Version 2.1.2, AnzoGraph DB returns "400 Bad Request."
  • For HTTP DELETE operations, the specification states that "404 Not Found" should be returned if a user deletes a graph that does not exist. In Version 2.1.2, AnzoGraph DB returns "400 Bad Request."
  • If a PUT or POST request includes an invalid Content-Type, the specification states that "415 Unsupported Media Type" should be returned. In Version 2.1.2, AnzoGraph DB returns "200 OK."
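
Client code or tests that check status codes may need to account for the exceptions listed above. One way to record them is a small lookup table, sketched below. The table contents come from the list above. The names KNOWN_DEVIATIONS and expected_status, and the condition strings, are hypothetical.

```python
# (method group, condition) -> (status the W3C recommendation calls for,
#                               status returned by AnzoGraph DB 2.1.2)
KNOWN_DEVIATIONS = {
    ("GET", "invalid Accept"): (406, 400),
    ("DELETE", "graph does not exist"): (404, 400),
    ("PUT/POST", "invalid Content-Type"): (415, 200),
}

def expected_status(method_group, condition, target="2.1.2"):
    """Return the status to expect from Version 2.1.2 or from the specification."""
    spec, actual = KNOWN_DEVIATIONS[(method_group, condition)]
    return actual if target == "2.1.2" else spec

print(expected_status("DELETE", "graph does not exist"))          # 400
print(expected_status("DELETE", "graph does not exist", "spec"))  # 404
```

Note that the third entry maps an error condition to a success code, so a client cannot rely on the status code alone to detect an invalid Content-Type in this version.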

Unable to Recover from Out of Memory Exception

Running several complex queries concurrently could cause AnzoGraph DB to run out of memory and then fail to recover the memory after canceling the queries. The issue occurred when AnzoGraph DB was in the process of initializing the next query as the current query was receiving the out of memory exception. Version 2.1.2 resolves the issue to ensure that memory is recovered if a new query is started while another query hits an out of memory exception.

Crash if User without Write Permission Ran Graph 500 Benchmark

If a user ran the AnzoGraph DB Graph 500 Benchmark and did not have write permission to the directory where the benchmark generated data files, AnzoGraph DB crashed instead of returning a permission denied error to the user. In Version 2.1.2, AnzoGraph DB returns an error if the user does not have access to the directory where the data is generated, and the database remains running.

AnzoGraph DB Version 2.1.1

This section describes the improvements and issues that were fixed in AnzoGraph DB Version 2.1.1.

Retrieve boot.log when Generating Xrays

In previous versions, taking an xray did not retrieve the AnzoGraph DB boot.log diagnostic file (<install_path>/internal/log/boot.log), which includes information that is logged when the database is started and stopped. The boot.log was only retrieved when a crashdump was generated. In Version 2.1.1, boot.log is included in an xray as well as a crashdump.

Invalid Content-Type Response for Construct Queries

In previous versions, AnzoGraph DB sent the following invalid Content-Type HTTP header value in response to a CONSTRUCT query: application/sparql-results+text/plain. In Version 2.1.1, AnzoGraph DB sends an accurate Content-Type of "text/turtle; charset=utf-8".
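
To show why the old value is invalid, the sketch below applies a simple structural check: a media type has exactly one slash separating a type from a subtype, optionally followed by parameters. The parse_content_type helper is hypothetical and deliberately simplified. It is not a complete implementation of the HTTP header grammar.

```python
def parse_content_type(value):
    """Split a Content-Type value into (media_type, params).

    Raises ValueError if the media type is not of the form type/subtype.
    """
    media_type, *raw_params = [part.strip() for part in value.split(";")]
    main, slash, sub = media_type.partition("/")
    # Exactly one slash, with non-empty text on both sides.
    if not (main and slash and sub) or "/" in sub:
        raise ValueError(f"malformed media type: {media_type!r}")
    params = dict(p.split("=", 1) for p in raw_params if "=" in p)
    return media_type.lower(), params

print(parse_content_type("text/turtle; charset=utf-8"))

try:
    parse_content_type("application/sparql-results+text/plain")
except ValueError as err:
    print("rejected:", err)
```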

Incorrect Results for Query with Optional Clause and Filter on Bind Clause Variable

In previous versions, a query that had an optional clause and a bind statement returned incorrect results when it also included a filter expression on the variable from the bind statement. Below is an example query that meets the criteria.

select ?name ?age ?phone
where { 
  ?v <name> ?name .
  ?v <age> ?age .
  optional { ?v <phone> ?phone }
  bind (?age as ?new_age)
  filter (?phone = 1234 && ?new_age > 10)
}

This query returned too many results because the two filter expressions were not both applied. Version 2.1.1 resolves the issue so that queries like the example above return the correct results.

Diagnostic Files Could Get Corrupted

In Version 2.1.0, there was a circumstance where a crashdump could be corrupted when multiple threads tried to create the diagnostic files at the same time. Version 2.1.1 resolves the issue to ensure that only one thread creates the crashdump.

AnzoGraph DB Version 2.1.0

This section describes the new features and changes to existing components that are introduced in AnzoGraph DB Version 2.1.0.

New Features

Improvements to Existing Features

Other Changes and Fixes

Preview of New Data Science Functions

AnzoGraph DB Version 2.1.0 provides a collection of new Data Science and statistical functions that supplement the analytic functions previously available in AnzoGraph DB. These functions provide distribution probability, correlation, profiling, and frequency calculations such as those commonly used for statistical analysis and machine learning. For more information, see Geospatial Functions.

C++ and Java/JVM User-Defined Extension (UDX) Support

AnzoGraph DB Version 2.1.0 adds new Java/JVM UDX support and enhanced C++ UDX capabilities. New documentation has been added for C++ user-defined function and aggregate extension development. Work is in progress on documentation for Java/JVM UDX development and user-defined service extensions. Contact Cambridge Semantics Support if you require more immediate technical assistance with UDX development projects.

Improved Memory Allocation Performance

In Version 2.1.0, AnzoGraph DB manages memory using a thread caching allocation strategy. The strategy is implemented with the industry standard tcmalloc library from Google. When compared to previous AnzoGraph DB versions, the thread caching allocator typically decreases lock contention time and overall query run time.

Default Configuration Values Commented Out in Settings File

To help distinguish between the default AnzoGraph DB configuration values and custom values in the configuration file <install_path>/config/settings.conf, Version 2.1.0 comments out the settings that are set to the default AnzoGraph DB values. When changing settings, uncomment a setting/value pair and modify the value as needed. For more information about changing settings, see Changing System Settings.
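
A script that inspects settings.conf can use this convention to tell custom values from defaults. The sketch below assumes key=value lines with defaults commented out by a leading #, as described above. The read_settings helper and the sample text are hypothetical, although the setting names in the sample appear in these release notes.

```python
def read_settings(conf_text):
    """Split settings.conf text into (custom, defaults) dictionaries.

    Commented-out key=value lines are treated as defaults. Comment lines
    that are not key=value pairs are ignored.
    """
    custom, defaults = {}, {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        target = custom
        if line.startswith("#"):
            line, target = line.lstrip("#").strip(), defaults
        key, sep, value = line.partition("=")
        key = key.strip()
        # Skip blank lines and free-text comments.
        if sep and key and " " not in key:
            target[key] = value.strip()
    return custom, defaults

sample = """\
# System management port
#sysmgr_port=5600
enable_owlstats=false
#auto_restart_max_attempts=0
"""
custom, defaults = read_settings(sample)
print(custom)
print(defaults)
```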

Ability to Configure the System Management Port (5600)

In previous versions, the port for the system management daemon (azgmgrd) was set to 5600 and could not be changed. If an environment could not use that port, users had to remember to specify -port <alternate_port> any time azgmgrd was started or an azgctl command was run. In Version 2.1.0, the <install_path>/config/settings.conf file includes a sysmgr_port setting. The value is set to 5600 by default. To use a port other than 5600, uncomment the sysmgr_port setting and change the value to the desired port.

Changing sysmgr_port requires a restart of the system management daemon, azgmgrd, as well as the database.

Beta Release of Database Auto-Restart Feature

Version 2.1.0 includes a beta release of the database auto-restart feature. When the feature is enabled, the AnzoGraph DB system manager automatically restarts the database after a crash. The feature is disabled by default, and Cambridge Semantics recommends that you enable it only on test systems for the 2.1.0 release. The feature is controlled by the following two new settings in <install_path>/config/settings.conf:

  • auto_restart_max_attempts=0: This setting specifies the number of times the system manager should attempt to start the database after a crash. The default value of 0 disables auto-restart.
  • auto_restart_time=600: This setting specifies the number of seconds to spend attempting to restart the database. If all attempts fail and this time limit is reached, the system manager stops trying to restart the database.

Changing the auto_restart settings requires a restart of the system management daemon, azgmgrd, as well as the database.
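
The release notes do not describe how the system manager combines the two settings internally. The sketch below shows one plausible reading, in which restarts stop as soon as either the attempt limit or the time limit is reached. The restart_with_limits function and the start_db callable are hypothetical and are not part of AnzoGraph DB.

```python
import time

def restart_with_limits(start_db, max_attempts=0, time_limit=600,
                        clock=time.monotonic):
    """Retry start_db until it succeeds or a limit is reached.

    max_attempts mirrors auto_restart_max_attempts (0 disables restarts).
    time_limit mirrors auto_restart_time, in seconds.
    start_db is a callable that returns True when the database starts.
    """
    deadline = clock() + time_limit
    for attempt in range(1, max_attempts + 1):
        if clock() >= deadline:
            break
        if start_db():
            return attempt   # number of the attempt that succeeded
    return None              # gave up, or auto-restart is disabled

# Simulated outcomes: the third attempt succeeds.
outcomes = iter([False, False, True])
print(restart_with_limits(lambda: next(outcomes), max_attempts=5))  # 3
print(restart_with_limits(lambda: True))                            # None
```

With the default max_attempts of 0, the loop body never runs, which matches the documented behavior that a value of 0 disables auto-restart.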

Reduced Initial Query Compilation Time

In previous versions, each time a new query was run, AnzoGraph DB performed an extensive code compilation process to generate the most optimal code for running that query. Once the code compilation was complete, AnzoGraph DB executed the query using that code. In Version 2.1.0, when a new query is run, AnzoGraph DB compiles basic, non-optimized code and immediately executes the query using that code. The optimized compilation process continues in the background, and the optimized code is used for subsequent runs of the query. In most cases, this change reduces the execution time for the first run of a query.

In Version 2.1.0, the default value of the compile_optimized configuration setting was changed from true to background.
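
The general idea of running with quickly compiled code first and switching to optimized code later can be modeled with a toy sketch. This is not how AnzoGraph DB is implemented. QueryCache and its methods are hypothetical names, and the "plans" are ordinary Python callables standing in for compiled code.

```python
import threading

class QueryCache:
    """Toy model: run with a basic plan now, swap in an optimized plan later."""

    def __init__(self, quick_compile, optimized_compile):
        self._quick = quick_compile
        self._optimized = optimized_compile
        self._plans = {}              # query text -> (kind, callable)
        self._lock = threading.Lock()
        self._workers = []

    def run(self, query):
        with self._lock:
            entry = self._plans.get(query)
        if entry is None:
            # First run: compile cheaply, then optimize in the background.
            entry = ("basic", self._quick(query))
            with self._lock:
                self._plans[query] = entry
            worker = threading.Thread(target=self._optimize, args=(query,))
            self._workers.append(worker)
            worker.start()
        kind, plan = entry
        return kind, plan()

    def _optimize(self, query):
        plan = self._optimized(query)
        with self._lock:
            self._plans[query] = ("optimized", plan)

    def wait(self):
        """Block until all background compilations have finished."""
        for worker in self._workers:
            worker.join()

cache = QueryCache(lambda q: (lambda: 42), lambda q: (lambda: 42))
first = cache.run("SELECT * WHERE { ?s ?p ?o }")
cache.wait()
second = cache.run("SELECT * WHERE { ?s ?p ?o }")
print(first, second)  # ('basic', 42) ('optimized', 42)
```

The first run returns without waiting for the optimizer, and later runs pick up the optimized plan once it is ready.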

Enable OWL Statistics by Default

In order to generate query execution plans, AnzoGraph DB needs to gather statistics about the data. In previous versions, AnzoGraph DB captured basic statistics, such as the number of triples per graph and number of distinct subjects and predicates, by default. To aid in generating more optimal plans, users could enable more extensive statistics gathering, called OWL stats, which uses the metadata from data models to generate statistics. In Version 2.1.0, OWL stats are enabled by default. The feature is controlled by the enable_owlstats setting in <install_path>/config/settings.conf. When enable_owlstats is false (disabled), AnzoGraph DB reverts to capturing basic statistics.

Support Expressions with Median and Percentile Functions

Version 2.1.0 adds support for including expressions as arguments to MEDIAN and PERCENTILE functions. Previously AnzoGraph DB displayed an "Invalid System State" error message if an expression was input to one of those functions.

Display Error if Namespace Bindings Missing in INSERT

In previous versions, AnzoGraph DB did not produce an error if an INSERT query included triple patterns where some elements used namespace prefixes that were not defined. SELECT queries with the same patterns, however, did correctly result in an error. In Version 2.1.0, INSERT queries with undefined prefixes result in the appropriate error. For example, the following query

INSERT DATA { 
  :John a :Person ; 
  :name "John Doe" .
}

results in an error such as:

:John: URI has blank prefix, but no (PREFIX or BASE) namespace defined

Project Expressions in SELECT DISTINCT Clause

In previous versions, if a query had a SELECT DISTINCT clause with expressions that included functions (such as STR and STRAFTER) and the query did not have a GROUP BY clause, AnzoGraph DB could fail to project results and return an "Invalid System State" error. Version 2.1.0 fixes the issue so that AnzoGraph DB properly projects SELECT DISTINCT expressions that contain functions.

Display Error for Unbound Variables in WHERE Clause

In previous versions, if an INSERT query had unbound variables in the WHERE clause, AnzoGraph DB could incorrectly insert additional triples for the unbound variable instead of discarding it. In Version 2.1.0, by default AnzoGraph DB displays an error message if an INSERT query includes unbound variables. To change the behavior so that AnzoGraph DB discards unbound variables and inserts only the bound values, edit the <install_path>/config/settings.conf file to change the value of enable_unbound_variables from false to true.

Ensure Persistence Reference Counts are Valid

In previous versions, the persisted data on disk could become invalid because the reference counts on the data blocks could change between the save of the data to disk and the CRC calculation of the same data. Version 2.1.0 resolves the issue by ensuring that reference counts do not change between the save operation and the CRC calculation.

Support Limits in Subqueries

In previous versions, AnzoGraph DB returned an "Invalid System State" error if a subquery included a LIMIT clause. Version 2.1.0 resolves the issue to ensure that queries that include subqueries with LIMIT clauses can complete successfully.

Return an Error for a Query with an Empty Where Clause

In previous versions, AnzoGraph DB crashed if a query with an empty WHERE clause was run. Version 2.1.0 resolves the issue so that AnzoGraph DB returns an error message and remains online if a user runs a query with an empty WHERE clause.