Generating a Graphmart Data Profile

Just as you can generate a data profile for a dataset, you can profile a graphmart, which may include multiple datasets. When metrics are generated for a graphmart, Anzo profiles the data that results from all of the enabled layers. It reports metrics for the classes and properties in the model as well as statistics about the property values. A graphmart profile helps users perform data discovery, assess the quality of the data, and decide which types of analytics to perform.
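To make the idea of class and property metrics concrete, the following is a minimal Python sketch of what a profile computes conceptually. It is not Anzo's implementation and does not call any Anzo or AnzoGraph API. The triples, the ex: names, and the profile function are all hypothetical, and the statistics shown (count, distinct, min, max) are illustrative stand-ins rather than the actual Primary or Advanced metrics.

```python
from collections import Counter, defaultdict

RDF_TYPE = "rdf:type"

# Hypothetical toy data standing in for the triples produced by a
# graphmart's enabled layers. None of these names come from Anzo.
TRIPLES = [
    ("ex:p1", RDF_TYPE, "ex:Person"),
    ("ex:p2", RDF_TYPE, "ex:Person"),
    ("ex:c1", RDF_TYPE, "ex:Company"),
    ("ex:p1", "ex:age", 34),
    ("ex:p2", "ex:age", 41),
    ("ex:p1", "ex:name", "Ada"),
    ("ex:p2", "ex:name", "Ada"),
]


def profile(triples):
    """Return (instance counts per class, basic statistics per property)."""
    class_counts = Counter()
    values_by_property = defaultdict(list)

    for _subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            # A type triple contributes one instance to its class.
            class_counts[obj] += 1
        else:
            # Any other triple contributes one value to its property.
            values_by_property[predicate].append(obj)

    property_stats = {}
    for prop, values in values_by_property.items():
        # min/max assume every value of a property has a comparable type,
        # which holds for the toy data above.
        property_stats[prop] = {
            "count": len(values),
            "distinct": len(set(values)),
            "min": min(values),
            "max": max(values),
        }
    return dict(class_counts), property_stats


class_counts, property_stats = profile(TRIPLES)
print(class_counts)
print(property_stats)
```

In this sketch, a property whose distinct count is lower than its value count (ex:name here) is the kind of signal a data quality review might examine for duplicates.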

AnzoGraph must be online before you can generate a graphmart data profile. If you have dynamic AnzoGraph deployments enabled, AnzoGraph is provisioned automatically when the profile is generated.

  1. In the Anzo application, expand the Blend menu and click Graphmarts. Anzo displays a list of the existing graphmarts.

  2. On the Graphmarts screen, click the name of the graphmart for which you want to generate metrics. Anzo displays the Overview for that graphmart.

  3. Click the Explore tab.

  4. If the graphmart is not active, activate it. Then click the Profile Data button on the left side of the screen. Anzo displays the Data Profile dialog box.

  5. In the Data Profile dialog box, choose the Profiling Method to use. The Primary method focuses on data quality analysis. The Advanced method includes the Primary data quality metrics plus several advanced metrics for deeper data discovery and analysis. For details about the Primary and Advanced metrics, see Data Profiling Metrics.

    To run the Advanced metrics, the optional AnzoGraph C++ extensions and their dependencies must be installed. If you use dynamic, K8s-based deployments of AnzoGraph, the extensions are included. If you installed a static AnzoGraph instance with the installer, the C++ extensions were installed only if you answered yes to the "Do you want to install C++ UDXs packaged with AnzoGraph DB?" prompt. For information about the C++ dependencies, see Install the Optional C++ Extension Dependencies.

  6. Click Run Profiling to start generating the profile. The process may take several minutes, especially for large volumes of data. You can check the status of the process in the Activity Log. To stop the profiling process, click Cancel under the progress bar for the task in the Activity Log.

  7. When profiling is complete, the Profile Metrics panel expands on the Explore tab. To populate the panel, click a class or property in the Ontology, or click a class in the graph view in the middle of the screen.

    You can click the Expand icon on the right side of the screen to collapse the graph view and expand the metrics view.

For more information about the metrics that are displayed, see Data Profiling Metrics. For information about data profiling for datasets and data sources, see Generating a Dataset Data Profile and Generating a Source Data Profile.