Using Data Dictionaries

Metadata dictionaries are similar to data models in that they define the desired business meaning and structure of the data after it is onboarded to Anzo and converted to the graph model. Unlike data models, though, metadata dictionaries offer maximum flexibility for normalizing the data that comes from various sources and structures. A single dictionary can be used to link conceptually identical elements (columns) from many different data source schemas, independent of any models and mappings. The metadata dictionary structure becomes the basis for creating and reusing models and mappings. As models and mappings are generated, deleted, and recreated over time, the growing body of information about business meaning and the concepts that link source schema elements to properties in the model remain available in the data dictionaries.
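
As a rough illustration of the linking idea, and not of Anzo's internal representation or API, the following Python sketch models a dictionary as a map from concept titles to the source column labels that should resolve to them. The names DICTIONARY and resolve_concept, the "cust_id" and "customer_number" labels, and the case-insensitive matching are assumptions made for this example.

```python
# Conceptual sketch only: this does not use or mirror any Anzo API.
# A "dictionary" here maps a concept title to the source column labels
# that should be treated as the same business concept.
DICTIONARY = {
    "CustomerID": {"CustomerID", "cust_id", "customer_number"},  # hypothetical labels
    "ReviewText": {"ReviewText", "Comment", "Comments", "Review"},
}


def resolve_concept(column_label, dictionary=DICTIONARY):
    """Return the concept title for a source column label, or None if unmatched."""
    wanted = column_label.casefold()  # assumption: labels match case-insensitively
    for concept, labels in dictionary.items():
        if wanted in {label.casefold() for label in labels}:
            return concept
    return None


# Columns from two differently shaped source schemas resolve to shared concepts.
print(resolve_concept("cust_id"))
print(resolve_concept("Comments"))
```

Because the lookup depends only on labels, the same map can be reused no matter which models or mappings are later generated from it.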

This topic provides instructions for creating and managing data dictionaries.

Creating a Metadata Dictionary from a Schema

Follow the instructions below to create a new metadata dictionary from a schema.

Note: The steps below start with viewing a schema and then adding that schema to a new dictionary. That method allows for flexibility in choosing which schema tables are added to the dictionary. However, you can also create a data dictionary first and then add an entire schema to it. To do so, select Metadata Hub from the Onboard menu. On the Dictionaries screen, click the Create button and select From Schema. In the Create Metadata Dictionary dialog box, select the schema to add to the new dictionary.
  1. In the Anzo console, expand the Onboard menu and click Structured Data. Anzo displays the Data Sources screen, which lists the available data sources. For example:

  2. On the Data Sources screen, click the name of the data source for which you want to create a data dictionary. Anzo displays the Tables screen for the source. For example:

  3. Click the Add To Dictionary button. If the source has more than one schema, Anzo displays the select schema dialog box. In the drop-down list, select the schema to add to the dictionary, and then click OK. For example:

    Anzo opens the Create Metadata Dictionary From This Schema dialog box.

  4. In the dialog box, leave the Create New radio button selected.
  5. Enter a name for the dictionary in the Title field and specify an optional description in the Description field.
  6. If you want to include all of the schema tables, select the Select all tables radio button. If you want to include a subset of tables, click the Custom select radio button and then select each of the tables to add.
  7. Click Save. Anzo creates the new dictionary and displays the Concepts tab, which lists the classes and properties that are derived from the schema. For example:

  8. Click a row in the list of concepts on the left to view the concept details on the right side of the screen. Click the < character in the table to expand a class concept and view its property concepts. For example:
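
Conceptually, creating a dictionary from a schema can be thought of as treating each selected table as a class concept and each of its columns as a property concept. The Python sketch below illustrates that idea under stated assumptions. It is not Anzo's implementation, and the function name concepts_from_schema, the schema layout, and the sample table and column names are hypothetical.

```python
# Conceptual sketch only; not Anzo's implementation or API.
def concepts_from_schema(schema, tables=None):
    """Derive class concepts (with their property names) from a schema.

    schema: dict mapping a table name to a list of column names.
    tables: optional subset of table names (like a custom selection);
            None means every table in the schema is included.
    """
    selected = list(schema) if tables is None else list(tables)
    concepts = []
    for table in selected:
        if table not in schema:
            raise KeyError(f"Unknown table: {table}")
        concepts.append(
            {"type": "class", "title": table, "properties": list(schema[table])}
        )
    return concepts


# Hypothetical two-table schema.
sample_schema = {
    "Customers": ["CustomerID", "CompanyName"],
    "Orders": ["OrderID", "CustomerID"],
}

all_concepts = concepts_from_schema(sample_schema)
subset = concepts_from_schema(sample_schema, tables=["Orders"])
print([c["title"] for c in all_concepts])
print(subset[0]["properties"])
```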

Create and edit concepts as needed. See Defining Concepts in a Metadata Dictionary below for information about working with concepts.

Creating a Metadata Dictionary from Scratch

Follow the instructions below to create a new metadata dictionary from scratch.

  1. In the Anzo console, expand the Onboard menu and click Data Dictionaries. Anzo displays the Dictionaries screen. For example:

  2. Click the Create button at the top of the screen, and select Manual. Anzo displays the Create Metadata Dictionary dialog box.

  3. Type a name for the dictionary in the Title field and supply an optional description in the Description field.
  4. Click Save to create the new dictionary. Anzo saves the dictionary and displays the empty Concepts tab. For example:

Create and edit concepts as needed. See Defining Concepts in a Metadata Dictionary below for information about working with concepts.

Defining Concepts in a Metadata Dictionary

This section provides examples and instructions for defining the concepts in a data dictionary.

Merging Concepts

It is common to encounter schemas in which the same concept appears under different names and its properties are spread across several tables in the data source. For example, the concept list below has a "Customers" class, a "CustomerDemographics" class, and a class called "CustomerCustomerDemo."

The three customer concepts share properties such as CustomerID and CustomerTypeID, which represent foreign key relationships across the tables/classes. Because the three classes all describe the same concept, customer-related data, they can be merged into a single concept. The merge creates one class in the model that contains all of the customer-related properties.

Note: Modifications that you make to a data dictionary do not change the source schema.

To merge concepts

  1. Select the checkbox next to each concept that you want to merge, and then click the Merge Concepts button on the right side of the screen. For example:

    Anzo displays the Merge Concepts dialog box, which lists the classes to merge and enables you to specify the title and description of the new, merged class. For example:

  2. In the Merge Concepts dialog box, if you want to name the merged class with one of the existing class names, select the checkbox next to that class. The Title field on the right is populated with that name, and you can edit it. If you do not want to use any of the existing titles, type a new title in the Title field.
  3. In the Definition field, type an optional description for the class. For example:

  4. Click Save to merge the concepts. Anzo displays a confirmation dialog box that lists the concepts that will be merged and asks if you want to proceed. Click OK to complete the merge.
  5. When the merge is complete, the concept list is displayed with the changes. You can select the merged class to view and modify concept details on the right side of the screen. For example, the image below shows the details for the merged "Customers" concept. The names of the concepts that were merged to "Customers" are listed in the Alternate field. Sources that include those labels, "CustomerDemographics" and "CustomerCustomerDemo," will be mapped to "Customers" in the model. You can edit the Alternate field to add other labels that might come from future source schemas.

  6. From the concept list you can select, edit, and remove classes or properties. For example, because the classes that were merged in the sample above were linked to each other by foreign keys, those foreign key properties can be removed from the merged class.
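
The effect of a merge can be sketched in a few lines of Python. This is a conceptual illustration only, not Anzo's merge logic: the ClassConcept type and merge_concepts function are hypothetical, and the CompanyName and CustomerDesc properties are made-up sample values. The sketch shows the two outcomes described above: the merged class holds the union of the properties, and the titles that were not chosen become alternate labels.

```python
from dataclasses import dataclass, field


@dataclass
class ClassConcept:
    """Hypothetical stand-in for a class concept in a dictionary."""
    title: str
    properties: set = field(default_factory=set)
    alternate: set = field(default_factory=set)


def merge_concepts(concepts, title):
    """Combine several class concepts into one new concept named `title`."""
    merged = ClassConcept(title=title)
    for concept in concepts:
        merged.properties |= concept.properties
        merged.alternate |= concept.alternate
        # Titles that were not chosen are kept as alternate labels.
        if concept.title != title:
            merged.alternate.add(concept.title)
    return merged


customers = ClassConcept("Customers", {"CustomerID", "CompanyName"})
demographics = ClassConcept("CustomerDemographics", {"CustomerTypeID", "CustomerDesc"})
link = ClassConcept("CustomerCustomerDemo", {"CustomerID", "CustomerTypeID"})

merged = merge_concepts([customers, demographics, link], title="Customers")
print(sorted(merged.alternate))
print(sorted(merged.properties))
```

The input concepts are left unchanged, which loosely parallels the note above that dictionary modifications do not change the source schema.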

Creating a New Concept

Follow the instructions below to create a new class or property concept in a data dictionary.

  1. To add a new concept, click the Create button on the right side of the screen. Anzo displays the Create New Concept screen.

  2. Under New Concept Type, select the radio button for the type of concept to create:
    • Data Property: A data property has an object that is a literal value. For example, a property like FirstName is a data property. Its object has a value such as "Jane."
    • Object Property: An object property relates one class to another class. Its object is an instance of a class rather than a literal value. These types of relationships are usually foreign keys in the source. For example, a property like CustomerID might relate the Customers class to the Orders class.
    • Class: A class concept groups a set of related properties. A class typically corresponds to a table in a source schema.
  3. Depending on the type of concept you are creating, specify the appropriate required and optional details:
    • Title: The name for this class or property concept.
    • Definition: An optional description for the new concept.
    • Alternate: An optional list of labels that should map to this new class or property concept.
    • Hidden: An optional list of labels that should be hidden in the data model that is generated from this dictionary.
    • Range: For property concepts, this required field specifies the data type for the property.
    • Class: For property concepts, this required field lists the class or classes the property belongs to.

    For example, the image below creates a data property for reviews of orders. The new property is named ReviewText and "Comment," "Comments," and "Review" are included as Alternate labels so that those properties in source schemas are mapped to ReviewText in the model when the data is onboarded.

  4. Click Save to add the new concept to the dictionary.
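
The concept fields described in step 3 can be summarized as a small data structure. The Python sketch below is an illustration under stated assumptions, not Anzo's data model: the Concept type, the kind values, and the missing_required helper are hypothetical, as are the "string" range and the "Orders" class chosen for the ReviewText example. Treating Title as required is also an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Concept:
    """Hypothetical record mirroring the fields on the Create New Concept screen."""
    kind: str                     # "class", "data_property", or "object_property"
    title: str
    definition: str = ""          # optional
    alternate: List[str] = field(default_factory=list)  # optional labels to map
    hidden: List[str] = field(default_factory=list)     # optional labels to hide
    range: str = ""               # required for property concepts
    classes: List[str] = field(default_factory=list)    # required for property concepts

    def missing_required(self):
        """Return the names of required fields that are still empty."""
        missing = []
        if not self.title:        # assumption: every concept needs a title
            missing.append("title")
        if self.kind != "class":
            if not self.range:
                missing.append("range")
            if not self.classes:
                missing.append("classes")
        return missing


review = Concept(
    kind="data_property",
    title="ReviewText",
    alternate=["Comment", "Comments", "Review"],
    range="string",               # assumed data type
    classes=["Orders"],           # assumed owning class
)
incomplete = Concept(kind="object_property", title="CustomerID")

print(review.missing_required())
print(incomplete.missing_required())
```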

Splitting a Concept

If you determine that one concept should be separated into multiple concepts, you can quickly split the concept and create an additional one by moving any of the original concept's elements to a new concept. Follow the instructions below to split a concept.

  1. In the list of concepts, select the row for the concept that you want to split, and then click the Split button in the Concept Details. Anzo displays the Split Concept screen, which lists the original concept on the left and the new concept on the right. For example:

  2. Under Split Concept, type a name for the new concept in the Title field.
  3. For the rest of the fields, you can drag elements from the Original Concept to the Split Concept. For example, the image below creates a new Delays class concept and moves the delay-related properties from the original concept to the new concept.

  4. When you are finished configuring the new concept, click Save. Anzo displays a confirmation dialog box that lists the concepts that will be split and asks if you want to proceed. Click OK to complete the split and return to the Concepts screen.
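
A split can be pictured as moving a chosen set of properties from the original concept into a new one. The Python sketch below illustrates only that idea and is not Anzo's implementation: the split_concept function, the "Flights" concept, and its property names are hypothetical, with "Delays" standing in for the new class concept mentioned in step 3.

```python
# Conceptual sketch only; not Anzo's implementation or API.
def split_concept(original, new_title, moved):
    """Move the properties in `moved` out of `original` into a new concept.

    Returns (remaining_original, new_concept) and leaves `original` unchanged.
    """
    moved = set(moved)
    unknown = moved - original["properties"]
    if unknown:
        raise ValueError(f"Not in original concept: {sorted(unknown)}")
    remaining = {
        "title": original["title"],
        "properties": original["properties"] - moved,
    }
    new_concept = {"title": new_title, "properties": moved}
    return remaining, new_concept


# Hypothetical original concept that mixes flight data with delay data.
flights = {
    "title": "Flights",
    "properties": {"FlightNumber", "Origin", "DepartureDelay", "ArrivalDelay"},
}

remaining, delays = split_concept(flights, "Delays", ["DepartureDelay", "ArrivalDelay"])
print(sorted(remaining["properties"]))
print(sorted(delays["properties"]))
```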

For instructions on onboarding data using a data dictionary, see Ingesting Data with a Data Dictionary.
