Creating a Schema from an SQL Query

Follow the instructions below to create a schema by writing an SQL query that defines the data to onboard.

You can import or create up to 5 schemas per database data source. To include more than 5 schemas, create another data source for the additional schemas.

For better ETL pipeline performance, it is beneficial to include joins and/or filters in schema queries rather than configuring those operations at the mapping level. For more information, see Performance Considerations for Database ETL Pipelines.

  1. In the Anzo application, expand the Onboard menu and click Structured Data. Anzo displays the Data Sources screen, which lists any existing sources. For example:

  2. Click the database for which you want to create a schema. Anzo displays the Tables tab for the source. For example:

  3. Click the Create Schemas From Query button. Anzo displays the Create Schemas dialog box:

  4. In the Create Schemas dialog box, specify a name for this schema in the Schema Name field.
  5. In the Table Name field, specify a name for the table in the schema that the query will create.
  6. Type the SQL statement in the text box. The statement can include any functionality that the source database supports. Anzo does not validate the SQL.

    If the SQL query requires quotes around values, such as ‘2010-01-01’ or ‘TestValue’, make sure that you use single quotes ('). For example:

    SELECT * FROM Movies WHERE production_day='2021-08-01'

    Including double quotes (") in a schema query results in an error when the query is run.

    The following example creates a schema named employees. A table named all_employees will be created in the schema, and the table will be created from the SQL query:

    SELECT EmployeeID, FirstName, LastName, Title, Salary, BirthDate, HireDate, Region, Country
    FROM northwind.Employees
    WHERE EmployeeID

  7. Click Save to save the query. Anzo creates the new schema and adds it to the list of schemas on the Tables screen. For example:

    You can expand the schema to view its tables. Selecting a row in the schema displays the sample data on the right side of the screen. For example:

  8. If you want to create additional tables in the schema, follow these steps:
    1. Click the menu icon () in the Actions column for the schema name and select Add Table.

      The Create New Table dialog box is displayed.

    2. In the Create New Table dialog box, specify a name for the new table in the Table Name field.
    3. In the Schema Query field, write the SQL query that defines the data for the new table.
    4. Click Save to add the table to the schema and return to the Tables screen.

When you are finished creating schemas, the source data can be onboarded to Anzo. If you use the Spark ETL pipeline workflow, see Ingesting Data Sources via ETL Pipelines. If you do not use Spark, see Directly Loading Data Sources via Graphmarts for next steps.