Importing Data from Databases

This topic provides instructions for connecting to a structured data source, such as a Microsoft, Oracle, Hadoop, Teradata, PostgreSQL, or Google database, and importing a predefined schema or writing a schema query that dynamically defines the data to onboard.

For information about onboarding data from databases incrementally, see Onboarding Data from a Database Incrementally.

Connecting to a Database

  1. In the Anzo console, expand the Onboard menu and click Structured Data. Anzo displays the Data Sources screen, which lists any existing data sources.

  2. Click the Create button and select Database Data Source. Anzo opens the Create Database Data Source screen.

  3. At the top of the screen, type a Title for the source.
  4. Type an optional Description for the source.
  5. Click the Type field and select the database type from the drop-down list. Depending on the type you choose, Anzo displays additional fields to complete.
  6. Enter any additional details and the credentials that are required for making the source connection. The options that appear depend on the type of database that you chose:
    • User: Type the user name used to log in to the database.
    • Password and Password Repeat: Type the password for the user in both fields.
    • Server: Type the server name or IP address for the source. Include the port if necessary.
    • Database: If necessary, type the partition that contains the data.
    • Extended Properties: For Hadoop Hive or Impala databases, enter the extended attributes that you use.
  7. Click Save to save the data source connection. Anzo tests the connectivity and displays the Overview screen. If the connection fails, adjust the data source details as needed.
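If the connection test in step 7 fails, it can help to rule out a basic network problem before changing credentials. The following is a minimal Python sketch, not part of Anzo: it assumes the Server value is written as `host` or `host:port` (IPv6 literals are not handled), and the helper names `split_server` and `can_reach` are hypothetical. It only checks that a TCP connection can be opened from the machine where the script runs; it does not verify the user name or password.

```python
import socket


def split_server(server, default_port):
    """Split a 'host' or 'host:port' string into (host, port).

    Assumption: the value has at most one colon (no IPv6 literals).
    """
    server = server.strip()
    host, sep, port = server.rpartition(":")
    if not sep:
        # No colon present: the whole string is the host name.
        return server, default_port
    return host, int(port)


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


# Example usage (hypothetical host name, PostgreSQL default port):
# host, port = split_server("db.example.com:5432", 5432)
# print(can_reach(host, port))
```

A result of False from `can_reach` suggests a wrong server name, a wrong port, or a firewall between the two machines. Because the script runs on your machine rather than on the Anzo server, a result of True does not guarantee that Anzo itself can reach the database.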

After connecting to the data source, the next step is to define the schema that Anzo will use to import the data and determine the data's structure. To import a schema that is defined in the database, follow the instructions in Importing a Predefined Schema below. For instructions on writing a query to dynamically define the schema, see Creating a Schema from an SQL Query.

Importing a Predefined Schema

Follow the steps below to import a predefined schema from the database to Anzo. For instructions on writing a schema query, see Creating a Schema from an SQL Query below.

  1. From the Overview screen, click the Schema tab. Anzo displays the Schema screen.

  2. Click the Import Schemas button. Anzo displays the Import Schemas dialog box.

    If you do not see a schema that you expect to see, make sure that you have the necessary access to the data source.

  3. Select the checkbox next to each schema that you want to import, and then click OK. Anzo imports the selected schemas and adds them to the list of schemas on the screen.

Once the schema or schemas are imported, the source data in the database can be onboarded to Anzo. For instructions on onboarding the data by letting Anzo automatically generate the mapping, model, and ETL pipeline, see Auto-Ingesting Imported Data. For information about manually creating mappings, models, and pipelines, see Working with Mappings, Modeling Data, and Working with Pipelines.

Creating a Schema from an SQL Query

Follow the instructions below to create a schema by writing an SQL query that defines the data to onboard.

  1. From the Overview screen, click the Schema tab. Anzo displays the Schema screen.

  2. Click the Create Schemas From Query button. Anzo displays the Create Schemas dialog box.

  3. In the Create Schemas dialog box, specify a name for the schema in the Schema Name field.
  4. In the Table Name field, specify a name for this schema table.
  5. Type the SQL statement in the text box. The statement can include any functionality that the source database supports. Anzo does not validate the SQL. For information about writing a schema query that onboards data from a database incrementally, see Onboarding Data from a Database Incrementally.
  6. Click Save to save the query. Anzo creates the new schema and adds it to the list of schemas on the screen.
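Because Anzo does not validate the SQL, it may be worth running the statement against the source database yourself before pasting it into the Create Schemas dialog box. The sketch below shows one way to do that in Python. It uses the standard-library `sqlite3` module with an in-memory database purely as a stand-in; in practice you would open the connection with the driver for your own database. The `orders` table, its columns, and the `preview_query` helper are hypothetical and exist only for illustration.

```python
import sqlite3


def preview_query(conn, sql, limit=5):
    """Run sql and return (column_names, first `limit` rows).

    Raises the driver's error if the statement is invalid, which is
    the check that Anzo does not perform.
    """
    cur = conn.cursor()
    cur.execute(sql)
    columns = [d[0] for d in cur.description]
    rows = cur.fetchmany(limit)
    return columns, rows


# Stand-in source database with a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (
        order_id INTEGER, customer TEXT, amount REAL, order_date TEXT
    );
    INSERT INTO orders VALUES (1, 'Acme', 120.0, '2021-01-05');
    INSERT INTO orders VALUES (2, 'Globex', 80.5, '2021-02-11');
    """
)

# The statement you intend to type into the schema query text box.
schema_query = (
    "SELECT order_id, customer, amount "
    "FROM orders WHERE order_date >= '2021-02-01'"
)

columns, rows = preview_query(conn, schema_query)
print(columns)
print(rows)
```

The returned column names show what the resulting schema table would contain, and the sample rows confirm that the filter selects the data you expect.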

Once the schema is created, the source data in the database can be onboarded to Anzo. For instructions on onboarding the data by letting Anzo automatically generate the mapping, model, and ETL pipeline, see Auto-Ingesting Imported Data. For information about manually creating mappings, models, and pipelines, see Working with Mappings, Modeling Data, and Working with Pipelines.
