Ingesting Data Sources via ETL Pipelines

The topics in this section explain how to ingest data from structured data sources using the Ingest ETL pipeline process. When you ingest a data source for the first time, the Ingest workflow automatically generates a Model, Mappings, and an ETL Pipeline. If the schema later changes and these pipeline components need to be updated, you can configure subsequent Ingest workflows either to reuse and update the existing components or to regenerate them.

If the source data is updated but the schema does not change, or if the model or mappings are modified in a way that does not affect the schema, you do not need to re-ingest the source with the Ingest workflow. Instead, republish the pipeline or just the affected jobs. See Publishing a Pipeline or Subset of Jobs for more information.
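
The choice between re-ingesting and republishing reduces to a simple rule: a schema change calls for a new Ingest workflow, while data-only or model/mapping changes call for a republish. The following Python sketch restates that rule for reference; the function name and parameters are illustrative only and are not part of the product's interface.

    # Illustrative sketch only: this function and its parameters are
    # hypothetical and do not correspond to any product API. It simply
    # restates the decision rule described above.
    def choose_update_action(schema_changed: bool,
                             data_updated: bool,
                             model_or_mappings_modified: bool) -> str:
        """Decide how to propagate a change to an already-ingested source."""
        if schema_changed:
            # A schema change requires a new Ingest workflow so the Model,
            # Mappings, and ETL Pipeline can be updated or regenerated.
            return "re-ingest: run the Ingest workflow again"
        if data_updated or model_or_mappings_modified:
            # No schema change: republishing is sufficient.
            return "republish the pipeline or the affected jobs"
        return "no action needed"

    # Example: updated data with an unchanged schema -> republish.
    print(choose_update_action(schema_changed=False,
                               data_updated=True,
                               model_or_mappings_modified=False))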

How you configure the Ingest workflow depends on whether you are ingesting a data source for the first time, re-ingesting a data source because its schema changed, or ingesting a source that has an associated Metadata Dictionary. Select the appropriate instructions below for guidance on configuring the initial Ingest workflow, a subsequent workflow, or a workflow with a Metadata Dictionary: