Connecting to an ETL Engine

The default Anzo installation includes a pre-configured local Spark ETL engine and the Sparkler ETL compiler. Sparkler is Cambridge Semantics' SPARQL-driven ETL compiler. It supports ingesting wide CSV files (files with a large number of columns) and, in many cases, performs better than the Spark Scala-based compiler. The topics in this section provide instructions for changing the configuration of the local engines or connecting to an alternate Spark ETL engine or Sparkler compiler.

Note: Before you can configure Informatica or Pentaho ETL engines, you must import the appropriate Cambridge Semantics bundle. Contact Cambridge Semantics Support to obtain the bundles.

  1. In the Anzo console, expand the Administration menu and click ETL Engine Config.
  2. On the ETL Engine Config screen, click the Create button and select the type of ETL engine that you want to configure. Anzo displays the Create ETL Engine Config screen.

  3. On the Create ETL Engine Config screen, type a Title and an optional Description for the engine. Then click Save. Anzo displays the Details view for the new engine.

  4. The options that appear depend on the type of ETL engine you chose. Enter the required details for the engine. To modify an option, hover the pointer over it and click the Edit icon. Click the check mark icon to save changes to an option, or click the X icon to clear the value for an option.


When the configuration is complete, Anzo offers this ETL engine as an option when you ingest data and configure pipelines.