Introduction to Editions

Editions are collections of the Data Components that are published by a given Pipeline. Users can assemble Editions that include any subset of the jobs from a Pipeline and any version of a job's output. This topic introduces the concepts that are helpful to know when working with Editions.

What is a Data Component?

A Data Component is the data that is generated by one successful run of a job in a Pipeline. Each time a job runs to completion, a new Data Component is created that contains the version of the data generated by that run. If a job runs successfully five times, there are five Data Components. Anzo preserves each version of the data that is output by each job.

For example, the image below shows a list of the jobs in an Edition. The right side of the screen shows that the selected job has been successfully published three times:

The Data Component from the most recent run of the selected job is automatically included in the Managed Edition (see What is the Managed Edition?), and any of the three Data Components could be added to a Saved Edition (see What is a Saved Edition?).
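The relationship between job runs and Data Components can be pictured with a small conceptual model. This is only an illustrative sketch, not Anzo's API: the class names DataComponent and Job, the run method, and the job name load_orders are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataComponent:
    """Hypothetical stand-in for the data output by one successful run."""
    job: str
    version: int  # 1 for the first successful run, 2 for the second, ...


@dataclass
class Job:
    """Hypothetical model of a pipeline job and its preserved outputs."""
    name: str
    components: list = field(default_factory=list)

    def run(self, succeeded: bool):
        # Only a run that completes successfully yields a Data Component.
        if not succeeded:
            return None
        component = DataComponent(self.name, len(self.components) + 1)
        self.components.append(component)  # every version is preserved
        return component


job = Job("load_orders")
for outcome in (True, False, True, True):  # four runs, one of them failed
    job.run(outcome)

print(len(job.components))  # 3
```

In this model, the failed run leaves no trace in the list of Data Components, while each of the three successful runs remains available as its own version.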

What is the Managed Edition?

When a Pipeline is published, the result of the most recent run becomes the Managed Edition. This Edition is managed by Anzo and always contains the most recent successfully published Data Components for all of the jobs in the Pipeline. If one or more of the jobs fail, those jobs are excluded from the Managed Edition. If the failed jobs are published later or additional jobs are created and published, the data that results from those jobs gets added to the Managed Edition.
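The selection rule described above can be sketched as a function that picks, for each job, the most recent successfully published version. This is a conceptual illustration under stated assumptions, not how Anzo implements it: the function managed_edition, the history structure, and the job names are hypothetical. The example only covers jobs that have never succeeded; how Anzo treats a job whose latest run failed after an earlier success is not stated here, and the sketch simply keeps the earlier version in that case.

```python
def managed_edition(history):
    """Return {job name: latest successful run number}.

    `history` is a hypothetical structure mapping each job name to a list of
    (run_number, succeeded) tuples in the order the runs happened.
    """
    edition = {}
    for job, runs in history.items():
        successful = [run_number for run_number, ok in runs if ok]
        if successful:  # jobs with no successful run are excluded
            edition[job] = successful[-1]
    return edition


history = {
    "load_orders": [(1, True), (2, True)],
    "load_customers": [(1, False)],  # failed, so excluded for now
}
first = managed_edition(history)
print(first)  # {'load_orders': 2}

# The failed job is published successfully later and joins the edition.
history["load_customers"].append((2, True))
second = managed_edition(history)
print(second)  # {'load_orders': 2, 'load_customers': 2}
```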

For example, the image below shows the Managed Edition for a Dataset. Editions are viewed from a Dataset's Overview tab. The same view is available on the Overview tab for the Pipeline.

Note that the Title of the Managed Edition in the image is Default Edition. The Title of your Managed Edition may vary depending on how the Edition was created: by publishing a new structured pipeline (as in the example), from an unstructured pipeline, or by an Anzo upgrade in which the Dataset from the previous Anzo version was converted to an Edition in the new version. The Title of an Edition that was converted during an upgrade has the form <dataset_name> working edition.

The Managed Edition cannot be changed, but it can be cloned (via the Actions menu) and saved as a Saved Edition. Saved Editions can be modified. See What is a Saved Edition? below.

What is a Saved Edition?

A Saved Edition is a user-assembled collection of Data Components from a Pipeline. A Saved Edition can contain any combination of jobs and any version of a job's Data Components. Saved Editions can be created from scratch or can be cloned from the Managed Edition or another Saved Edition.
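As a rough mental model, a Saved Edition can be thought of as an editable copy that maps each chosen job to one of its available versions. The sketch below is hypothetical throughout (the names clone_edition, set_component, available, and the job names are assumptions for illustration) and does not represent Anzo's interface, where these steps are performed from the Actions menu.

```python
def clone_edition(edition):
    """Return an independent, editable copy of an edition."""
    return dict(edition)


def set_component(edition, available, job, version):
    """Point `job` at one of its existing Data Component versions."""
    if version not in available.get(job, []):
        raise ValueError(f"{job} has no Data Component version {version}")
    edition[job] = version


# Hypothetical state: versions preserved per job, and the Managed Edition.
available = {"load_orders": [1, 2, 3], "load_customers": [1]}
managed = {"load_orders": 3, "load_customers": 1}

saved = clone_edition(managed)                     # clone the Managed Edition
set_component(saved, available, "load_orders", 1)  # pick an older version
del saved["load_customers"]                        # keep only a subset of jobs

print(saved)    # {'load_orders': 1}
print(managed)  # {'load_orders': 3, 'load_customers': 1}
```

Copying before editing mirrors the rule that the Managed Edition itself cannot be changed: only the clone is modified.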

The Managed Edition or any Saved Edition can be added to a Graphmart for analysis.
