Running an Unstructured Pipeline

This page provides instructions for running an unstructured pipeline.

In the Anzo application, expand the Onboard menu and click Unstructured Data. Anzo displays the Pipeline screen, which lists any existing unstructured pipelines. For example:
Click the name of the pipeline that you want to run. Anzo displays the pipeline Overview screen. For example:
Click Run Pipeline to run the pipeline.

The process can take several minutes to complete. You can click the Progress tab to view details such as the pipeline status, runtime, number of documents processed, and errors. For example:

When the pipeline finishes, this run of the pipeline becomes the Default Edition. The Default Edition always contains the latest successfully published data for all of the jobs in the pipeline. If one or more of the jobs failed, those jobs are excluded from the Default Edition. If you publish the failed jobs at a later date or you create and publish additional jobs in the pipeline, the data from those jobs is also added to the Default Edition. For more information about editions, see Managing Pipeline Editions.

The new data set also becomes available in the Dataset catalog. From the catalog, you can generate graph data profiles and create graphmarts. See Blending Data for next steps.

Running an Unstructured Pipeline

Related Topics