Adding a Task that Runs an Unstructured Pipeline

Follow the instructions below to add a task that runs an unstructured pipeline.

  1. In the Administration application, expand the Tools menu and click Workflow Manager. Anzo displays the Workflows screen, which lists any existing workflows.

  2. Expand the workflow that you want to add a task to.

  3. Click Add Task. The Create Task dialog box is displayed.

  4. Configure the task by completing the following fields as needed:
    • Task Type: The drop-down list at the top of the dialog box specifies the type of task to create. Distributed Unstructured Pipeline Load Service is selected by default. Accept the default value.
    • Load Service Name: This field specifies the name for the task.
    • Target Unstructured Pipeline: This field specifies the unstructured pipeline that this task should run. Click the drop-down list and select the desired pipeline.
    • Keep Last N-Datasets: This field specifies the number of file-based linked data sets (FLDS) from this pipeline to retain on disk. When that number is exceeded, the oldest data sets are deleted.
    • Load Threshold: This field specifies the percentage of the pipeline that must complete successfully for the ingestion to be considered a success.
    • Distributed Unstructured Pipeline Stop Timeout: This field specifies the number of milliseconds to wait for an unstructured pipeline to stop.
    • Distributed Unstructured Pipeline Percent Timeout: This field specifies the number of milliseconds to wait before timing out if there is no change in the percentage of documents processed.
    • Index: This field specifies a numeric value that represents the order in which this task should run in the workflow.
  5. Click Create to add the task to the workflow.

You can repeat this process to add tasks that run additional unstructured pipelines.
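
Before you enter values in the Create Task dialog box, it can help to write them down and check them for obvious mistakes, such as a timeout entered in seconds rather than milliseconds. The Python sketch below is only an illustration of the fields described in step 4. It does not call Anzo or any Anzo API. The class name `PipelineTaskSettings`, the helper functions, and the sample values are all hypothetical. The sketch also assumes that the Load Threshold comparison is inclusive and that lower Index values run first; neither detail is stated above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PipelineTaskSettings:
    """Hypothetical record of the Create Task fields (not an Anzo API)."""
    load_service_name: str          # Load Service Name
    target_pipeline: str            # Target Unstructured Pipeline
    keep_last_n_datasets: int       # Keep Last N-Datasets
    load_threshold_percent: float   # Load Threshold, as a percentage
    stop_timeout_ms: int            # Stop Timeout, in milliseconds
    percent_timeout_ms: int         # Percent Timeout, in milliseconds
    index: int                      # Index (position in the workflow)

    def problems(self) -> List[str]:
        """Return a list of problems found; an empty list means none."""
        found = []
        if not self.load_service_name.strip():
            found.append("Load Service Name is empty")
        if not self.target_pipeline.strip():
            found.append("Target Unstructured Pipeline is not selected")
        if self.keep_last_n_datasets < 0:
            found.append("Keep Last N-Datasets must not be negative")
        if not 0 <= self.load_threshold_percent <= 100:
            found.append("Load Threshold must be between 0 and 100")
        if self.stop_timeout_ms < 0:
            found.append("Stop Timeout must not be negative")
        if self.percent_timeout_ms < 0:
            found.append("Percent Timeout must not be negative")
        return found


def meets_load_threshold(settings: PipelineTaskSettings,
                         succeeded: int, total: int) -> bool:
    """Assumed reading of Load Threshold: success share >= threshold."""
    if total <= 0:
        return False
    return 100.0 * succeeded / total >= settings.load_threshold_percent


def run_order(tasks: List[PipelineTaskSettings]) -> List[PipelineTaskSettings]:
    """Assumed reading of Index: tasks run in ascending Index order."""
    return sorted(tasks, key=lambda task: task.index)


if __name__ == "__main__":
    # Sample values are placeholders, not recommended settings.
    task = PipelineTaskSettings(
        load_service_name="load-contracts",
        target_pipeline="Contracts Pipeline",
        keep_last_n_datasets=3,
        load_threshold_percent=90.0,
        stop_timeout_ms=60 * 1000,        # 1 minute expressed in milliseconds
        percent_timeout_ms=5 * 60 * 1000, # 5 minutes expressed in milliseconds
        index=1,
    )
    print(task.problems() or "no problems found")
```

Writing the timeouts as `60 * 1000` rather than a bare number keeps the millisecond unit visible, which is the detail most easily mistyped in the dialog box.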
