Adding a Task that Runs an Unstructured Pipeline
Follow the instructions below to add a task that runs an unstructured pipeline.
- In the Administration application, expand the Tools menu and click Workflow Manager. Anzo displays the Workflows screen, which lists any existing workflows.
- Expand the workflow that you want to add a task to.
- Click Add Task. The Create Task dialog box is displayed.
- Configure the task by completing the following fields as needed:
  - Task Type: The drop-down list at the top of the dialog box specifies the type of task to create. Distributed Unstructured Pipeline Load Service is selected by default. Accept the default value.
  - Load Service Name: This field specifies the name for the task.
  - Target Unstructured Pipeline: This field specifies the unstructured pipeline that this task should run. Click the drop-down list and select the desired pipeline.
  - Keep Last N-Datasets: This field specifies the number of file-based linked data sets (FLDS) from this pipeline to retain on disk before deleting the oldest ones.
  - Load Threshold: This field specifies the percentage of the pipeline that must complete successfully for the ingestion to be considered a success.
  - Distributed Unstructured Pipeline Stop Timeout: This field specifies the number of milliseconds to wait for an unstructured pipeline to stop.
  - Distributed Unstructured Pipeline Percent Timeout: This field specifies the number of milliseconds to wait before timing out if there is no change in the percentage of documents processed.
  - Index: This field specifies a numeric value that represents the order in which this task should run in the workflow.
- Click Create to add the task to the workflow. For example, the image below shows a workflow with one task.
You can repeat this process to add tasks that run additional unstructured pipelines.
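
When planning several tasks, it can help to write down the intended field values before entering them in the Create Task dialog box. The Python sketch below is one way to do that outside of Anzo. It is not an Anzo API: the class name, attribute names, and helper functions are hypothetical, and the validation ranges (a threshold between 0 and 100, non-negative counts and timeouts) are assumptions based only on the field descriptions above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UnstructuredPipelineTask:
    """Hypothetical planning record that mirrors the Create Task fields."""
    load_service_name: str       # Load Service Name
    target_pipeline: str         # Target Unstructured Pipeline
    keep_last_n_datasets: int    # Keep Last N-Datasets
    load_threshold: float        # Load Threshold, a percentage
    stop_timeout_ms: int         # Stop Timeout, in milliseconds
    percent_timeout_ms: int      # Percent Timeout, in milliseconds
    index: int                   # Index, the run order within the workflow

    def problems(self) -> List[str]:
        """Return readable issues; an empty list means the values look plausible."""
        issues = []
        if not self.load_service_name.strip():
            issues.append("Load Service Name is empty")
        if not self.target_pipeline.strip():
            issues.append("Target Unstructured Pipeline is not selected")
        if self.keep_last_n_datasets < 0:  # assumed lower bound
            issues.append("Keep Last N-Datasets is negative")
        if not 0 <= self.load_threshold <= 100:  # assumed percentage range
            issues.append("Load Threshold is outside 0-100")
        if self.stop_timeout_ms < 0:
            issues.append("Stop Timeout is negative")
        if self.percent_timeout_ms < 0:
            issues.append("Percent Timeout is negative")
        return issues


def run_order(tasks: List[UnstructuredPipelineTask]) -> List[str]:
    """List task names in ascending Index order, the order described for the workflow."""
    return [t.load_service_name for t in sorted(tasks, key=lambda t: t.index)]
```

For example, two planned tasks with illustrative values (none of these numbers are Anzo defaults) could be checked and ordered like this:

```python
tasks = [
    UnstructuredPipelineTask("load-b", "Pipeline B", 3, 90.0, 60000, 300000, 2),
    UnstructuredPipelineTask("load-a", "Pipeline A", 3, 90.0, 60000, 300000, 1),
]
for task in tasks:
    print(task.load_service_name, task.problems())
print(run_order(tasks))
```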