Deploying an Anzo Unstructured Cluster

This topic provides instructions for deploying an Anzo Unstructured pipeline cluster. See Anzo Unstructured Requirements and Recommendations for information about cluster requirements.

  1. Deploy the Leader Node
  2. Deploy the Worker Nodes
  3. Anzo Unstructured Cluster Configuration

Deploy the Leader Node

Follow the instructions below to deploy the Anzo Unstructured leader node.

  1. Copy the Anzo DU installation script to the leader host server and then run the following command to make the script executable:
    chmod +x script_name
  2. Run the following command to start the installation wizard:
    ./script_name

    The script unpacks the JRE and then waits for input before starting the installation.

  3. Press Enter to start the installation.
  4. Review the software license agreement. Press Enter to scroll through the terms. At the end of the agreement, type 1 to accept the terms or type 2 to disagree and stop the installation.
  5. At the prompt that asks which components to install, type 1 (Leader) and then press Enter.
  6. Specify the directory to install Anzo DU. Press Enter to accept the default installation path or type an alternate path and then press Enter.
  7. The wizard prompts for the IP address of this leader instance. The wizard defaults to the IP address of the server. Press Enter to accept the default value. If necessary, type a different IP address, and then press Enter.
  8. The wizard prompts for any additional leader node IP addresses. Typically there is one leader node, so this value is the same IP address that you specified in the previous step. If you set up additional leader nodes for redundancy, however, enter a comma-separated list of the alternate nodes' IP addresses. Otherwise, accept the default value and press Enter.
  9. Specify the maximum amount of memory (in MB) that this leader instance can use. The install wizard lists the total RAM available and uses half of that total as the default value. Adjust the value as needed or accept the default value and then press Enter.

    The wizard proceeds to install Anzo DU according to the values that you specified.

  10. When the installation is complete, run the following command to start the leader instance:
    install_path/AnzoDU_root_dir/leader start

    For example:

    /opt/AnzoDU/leader start
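
As a rough cross-check of the memory default described in step 9, the following sketch computes half of a host's total RAM in MB on a Linux server. It assumes that /proc/meminfo is available. The function name half_ram_mb is illustrative and is not part of Anzo DU, and the wizard's own calculation may round differently.

```shell
# Illustrative helper (not part of Anzo DU): convert a total in kB to half of it in MB.
half_ram_mb() {
  # $1: total memory in kB, as reported on the MemTotal line of /proc/meminfo
  echo $(( $1 / 1024 / 2 ))
}

# Only read /proc/meminfo where it exists (Linux hosts).
if [ -r /proc/meminfo ]; then
  total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  echo "Half of total RAM: $(half_ram_mb "$total_kb") MB"
fi
```

Comparing this figure with the default that the wizard offers can help you decide whether to lower the value on hosts that run other services.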

Deploy the Worker Nodes

Follow the instructions below to deploy the Anzo Distributed Unstructured worker nodes.

  1. Make sure that the worker host servers have access to the Anzo shared file system and meet the requirements in Anzo Unstructured Cluster Requirements and Recommendations.
  2. Copy the Anzo DU installation script to each of the worker host servers and then run the following command to make the script executable:
    chmod +x script_name
  3. Run the following command to start the installation wizard:
    ./script_name

    The script unpacks the JRE and then waits for input before starting the installation.

  4. Press Enter to start the installation.
  5. Review the software license agreement. Press Enter to scroll through the terms. At the end of the agreement, type 1 to accept the terms or type 2 to disagree and stop the installation.
  6. At the prompt that asks which components to install, type 2 (Worker) and then press Enter.
  7. Specify the directory to install Anzo DU. Press Enter to accept the default installation path or type an alternate path and then press Enter.
  8. The wizard prompts for the IP address to use for this worker node. The wizard defaults to the IP address of the server. Press Enter to accept the default value. If necessary, type a different IP address, and then press Enter.
  9. The wizard prompts you to specify the maximum number of service instances for this worker node. Each service instance processes one unstructured document at a time. The default value is 2 instances. Press Enter to accept the default or specify another value and then press Enter.
  10. Specify the port to use for this worker. The wizard defaults to port 2552. Press Enter to accept the default value or type a different port and then press Enter.
  11. The wizard prompts you to enter the IP address of the leader node. Specify the IP address for the leader instance that you deployed in the procedure above. If you deployed multiple leader nodes, specify each leader's IP address in a comma-separated list.
  12. Specify the maximum amount of memory (in MB) that this worker instance can use. The install wizard lists the total RAM available and uses half of that total as the default value. Adjust the value as needed or accept the default value and then press Enter.

    The wizard proceeds to install Anzo DU according to the values that you specified.

  13. When the installation is complete, run the following command to start the worker instance:
    install_path/AnzoDU_root_dir/worker start

    For example:

    /opt/AnzoDU/worker start
  14. Repeat the steps above for each worker instance in the cluster.
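
When a cluster has several workers, the copy and chmod steps can be staged for every host in one pass. The sketch below only prints the commands so that they can be reviewed first. It assumes SSH access to each worker. The host names and the function name stage_commands are hypothetical, and script_name stands for the installer file name as in the steps above. The installation wizard itself is interactive and still needs to be run on each host.

```shell
# Hypothetical values: replace with your installer file name and worker hosts.
installer="script_name"
workers="worker1.example.com worker2.example.com"

# Print (rather than run) the copy and chmod commands for one worker host.
# $1: installer file name, $2: worker host
stage_commands() {
  printf 'scp %s %s:~/\n' "$1" "$2"
  printf 'ssh %s chmod +x %s\n' "$2" "$1"
}

# Emit the two commands for every worker in the list.
for host in $workers; do
  stage_commands "$installer" "$host"
done
```

Printing the commands instead of running them keeps the sketch safe to try; once the output looks right, the commands can be run one host at a time.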

Anzo Unstructured Cluster Configuration

You do not need to perform additional configuration after the initial deployment of an Anzo Unstructured cluster. To review the configuration that Anzo creates based on the values specified during the installation, view the Distributed Pipeline options in Server Settings in the Administration menu. For more information, see Changing the Distributed Pipeline Configuration.

Important: Whenever the Anzo Unstructured leader instance is restarted, you must also restart the following two services in Anzo:

  • Anzo Server Akka Cluster Integration
  • Anzo Unstructured Distributed

To restart a service:

  1. Log in to the Anzo console, expand the Administration menu, and click Advanced Configuration.
  2. On the Advanced Configuration screen, click the I understand and accept the risk link to view the Anzo bundles.
  3. In the Search field at the top of the screen, start typing the name of the service that you want to restart. When the service appears in the list onscreen, click the service name to view the details.
  4. At the top of the screen, click Stop Bundle. Then click Start Bundle when the start option becomes available.