Deploying an Anzo Unstructured Cluster

This topic provides instructions for deploying an Anzo Unstructured (AU) cluster. See Anzo Unstructured Requirements and Recommendations for information about cluster requirements.

Important Since the Anzo Unstructured cluster will access the shared file store, it is important to install and run the software with the same service account that runs Anzo. For more information, see Anzo Service User Account Requirements.

  1. Deploy the Leader Node
  2. Deploy the Worker Nodes
  3. Anzo Unstructured Cluster Configuration

AU Cluster Upgrade Notes

The steps to upgrade the Anzo Unstructured software are the same as the deployment instructions below. When you update an existing installation, each prompt defaults to the value from the current deployment, so you can press Enter through the prompts to retain the existing settings. The last step, however, asks whether to overwrite modified files in the <AnzoDU_install_dir>/etc directory. Cambridge Semantics recommends that you choose ya (Yes To All) to overwrite all files in that directory so that important options from the version you are upgrading to are deployed to your environment. If you have customized files in the etc directory, back up the directory before starting the upgrade so that you can refer to the backup when customizing the new version.
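The backup recommended above can be scripted. The following is a minimal sketch, not part of the Anzo DU installer: backup_etc is a hypothetical helper name, and the timestamped etc.backup.* naming is an assumption chosen for illustration. It copies the etc directory next to the original and prints the backup location.

```shell
#!/bin/sh
# Hypothetical helper: copy <install_dir>/etc to a timestamped backup
# directory before running the upgrade. The naming scheme is an assumption.
backup_etc() {
    install_dir=$1
    stamp=$(date +%Y%m%d%H%M%S)
    backup_dir="${install_dir}/etc.backup.${stamp}"

    if [ ! -d "${install_dir}/etc" ]; then
        echo "No etc directory found under ${install_dir}" >&2
        return 1
    fi

    # -R copies recursively; -p preserves modes and timestamps.
    cp -Rp "${install_dir}/etc" "${backup_dir}" || return 1

    # Print the backup location so that the caller can record it.
    echo "${backup_dir}"
}
```

For example, backup_etc /opt/AnzoDU would back up the etc directory of an installation in /opt/AnzoDU. After the upgrade, running diff -r on the backup and the new etc directory lists the customizations that may need to be reapplied.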

Deploy the Leader Node

Follow the instructions below to deploy the Anzo Unstructured leader node.

Important Complete the steps below as the Anzo service user.
  1. Copy the Anzo DU installation script to the leader host server and then run the following command to make the script executable:
    chmod +x script_name
  2. Run the following command to start the installation wizard:
    ./script_name

    The script unpacks the JRE and then waits for input before starting the installation.

  3. Press Enter to start the installation.
  4. Review the software license agreement. Press Enter to scroll through the terms. At the end of the agreement, type 1 to accept the terms or type 2 to disagree and stop the installation.
  5. At the prompt that asks which components to install, type 1 (Leader) and then press Enter.
  6. Specify the directory in which to install Anzo DU. Press Enter to accept the default installation path or type an alternate path and then press Enter.
  7. The wizard prompts for the IP address of this leader instance. The wizard defaults to the IP address of the server. Press Enter to accept the default value. If necessary, type a different IP address, and then press Enter.
  8. The wizard prompts for any additional leader node IP addresses. Typically there is one leader node, and this value is the same IP address that you specified in the previous step. If you set up additional leader nodes for redundancy, however, enter a comma-separated list of the alternate nodes' IP addresses. Otherwise, accept the default value and press Enter.
  9. Specify the maximum amount of memory (in MB) that this leader instance can use. The install wizard lists the total RAM available and chooses half of the total memory as the default value. Adjust the value as needed or accept the default value and then press Enter.

    The wizard proceeds to install Anzo DU according to the values that you specified.

  10. When the installation is complete, run the following command to start the leader instance:
    install_path/AnzoDU_root_dir/leader start

    For example:

    /opt/AnzoDU/leader start
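To sanity-check the default that the wizard offers at the memory prompt in step 9, the value can be approximated by hand. The sketch below assumes a Linux host where /proc/meminfo is available. The function names are hypothetical, and the wizard's own rounding may differ slightly from this integer arithmetic.

```shell
#!/bin/sh
# Hypothetical helpers for estimating the wizard's default memory value.

# Read total physical memory in kB from /proc/meminfo (Linux only).
total_mem_kb() {
    awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}

# Convert a kB value to MB and halve it, using integer division.
half_mem_mb() {
    echo $(( $1 / 1024 / 2 ))
}
```

Running half_mem_mb "$(total_mem_kb)" on the host prints roughly the value that the wizard should propose. The same check applies to the memory prompt in the worker installation.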

Deploy the Worker Nodes

Follow the instructions below to deploy the Anzo Unstructured worker nodes.

Important Complete the steps below as the Anzo service user.
  1. Make sure that the worker host servers have access to the Anzo shared file system and meet the requirements in Anzo Unstructured Cluster Requirements and Recommendations.
  2. Copy the Anzo DU installation script to each of the worker host servers and then run the following command to make the script executable:
    chmod +x script_name
  3. Run the following command to start the installation wizard:
    ./script_name

    The script unpacks the JRE and then waits for input before starting the installation.

  4. Press Enter to start the installation.
  5. Review the software license agreement. Press Enter to scroll through the terms. At the end of the agreement, type 1 to accept the terms or type 2 to disagree and stop the installation.
  6. At the prompt that asks which components to install, type 2 (Worker) and then press Enter.
  7. Specify the directory in which to install Anzo DU. Press Enter to accept the default installation path or type an alternate path and then press Enter.
  8. The wizard prompts for the IP address to use for this worker node. The wizard defaults to the IP address of the server. Press Enter to accept the default value. If necessary, type a different IP address, and then press Enter.
  9. The wizard prompts you to specify the maximum number of service instances for this worker node. Each service instance processes one unstructured document at a time. The default value is 2 instances. Press Enter to accept the default or specify another value and then press Enter.
  10. Specify the port to use for this worker. The wizard defaults to port 2552. Press Enter to accept the default value or type a different port and then press Enter.
  11. The wizard prompts you to enter the IP address of the leader node. Specify the IP address for the leader instance that you deployed in the procedure above. If you deployed multiple leader nodes, specify each leader's IP address in a comma-separated list.
  12. Specify the maximum amount of memory (in MB) that this worker instance can use. The install wizard lists the total RAM available and chooses half of the total memory as the default value. Adjust the value as needed or accept the default value and then press Enter.

    The wizard proceeds to install Anzo DU according to the values that you specified.

  13. When the installation is complete, run the following command to start the worker instance:
    install_path/AnzoDU_root_dir/worker start

    For example:

    /opt/AnzoDU/worker start
  14. Repeat the steps above for each worker instance in the cluster.
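Step 1 of this procedure asks you to confirm that each worker host can reach the Anzo shared file system. One way to check this before running the installer is sketched below. It is an illustration only: check_shared_store is a hypothetical helper, the mount path is supplied by you, and the sketch assumes that the service user needs both read and write access to the shared directory.

```shell
#!/bin/sh
# Hypothetical pre-installation check: run it on each worker host as the
# Anzo service user, passing the shared file store mount point.
check_shared_store() {
    dir=$1

    if [ ! -d "$dir" ]; then
        echo "Shared file store not found: $dir" >&2
        return 1
    fi

    # Assumption: the service user needs read and write access.
    if [ ! -r "$dir" ] || [ ! -w "$dir" ]; then
        echo "$(id -un) cannot read and write $dir" >&2
        return 1
    fi

    echo "ok: $(id -un) can read and write $dir"
}
```

Call the function with the path where the shared file store is mounted on the worker. A nonzero exit status indicates that the mount or its permissions should be corrected before you continue with the installation.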

Anzo Unstructured Cluster Configuration

You do not need to perform additional configuration after the initial deployment of an Anzo Unstructured cluster. To review the configuration that Anzo creates based on the values specified during the installation, view the Distributed Pipeline options in Server Settings in the Administration menu. For more information, see Changing the Distributed Pipeline Configuration.

Important Any time the AU leader instance is restarted, the following two services must be restarted in Anzo:
  • Anzo Server Akka Cluster Integration
  • Anzo Unstructured Distributed

To restart a service:

  1. Log in to the Anzo console, expand the Administration menu, and click Advanced Configuration.
  2. On the Advanced Configuration screen, click the I understand and accept the risk link to view the Anzo bundles.
  3. In the Search field at the top of the screen, start typing the name of the service that you want to restart. When the service appears in the list onscreen, click the service name to view the details.
  4. At the top of the screen, click Stop Bundle. Then click Start Bundle when the start option becomes available.