Installing Anzo

This topic provides instructions for installing Anzo. For information about server requirements, see Anzo Requirements.

  1. Complete the Pre-Installation Configuration
  2. Install and Configure Anzo
  3. Complete the Post-Installation Configuration

Complete the Pre-Installation Configuration

Make Sure the Anzo Service User Account is Created

It is important to work with your IT organization to ensure that an Anzo service user account is created at the enterprise level. The user account needs to be associated with a central directory server (LDAP) so that it is available for installing and running Anzo components across environments. For more information, see Anzo Service Account Requirements.

If necessary, you can create a temporary user account on the Anzo host server. Note that creating the account locally can cause issues when migrating Anzo or integrating with a central LDAP server. The service account should meet the following requirements:

  • The service account should not have root-user privileges.
  • The account must have read and write permissions for the Anzo installation directory. The default installation directory is /opt/Anzo.
  • The account must have read and write access to the shared file store, such as the NFS mount location, where Anzo will read and write files during the data onboarding processes.

    If your organization will use Anzo Unstructured with Elasticsearch to onboard unstructured data, it is especially important to install and run Anzo as a non-root user. Elasticsearch cannot be run by a root user, but it must have access to the data that Anzo writes on the shared file store. When Anzo is run as root, the data that it generates is owned by root and Elasticsearch cannot access it.
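
The requirements above can be checked from a shell before running the installer. The following is a minimal sketch, not part of Anzo: the function names are hypothetical, and the directories passed to check_service_account are whatever installation directory and shared file store your environment uses.

```shell
# Hypothetical helpers; the names are illustrative and not part of Anzo.

# Succeeds when the given numeric user ID belongs to root.
is_root_uid() {
  [ "$1" -eq 0 ]
}

# Succeeds when the current user can read and write the given directory.
has_read_write() {
  [ -d "$1" ] && [ -r "$1" ] && [ -w "$1" ]
}

# Prints one line for the root check and one line per directory argument.
check_service_account() {
  if is_root_uid "$(id -u)"; then
    echo "FAIL: running as root"
  else
    echo "OK: not running as root"
  fi
  for dir in "$@"; do
    if has_read_write "$dir"; then
      echo "OK: read/write access to $dir"
    else
      echo "FAIL: no read/write access to $dir"
    fi
  done
}
```

When run as the service user, a call such as check_service_account /opt/Anzo followed by the path to the shared file store reports each requirement on its own line.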

Configure User Resource Limits

Cambridge Semantics recommends that you tune the user resource limits (ulimits) for your Linux distribution as follows:

  • Increase the limit for the following resources to at least 65535:
    • open files (nofile)
    • max user processes (nproc)
  • Increase the limit for the following resources to infinity:
    • address space (as)
    • CPU time (cpu)
    • file locks (locks)
    • file size (fsize)
    • max locked memory (memlock)

To view the current ulimits, run ulimit -a. To permanently change ulimits, modify the /etc/security/limits.conf file. For information, see How to set ulimit values in the RHEL support documentation.
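
As a rough sketch of the recommendations above, the following helper prints entries in limits.conf format for a service user, and a second helper compares a value reported by ulimit against a minimum. The function names are hypothetical and the user name anzo is only an example. Review the output before adding it to /etc/security/limits.conf.

```shell
# Hypothetical helper: prints limits.conf entries for the user given as $1.
# The "-" type sets both the soft and the hard limit.
anzo_limits_conf() {
  for item in nofile nproc; do
    printf '%s - %s %s\n' "$1" "$item" 65535
  done
  for item in as cpu locks fsize memlock; do
    printf '%s - %s %s\n' "$1" "$item" unlimited
  done
}

# Succeeds when a value reported by ulimit ($1) meets the minimum ($2).
limit_at_least() {
  [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

# Print the suggested entries and warn if the current open files limit is low.
anzo_limits_conf anzo
limit_at_least "$(ulimit -n)" 65535 || echo "open files limit is below 65535"
```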

Typically, as part of post-installation configuration, a systemd service is set up to start and stop the Anzo process. When systemd starts a process, however, it uses the limits that are defined in the systemd service rather than the limits in /etc/security/limits.conf. In addition to changing the ulimits in limits.conf, it is important to set the limits in the Anzo service. The service file contents shown in Configure and Start the Anzo Service include the recommended ulimit settings.
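
One way to confirm which limits a running process actually received is to read /proc/<pid>/limits for that process. The sketch below assumes a Linux host. The helper name soft_limit_from is hypothetical.

```shell
# Hypothetical helper: prints the soft limit for the resource named in $2
# from a file ($1) in /proc/<pid>/limits format. Columns in that file are
# separated by two or more spaces, so that is used as the field separator.
soft_limit_from() {
  awk -F'  +' -v name="$2" '$1 == name { print $2 }' "$1"
}
```

For example, soft_limit_from /proc/<pid>/limits "Max open files", where <pid> is the process ID of the Anzo server, prints the open files limit that is in effect for that process.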

Install and Configure Anzo

Follow the instructions below to install Anzo. These instructions assume that you have copied the Anzo installation script to the server.

Complete the steps below as the Anzo service user.
  1. If necessary, run the following command to become the Anzo service user:
    # su name

    Where name is the name of the service user. For example:

    # su anzo
  2. If necessary, run the following command to make the Anzo installation script executable:
    chmod +x script_name
  3. Run the following command to start the installation wizard:
    ./script_name

    The script unpacks the JRE and then waits for input before starting the installation.

  4. Press Enter to start the installation.
  5. Review the software license agreement. Press Enter to scroll through the terms. At the end of the agreement, type 1 to accept the terms or type 2 to disagree and stop the installation.
  6. Specify the components to install. Item 1 is the Anzo server; item 2 is the Anzo command line client. To install both components, accept the default value by pressing Enter. Or type 1 to install only the server or 2 to install only the command line client, then press Enter.
  7. Specify the path and directory for the Anzo installation. Press Enter to accept the default installation path or type an alternate path and then press Enter.
  8. Indicate whether you want the installer to create symlinks. Press Enter for yes or type n and press Enter for no.
  9. If you chose to let the installer create symlinks, specify the directory to create the symlinks in. Press Enter to accept the default path or type an alternate path and then press Enter.
  10. Specify the maximum amount of memory (in MB) that the server can use and then press Enter. The installation wizard lists the total RAM available. To meet the minimum memory requirement, the wizard chooses 1/4 of the total memory as the default value. Cambridge Semantics recommends that you allocate at least 1/2 of the total memory to Anzo.

    The wizard installs the components that you selected and then asks if you want to start the Anzo services.

  11. Press Enter to start the Anzo services. When prompted, open a browser and go to the following URL to open the license administration wizard.
    http://<hostname>:8945/

    Where <hostname> is the Anzo server DNS name or IP address. The License Key Information screen appears.

  12. Paste your license key into the box provided and then click Next. If necessary, you can obtain the license key by clicking Retrieve your license key and logging in to your Cambridge Semantics account.
  13. The wizard displays your license details. Review the details and then click Next. The wizard displays the System Configuration screen.

  14. On the left side of the screen, specify the password to use for the system administrator, sysadmin, in the System Password and Verify System Password fields.

    Do not change the system administration user ID. It must be sysadmin. The sysadmin user account has permission to access all Anzo features in the main Anzo application as well as administrative features in the Administration application. In addition, the sysadmin user has read and write access to all of the artifacts (such as data sources, models, and pipelines) that are created by all Anzo users. For more information about the account, see System Administrator.

  15. On the right side of the screen under Advanced Configuration, the Storage Directory setting is displayed. This setting configures the binary store location. By default Anzo stores binary data in <install_path>/Server/data. You can change the location by typing a new path and directory.
  16. Click Finish. The wizard configures and restarts the server. The process may take several minutes. Once the server is running, the browser displays the Anzo login screen.

    Before logging in, there is one more configuration step to complete. Some of the Anzo services will not have started properly because they could not bind to the default HTTP/S ports. The default Anzo HTTP port is 80 and the HTTPS port is 443. Because non-root users cannot bind to ports below 1024, Anzo services cannot use the default ports when Anzo is run by the new service user. Change the Anzo port settings to the non-root ports 8080 and 8443:
    1. On the Anzo server, run the following command to make an SSH connection to the Anzo Command Console as the sysadmin user:
      ssh sysadmin@localhost -p 8022
    2. When prompted, specify the password for the sysadmin user and log in to the Anzo OSGI Command Console.
    3. At the OSGI prompt, run the commands below, followed by exit to exit the console:
      osgi> httpPort 8080
      osgi> httpsPort 8443
      osgi> exit
  17. Run the following command to restart Anzo and complete the port configuration:
    <install_path>/Server/AnzoServer restart
  18. When Anzo starts, open the Anzo user interface by going to the following URL in your browser:
    https://<hostname>

    Where <hostname> is the Anzo server DNS name or IP address.
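
For step 10, the recommended allocation of 1/2 of the total memory can be worked out ahead of time. The sketch below assumes a Linux host where /proc/meminfo reports MemTotal in kB. The function names are hypothetical, and the total that the installation wizard displays remains the authoritative figure.

```shell
# Hypothetical helpers for choosing a value for the memory prompt.

# Prints the total memory in MB from a file ($1) in /proc/meminfo format,
# where the MemTotal value is given in kB.
total_memory_mb() {
  awk '$1 == "MemTotal:" { print int($2 / 1024) }' "$1"
}

# Prints half of the given number of MB, rounded down.
half_of_mb() {
  echo $(( $1 / 2 ))
}
```

Running half_of_mb "$(total_memory_mb /proc/meminfo)" prints a value in MB that can be typed at the memory prompt.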

Complete the Post-Installation Configuration

This section provides instructions for completing post-installation tasks.

Route Anzo HTTP/S Ports to Non-Root Ports for User Access

This section provides instructions for configuring the firewall to forward HTTP requests to port 8080 and HTTPS requests to port 8443 so that users can access Anzo without having to specify the new HTTP/S ports.

Root user privileges are required to complete this task.

To re-route Anzo ports using the iptables interface

Run the following commands to route the Anzo ports via the iptables interface:

# iptables -A PREROUTING -t nat -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 8080
# iptables -A PREROUTING -t nat -i eth0 -p tcp --dport 443 -j REDIRECT --to-port 8443
# iptables-save > /etc/sysconfig/iptables
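
The commands above assume that the network interface is named eth0. On hosts where the interface has a different name, the rules need to reference that name instead. As a sketch, the hypothetical helper below prints the two redirect rules for a given interface so that they can be reviewed before being run as root.

```shell
# Hypothetical helper: prints the redirect rules for the interface given as $1.
# It only prints the commands and does not apply them.
anzo_redirect_rules() {
  printf 'iptables -A PREROUTING -t nat -i %s -p tcp --dport 80 -j REDIRECT --to-port 8080\n' "$1"
  printf 'iptables -A PREROUTING -t nat -i %s -p tcp --dport 443 -j REDIRECT --to-port 8443\n' "$1"
}

anzo_redirect_rules eth0
```

The interface names that are available on a host can be listed with ip -o link show.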

To re-route Anzo ports using the firewalld interface

Run the following commands to route the Anzo ports via the firewalld interface:

# firewall-cmd --permanent --add-forward-port=port=443:proto=tcp:toport=8443
# firewall-cmd --permanent --add-forward-port=port=80:proto=tcp:toport=8080
# firewall-cmd --reload
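
To confirm that the forwarding rules are in place, the output of firewall-cmd --list-forward-ports can be checked for both entries. The sketch below uses a hypothetical helper and assumes that each entry is printed in the port=...:proto=...:toport=... form used in the commands above.

```shell
# Hypothetical helper: reads forward-port entries on standard input and
# succeeds only when both Anzo forwards (80 to 8080 and 443 to 8443) appear.
anzo_forwards_present() {
  entries=$(cat)
  printf '%s\n' "$entries" |
    grep -Eq '(^|[[:space:]])port=80:proto=tcp:toport=8080(:|$)' &&
  printf '%s\n' "$entries" |
    grep -Eq '(^|[[:space:]])port=443:proto=tcp:toport=8443(:|$)'
}
```

For example, as root: firewall-cmd --list-forward-ports | anzo_forwards_present && echo "forwarding is active".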

Change the Local Spark Engine Callback URL to the Non-Root Port

If you plan to use the pre-configured local Anzo Spark ETL engine to run pipelines, the callback URL for the engine must be configured to bind to the new Anzo HTTP port. Follow the instructions below to change the callback URL.

  1. In the Administration application, expand the Connections menu and click ETL Engine Config.
  2. On the ETL Engine Config screen, click the Local Spark Engine to view the configuration details for the engine.
  3. Click the Run tab. Anzo displays the Run screen.
  4. At the bottom of the screen, click the edit icon next to the Callback URL field (hover your pointer over the field to display the edit icon). Then edit the callback URL value to specify the new HTTP port, 8080, at the end of the IP address.
  5. Click the check mark icon to save the change.

Configure and Start the Anzo Service

Cambridge Semantics recommends that you configure an Anzo service for starting Anzo automatically as the service user. Follow the instructions below to implement and start the service.

Root user privileges are required to complete this task.
  1. Create a file called anzo-server.service in the /usr/lib/systemd/system directory. For example:
    # vi /usr/lib/systemd/system/anzo-server.service
  2. Add the following contents to anzo-server.service:
    [Unit]
    Description=Service for Anzo server.
    After=syslog.target network.target local-fs.target remote-fs.target nss-lookup.target
    
    [Service]
    Type=simple
    RemainAfterExit=yes
    LimitCPU=infinity
    LimitNOFILE=65536
    LimitAS=infinity
    LimitNPROC=65536
    LimitMEMLOCK=infinity
    LimitLOCKS=infinity
    LimitFSIZE=infinity
    ExecStart=/install_path/Server/AnzoServer start
    ExecStop=/install_path/Server/AnzoServer stop
    User=service_user_name
    Group=service_user_name
    
    [Install]
    WantedBy=default.target

    Where install_path is the Anzo installation path and directory and service_user_name is the name of the Anzo service user. For example:

    [Unit]
    Description=Service for Anzo server.
    After=syslog.target network.target local-fs.target remote-fs.target nss-lookup.target
    
    [Service]
    Type=simple
    RemainAfterExit=yes
    LimitCPU=infinity
    LimitNOFILE=65536
    LimitAS=infinity
    LimitNPROC=65536
    LimitMEMLOCK=infinity
    LimitLOCKS=infinity
    LimitFSIZE=infinity
    ExecStart=/opt/Anzo/Server/AnzoServer start
    ExecStop=/opt/Anzo/Server/AnzoServer stop
    User=anzo
    Group=anzo
    
    [Install]
    WantedBy=default.target
  3. Save and close the file, and then run the following commands to start and enable the new service:
    # systemctl start anzo-server.service
    # systemctl enable anzo-server.service

    The client displays a message such as the following:

    Created symlink from /etc/systemd/system/default.target.wants/anzo-server.service to
    /usr/lib/systemd/system/anzo-server.service.

Once the service is started and enabled, Anzo should be running. From this point on, stop and start Anzo with the following systemctl commands: sudo systemctl stop anzo-server and sudo systemctl start anzo-server.
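
The placeholder substitution described in step 2 can also be scripted. The following sketch uses a hypothetical helper, render_anzo_unit, that prints the service file contents shown above for a given installation path and service user. It does not write any files.

```shell
# Hypothetical helper: prints the service file for the installation path ($1)
# and the service user ($2). The settings mirror the contents shown above.
render_anzo_unit() {
  cat <<EOF
[Unit]
Description=Service for Anzo server.
After=syslog.target network.target local-fs.target remote-fs.target nss-lookup.target

[Service]
Type=simple
RemainAfterExit=yes
LimitCPU=infinity
LimitNOFILE=65536
LimitAS=infinity
LimitNPROC=65536
LimitMEMLOCK=infinity
LimitLOCKS=infinity
LimitFSIZE=infinity
ExecStart=$1/Server/AnzoServer start
ExecStop=$1/Server/AnzoServer stop
User=$2
Group=$2

[Install]
WantedBy=default.target
EOF
}

# Print the service file for the default installation path and an example user.
render_anzo_unit /opt/Anzo anzo
```

The output can be redirected to /usr/lib/systemd/system/anzo-server.service by a user with root privileges. If the file is changed after systemd has already loaded it, running systemctl daemon-reload makes systemd pick up the new contents.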

For an introduction to Anzo concepts, an overview of the user interface, basic setup steps, and instructions for building a sample solution from scratch, see the Getting Started Guide.
