Creating the Required Node Pools

This topic provides instructions for creating the three types of required node pools:

  • The Operator node pool for running the AnzoGraph, Anzo Agent with Anzo Unstructured (AU), Elasticsearch, and Spark operator pods.
  • The AnzoGraph node pool for running AnzoGraph application pods.
  • The Dynamic node pool for running Anzo Agent with AU, Elasticsearch, and Spark application pods.

For more information about the node pools, see Node Pool Requirements.

Define the Node Pool Requirements

Before creating the node pools, configure the infrastructure requirements for each type of pool. The nodepool_*.conf files in the gcloud/conf.d directory are sample configuration files that you can use as templates, or you can edit the files directly:

  • nodepool_operator.conf defines the requirements for the Operator node pool.
  • nodepool_anzograph.conf defines the requirements for the AnzoGraph node pool.
  • nodepool_dynamic.conf defines the requirements for the Dynamic node pool.

Two additional configuration files, nodepool_anzograph_tuner.yaml and nodepool_dynamic_tuner.yaml, configure the kernel-level tuning and security policies that are applied to the AnzoGraph and Dynamic runtime environments. Do not change these files. During node pool creation, the script prompts "Do you want to tune the nodepools?" Answer y (yes) so that the kernel tuning and security policies are applied.

Each type of node pool configuration file contains the following parameters. Descriptions of the parameters and guidance on specifying the appropriate values for each type of node pool are provided below.

DOMAIN="<domain>"
KIND="<kind>"
GCLOUD_CLUSTER_REGION=${GCLOUD_CLUSTER_REGION:-"<region>"}
GCLOUD_NODE_TAINTS="<node-taints>"
GCLOUD_PROJECT_ID=${GCLOUD_PROJECT_ID:-"<project>"}
GKE_IMAGE_TYPE="<image-type>"
GKE_NODE_VERSION="<version>"
K8S_CLUSTER_NAME=${K8S_CLUSTER_NAME:-"<cluster>"}
NODE_LABELS="<node-labels>"
MACHINE_TYPES="<machine-type>"
TAGS="<tags>"
METADATA="<metadata>"
MAX_PODS_PER_NODE=<max-pods-per-node>
MAX_NODES=<max-nodes>
MIN_NODES=<min-nodes>
NUM_NODES=<num-nodes>
DISK_SIZE="<disk-size>"
DISK_TYPE="<disk-type>"
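
Several parameters use the shell form ${VAR:-"default"}, which suggests that the files are read as shell variable assignments. That is an assumption, since the internals of the script are not shown in this topic. If it holds, a value already set in the environment takes precedence over the default written in the file. The following sketch illustrates the behavior with a temporary file standing in for a real nodepool_*.conf file; the us-east1 override is only an example value.

```shell
#!/bin/sh
# Sketch only: assumes the .conf files are sourced as shell assignments.
# The temporary file is a hypothetical stand-in for a nodepool_*.conf file.
conf=$(mktemp)
cat > "$conf" <<'EOF'
GCLOUD_CLUSTER_REGION=${GCLOUD_CLUSTER_REGION:-"us-central1"}
K8S_CLUSTER_NAME=${K8S_CLUSTER_NAME:-"csi-k8s-cluster"}
EOF

# Case 1: nothing is set in the environment, so the file's defaults apply.
from_defaults=$(unset GCLOUD_CLUSTER_REGION K8S_CLUSTER_NAME
                . "$conf"
                echo "$GCLOUD_CLUSTER_REGION $K8S_CLUSTER_NAME")

# Case 2: the region is preset, so the file's default region is ignored.
from_env=$(GCLOUD_CLUSTER_REGION="us-east1"
           unset K8S_CLUSTER_NAME
           . "$conf"
           echo "$GCLOUD_CLUSTER_REGION $K8S_CLUSTER_NAME")

rm -f "$conf"
echo "defaults: $from_defaults"
echo "override: $from_env"
```

Under the same assumption, exporting GCLOUD_PROJECT_ID, GCLOUD_CLUSTER_REGION, or K8S_CLUSTER_NAME once in the shell would apply the same value to all three node pool configuration files.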

DOMAIN

The name of the domain that hosts the node pool. This is typically prefaced with the name of the organization.

Node Pool Type    Sample DOMAIN Value
Operator          csi-operator
AnzoGraph         csi-anzograph
Dynamic           csi-dynamic

KIND

This parameter classifies the node pool in terms of kernel tuning and the type of pods that the node pool will host.

Node Pool Type    Required KIND Value
Operator          operator
AnzoGraph         anzograph
Dynamic           dynamic

GCLOUD_CLUSTER_REGION

The compute region for the GKE cluster. This parameter maps to the gcloud container cluster --region option. For example, us-central1.

GCLOUD_NODE_TAINTS

This parameter defines the type of pods that are allowed to be placed in this node pool. When a pod is scheduled for deployment, the scheduler relies on this value to determine whether the pod belongs in this pool. If a pod has a toleration that is not compatible with this taint, the pod is rejected from the pool. The recommended values below specify that operator pods are allowed in the Operator node pool, AnzoGraph pods are allowed in the AnzoGraph node pool, and dynamic pods are allowed in the Dynamic node pool. The NoSchedule value means a toleration is required and pods without the appropriate toleration will not be allowed in the pool. In addition, the values specify that pods should not be placed on preemptible nodes.

Node Pool Type    Recommended GCLOUD_NODE_TAINTS Value
Operator          cambridgesemantics.com/dedicated=operator:NoSchedule,cloud.google.com/gke-preemptible="false":NoSchedule
AnzoGraph         cambridgesemantics.com/dedicated=anzograph:NoSchedule,cloud.google.com/gke-preemptible="false":PreferNoSchedule
Dynamic           cambridgesemantics.com/dedicated=dynamic:NoSchedule,cloud.google.com/gke-preemptible="false":NoSchedule
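
Each taint follows the key=value:effect pattern, and the sample configuration files nest double quotes around false inside an already quoted string. If the file is read by a shell (an assumption), the inner quotes simply close and reopen the string, so the stored value contains false with no quotes. The sketch below demonstrates this and splits the value into its parts. The split_taints helper is hypothetical and is not part of the Anzo scripts.

```shell
#!/bin/sh
# Same quoting as the sample Operator configuration file.
GCLOUD_NODE_TAINTS="cambridgesemantics.com/dedicated=operator:NoSchedule,cloud.google.com/gke-preemptible="false":NoSchedule"

# split_taints (hypothetical helper): print one "key value effect" line
# for each taint in a comma-separated key=value:effect list.
split_taints() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r taint; do
    key=${taint%%=*}      # text before the first "="
    rest=${taint#*=}      # text after the first "="
    value=${rest%%:*}     # text before the first ":"
    effect=${rest##*:}    # text after the last ":"
    printf '%s %s %s\n' "$key" "$value" "$effect"
  done
}

split_taints "$GCLOUD_NODE_TAINTS"
```

The second line of output shows the preemptible taint with the value false, without the quotation marks that appear in the configuration file.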

GCLOUD_PROJECT_ID

The Project ID for the node pool. This parameter maps to the gcloud-wide --project option. The value should match the Project ID for the GKE cluster. For example, cloud-project-1592.

GKE_IMAGE_TYPE

The base operating system that the nodes in the node pool will run on. This parameter maps to the gcloud container cluster --image-type option. This value must be cos_containerd.

GKE_NODE_VERSION

The Kubernetes version to use for nodes in the node pool. This parameter maps to the gcloud container cluster --node-version option.

Cambridge Semantics recommends that you specify the same version as the GKE_MASTER_VERSION.

K8S_CLUSTER_NAME

The name of the GKE cluster to add the node pool to. For example, csi-k8s-cluster.

NODE_LABELS

A comma-separated list of key/value pairs that define the type of pods that can be placed on the nodes in this node pool. Labels are used to attract pods to nodes, while taints (GCLOUD_NODE_TAINTS) are used to repel other types of pods from being placed in this node pool.

For example, the following labels specify that the purpose of the nodes in each pool is to host operator, anzograph, or dynamic pods.

Node Pool Type    Recommended NODE_LABELS Value
Operator          cambridgesemantics.com/node-purpose=operator
AnzoGraph         cambridgesemantics.com/node-purpose=anzograph
Dynamic           cambridgesemantics.com/node-purpose=dynamic
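
In the recommended values, the node-purpose label matches the KIND value for each pool. A quick consistency check before creating a pool can catch a label copied from another file. The following is a minimal sketch; the label_purpose helper is hypothetical, and treating a mismatch as an error is an assumption based on the recommended values above rather than a documented rule.

```shell
#!/bin/sh
# label_purpose (hypothetical helper): print the value of the
# cambridgesemantics.com/node-purpose label from a NODE_LABELS string.
label_purpose() {
  printf '%s\n' "$1" | tr ',' '\n' |
    sed -n 's|^cambridgesemantics\.com/node-purpose=||p'
}

KIND="dynamic"
NODE_LABELS="cambridgesemantics.com/node-purpose=dynamic,cambridgesemantics.com/description=k8snode"

if [ "$(label_purpose "$NODE_LABELS")" = "$KIND" ]; then
  echo "node-purpose label matches KIND"
else
  echo "node-purpose label does not match KIND"
fi
```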

MACHINE_TYPES

A space-separated list of the machine types that can be used for the nodes in this node pool. This parameter maps to the gcloud container cluster --machine-type option. If you list multiple machine types, the node pool creation script prompts you to create multiple node pools of the same KIND, one pool for each machine type.

Node Pool Type    Sample MACHINE_TYPES Value
Operator          n1-standard-1
AnzoGraph         n1-standard-16 n1-standard-32 n1-standard-64
Dynamic           n1-standard-4
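
Because MACHINE_TYPES is space-separated, each entry can be visited with an ordinary shell loop. The sketch below shows one plausible way to derive a distinct pool name for each machine type. The naming scheme (DOMAIN followed by the machine type) is an assumption for illustration; the names that create_nodepools.sh actually assigns may differ.

```shell
#!/bin/sh
DOMAIN="csi-anzograph"
MACHINE_TYPES="n1-standard-16 n1-standard-32 n1-standard-64"

# pool_names: print one hypothetical node pool name per machine type.
pool_names() {
  # The unquoted expansion is intentional so the list splits on spaces.
  for machine_type in $MACHINE_TYPES; do
    printf '%s-%s\n' "$DOMAIN" "$machine_type"
  done
}

pool_names
```

With the sample AnzoGraph value, the loop produces three names, which corresponds to the three node pools of the same KIND that the script offers to create.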

For more guidance on determining the instance types to use for nodes in the required node pools, see Compute Resource Planning.

TAGS

A comma-separated list of strings to add to the instances in the node pool to classify the VMs. This parameter maps to the gcloud container cluster --tags option. For example, csi-anzo.

METADATA

The compute engine metadata (in the format key=val,key=val) to make available to the guest operating system running on nodes in the node pool. This parameter maps to the gcloud container cluster --metadata option.

Including disable-legacy-endpoints=true is required to ensure that legacy metadata APIs are disabled. For more information about the option, see Protecting Cluster Metadata in the GKE documentation.

MAX_PODS_PER_NODE

The maximum number of pods that can be hosted on a node in this node pool. This parameter maps to the gcloud container cluster --max-pods-per-node option. In addition to Anzo application pods, this limit also needs to account for K8s service pods and helper pods. Cambridge Semantics recommends that you set this value to at least 16 for all node pool types.

MAX_NODES

The maximum number of nodes in the node pool. This parameter maps to the gcloud container cluster --max-nodes option.

Node Pool Type    Sample MAX_NODES Value
Operator          8
AnzoGraph         64
Dynamic           64

MIN_NODES

The minimum number of nodes in the node pool. This parameter maps to the gcloud container cluster --min-nodes option. If you set the minimum nodes to 0 for each node pool type, nodes will not be provisioned unless the relevant type of pod is scheduled for deployment.

NUM_NODES

The number of nodes to deploy when the node pool is created. In most cases, this value must be at least 1 because at least one node must be deployed along with the pool. If the GKE cluster autoscaler addon is enabled, the autoscaler deprovisions this node when it is not in use.

Recent versions of gcloud support creating node pools without deploying any nodes. Depending on the version that you are using, you may be able to set NUM_NODES to 0.
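
The relationship between MIN_NODES, NUM_NODES, and MAX_NODES can be checked before the script is run. The following is a minimal sketch of such a pre-flight check. The check_node_counts function is hypothetical and is not part of the Anzo scripts, and it only warns when NUM_NODES is 0 because support for that value depends on the gcloud version.

```shell
#!/bin/sh
# check_node_counts MIN NUM MAX: hypothetical pre-flight check.
# Returns 0 when the counts are consistent, 1 otherwise.
check_node_counts() {
  min=$1; num=$2; max=$3
  if [ "$min" -gt "$max" ]; then
    echo "MIN_NODES ($min) exceeds MAX_NODES ($max)"
    return 1
  fi
  if [ "$num" -gt "$max" ]; then
    echo "NUM_NODES ($num) exceeds MAX_NODES ($max)"
    return 1
  fi
  if [ "$num" -lt 1 ]; then
    # Accepted only by gcloud versions that support empty node pools.
    echo "NUM_NODES is 0; this requires a gcloud version that allows it"
  fi
  return 0
}

# Sample Operator values: MIN_NODES=0, NUM_NODES=1, MAX_NODES=8.
check_node_counts 0 1 8 && echo "node counts look consistent"
```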

DISK_SIZE

The size of the boot disks on the nodes. This parameter maps to the gcloud container cluster --disk-size option.

Node Pool Type    Sample DISK_SIZE Value
Operator          50GB
AnzoGraph         200GB
Dynamic           100GB

DISK_TYPE

The type of boot disk to use. This parameter maps to the gcloud container cluster --disk-type option.

Node Pool Type    Sample DISK_TYPE Value
Operator          pd-standard
AnzoGraph         pd-ssd
Dynamic           pd-ssd

Example Configuration Files

Example completed configuration files for each type of node pool are shown below.

Operator Node Pool

The example below shows a configured nodepool_operator.conf file.

DOMAIN="csi-operator"
KIND="operator"
GCLOUD_NODE_TAINTS="cambridgesemantics.com/dedicated=operator:NoSchedule,cloud.google.com/gke-preemptible="false":NoSchedule"
GCLOUD_CLUSTER_REGION=${GCLOUD_CLUSTER_REGION:-"us-central1"}
GKE_IMAGE_TYPE="cos_containerd"
GKE_NODE_VERSION="1.19.9-gke.1900"
GCLOUD_PROJECT_ID=${GCLOUD_PROJECT_ID:-"cloud-project-1592"}
K8S_CLUSTER_NAME=${K8S_CLUSTER_NAME:-"csi-k8s-cluster"}
NODE_LABELS="cambridgesemantics.com/node-purpose=operator,cambridgesemantics.com/description=k8snode"
MACHINE_TYPES="n1-standard-1"
TAGS="csi-anzo"
METADATA="disable-legacy-endpoints=true"
MAX_PODS_PER_NODE=16
MAX_NODES=8
MIN_NODES=0
NUM_NODES=1
DISK_SIZE="50Gb"
DISK_TYPE="pd-standard"

AnzoGraph Node Pool

The example below shows a configured nodepool_anzograph.conf file.

DOMAIN="csi-anzograph"
KIND="anzograph"
GCLOUD_CLUSTER_REGION=${GCLOUD_CLUSTER_REGION:-"us-central1"}
GCLOUD_NODE_TAINTS="cambridgesemantics.com/dedicated=anzograph:NoSchedule,cloud.google.com/gke-preemptible="false":PreferNoSchedule"
GCLOUD_PROJECT_ID=${GCLOUD_PROJECT_ID:-"cloud-project-1592"}
GKE_IMAGE_TYPE="cos_containerd"
GKE_NODE_VERSION="1.19.9-gke.1900"
K8S_CLUSTER_NAME=${K8S_CLUSTER_NAME:-"csi-k8s-cluster"}
NODE_LABELS="cambridgesemantics.com/node-purpose=anzograph,cambridgesemantics.com/description=k8snode"
MACHINE_TYPES="n1-standard-16 n1-standard-32 n1-standard-64"
TAGS="csi-anzo"
METADATA="disable-legacy-endpoints=true"
MAX_PODS_PER_NODE=16
MAX_NODES=64
MIN_NODES=0
NUM_NODES=1
DISK_SIZE="200Gb"
DISK_TYPE="pd-ssd"

Dynamic Node Pool

The example below shows a configured nodepool_dynamic.conf file.

DOMAIN="csi-dynamic"
KIND="dynamic"
GCLOUD_CLUSTER_REGION=${GCLOUD_CLUSTER_REGION:-"us-central1"}
GCLOUD_NODE_TAINTS="cambridgesemantics.com/dedicated=dynamic:NoSchedule,cloud.google.com/gke-preemptible="false":NoSchedule"
GCLOUD_PROJECT_ID=${GCLOUD_PROJECT_ID:-"cloud-project-1592"}
GKE_IMAGE_TYPE="cos_containerd"
GKE_NODE_VERSION="1.19.9-gke.1900"
K8S_CLUSTER_NAME=${K8S_CLUSTER_NAME:-"csi-k8s-cluster"}
NODE_LABELS="cambridgesemantics.com/node-purpose=dynamic,cambridgesemantics.com/description=k8snode"
MACHINE_TYPES="n1-standard-4"
TAGS="csi-anzo"
METADATA="disable-legacy-endpoints=true"
MAX_PODS_PER_NODE=16
MAX_NODES=64
MIN_NODES=0
NUM_NODES=1
DISK_SIZE="100Gb"
DISK_TYPE="pd-ssd"
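
To make the mapping between the configuration parameters and the gcloud options concrete, the sketch below prints a gcloud container node-pools create command built from some of the Dynamic example values. It is an illustration only and is not the create_nodepools.sh implementation: using DOMAIN as the pool name and passing --enable-autoscaling are assumptions, and the command is echoed rather than executed.

```shell
#!/bin/sh
# Values taken from the Dynamic node pool example.
DOMAIN="csi-dynamic"
GCLOUD_CLUSTER_REGION="us-central1"
GCLOUD_PROJECT_ID="cloud-project-1592"
K8S_CLUSTER_NAME="csi-k8s-cluster"
GKE_IMAGE_TYPE="cos_containerd"
MACHINE_TYPES="n1-standard-4"
MAX_PODS_PER_NODE=16
MIN_NODES=0
MAX_NODES=64
NUM_NODES=1
DISK_SIZE="100Gb"
DISK_TYPE="pd-ssd"

# print_create_command: echo (do not run) a roughly equivalent command.
# Using DOMAIN as the pool name is an assumption made for this sketch.
print_create_command() {
  echo gcloud container node-pools create "$DOMAIN" \
    --cluster="$K8S_CLUSTER_NAME" \
    --region="$GCLOUD_CLUSTER_REGION" \
    --project="$GCLOUD_PROJECT_ID" \
    --image-type="$GKE_IMAGE_TYPE" \
    --machine-type="$MACHINE_TYPES" \
    --max-pods-per-node="$MAX_PODS_PER_NODE" \
    --enable-autoscaling \
    --min-nodes="$MIN_NODES" --max-nodes="$MAX_NODES" \
    --num-nodes="$NUM_NODES" \
    --disk-size="$DISK_SIZE" --disk-type="$DISK_TYPE"
}

print_create_command
```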

Create the Node Pools

After defining the requirements for the node pools, run the create_nodepools.sh script in the gcloud directory once for each type of node pool. The command syntax is shown here, and the arguments are described below.

./create_nodepools.sh -c <config_file_name> [ -d <config_file_directory> ] [ -f | --force ] [ -h | --help ]

-c <config_file_name>

This is a required argument that specifies the name of the configuration file (i.e., nodepool_operator.conf, nodepool_anzograph.conf, or nodepool_dynamic.conf) that supplies the node pool requirements. For example, -c nodepool_dynamic.conf.

-d <config_file_directory>

This is an optional argument that specifies the path and directory name for the configuration file specified for the -c argument. If you are using the original gcloud directory file structure and the configuration file is in the conf.d directory, you do not need to specify the -d argument. If you created a separate directory structure for different Anzo environments, include the -d option. For example, -d /gcloud/env1/conf.

-f | --force

This is an optional argument that controls whether the script prompts for confirmation before proceeding with each stage involved in creating the node pool. If -f (--force) is specified, the script assumes the answer is "yes" to all prompts and does not display them.

-h | --help

This is an optional argument that displays the help for the create_nodepools.sh script.

For example, the following command runs the create_nodepools.sh script with nodepool_operator.conf as input. Since nodepool_operator.conf is in the conf.d directory, the -d argument is omitted:

./create_nodepools.sh -c nodepool_operator.conf
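
Since the script is run once per pool type, the three invocations can be generated with a loop. The sketch below only prints the commands, and it assumes that all three configuration files are in the default conf.d directory so that -d is not needed. The print_commands function name is hypothetical.

```shell
#!/bin/sh
# print_commands: print one create_nodepools.sh invocation per required
# node pool type. "echo" keeps this a dry run; remove it to run the script.
print_commands() {
  for kind in operator anzograph dynamic; do
    echo ./create_nodepools.sh -c "nodepool_${kind}.conf"
  done
}

print_commands
```

Running the commands interactively, without -f, keeps the confirmation prompts visible, including the tuning prompt for the AnzoGraph and Dynamic pools.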

The script validates that the required software packages are installed and that the versions are compatible with the deployment. It also displays an overview of the deployment details based on the values in the specified configuration file. For example:

Operating System   : CentOS Linux
- Google Cloud SDK: 322.0.0
  alpha: 2021.01.05
  beta: 2021.01.05
  bq: 2.0.64
  core: 2021.01.05
  gsutil: 4.57
  kubectl cli version: Client Version: v1.19.12
  valid

Deployment details:
    Project            : cloud-project-1592
    Region             : us-central1
    GKE Cluster        : csi-k8s-cluster
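
The overview above lists the Google Cloud SDK and kubectl, so it can be useful to confirm that both tools are on the PATH before running the script. The following is a minimal sketch; the check_tools function is hypothetical, and it checks only for presence, not for the version compatibility that the script itself validates.

```shell
#!/bin/sh
# check_tools TOOL...: report any command that is not found on the PATH.
# Returns 0 when every tool is found, 1 otherwise.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

if check_tools gcloud kubectl; then
  echo "gcloud and kubectl found"
fi
```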

The script then prompts you to proceed with deploying each component of the node pool. Type y and press Enter to proceed with the configuration.

When creating the AnzoGraph and Dynamic node pools, the script prompts "Do you want to tune the nodepools?" Answer y (yes) so that the kernel tuning and security policies from the related nodepool_*_tuner.yaml file are applied to the node pool configuration.

Once the Operator, AnzoGraph, and Dynamic node pools are created, the next step is to create a Cloud Location in Anzo so that Anzo can connect to the GKE cluster and deploy applications. See Connecting to a Cloud Location.
