Creating the Required Node Pools
This topic provides instructions for creating the three types of required node pools:
- The Operator node pool for running the AnzoGraph, Anzo Agent with Anzo Unstructured (AU), Elasticsearch, and Spark operator pods.
- The AnzoGraph node pool for running AnzoGraph application pods.
- The Dynamic node pool for running Anzo Agent with AU, Elasticsearch, and Spark application pods.
Define the Node Pool Requirements
Before creating the node pools, configure the infrastructure requirements for each type of pool. The nodepool_*.conf files in the gcloud/conf.d
directory are sample configuration files that you can use as templates, or you can edit the files directly:
- nodepool_operator.conf defines the requirements for the Operator node pool.
- nodepool_anzograph.conf defines the requirements for the AnzoGraph node pool.
- nodepool_dynamic.conf defines the requirements for the Dynamic node pool.
The additional AnzoGraph and Dynamic node pool configuration files, nodepool_anzograph_tuner.yaml and nodepool_dynamic_tuner.yaml, configure the kernel-level tuning and security policies to apply to the AnzoGraph and Dynamic runtime environments. Do not make changes to these files. During node pool creation, the script prompts, "Do you want to tune the nodepools?" It is important to answer y (yes) so that the kernel tuning and security policies are applied.
Each type of node pool configuration file contains the following parameters. Descriptions of the parameters and guidance on specifying the appropriate values for each type of node pool are provided below.
DOMAIN="<domain>"
KIND="<kind>"
GCLOUD_CLUSTER_REGION=${GCLOUD_CLUSTER_REGION:-"<region>"}
GCLOUD_NODE_TAINTS="<node-taints>"
GCLOUD_PROJECT_ID=${GCLOUD_PROJECT_ID:-"<project>"}
GKE_IMAGE_TYPE="<image-type>"
GKE_NODE_VERSION="<version>"
K8S_CLUSTER_NAME=${K8S_CLUSTER_NAME:-"<cluster>"}
NODE_LABELS="<node-labels>"
MACHINE_TYPES="<machine-type>"
TAGS="<tags>"
METADATA="<metadata>"
MAX_PODS_PER_NODE=<max-pods-per-node>
MAX_NODES=<max-nodes>
MIN_NODES=<min-nodes>
NUM_NODES=<num-nodes>
DISK_SIZE="<disk-size>"
DISK_TYPE="<disk-type>"
DOMAIN
The name of the domain that hosts the node pool. This is typically prefaced with the name of the organization.
Node Pool Type | Sample DOMAIN Value |
---|---|
Operator | csi-operator |
AnzoGraph | csi-anzograph |
Dynamic | csi-dynamic |
KIND
This parameter classifies the node pool in terms of kernel tuning and the type of pods that the node pool will host.
Node Pool Type | Required KIND Value |
---|---|
Operator | operator |
AnzoGraph | anzograph |
Dynamic | dynamic |
GCLOUD_CLUSTER_REGION
The compute region for the GKE cluster. This parameter maps to the gcloud container cluster --region
option. For example, us-central1.
GCLOUD_NODE_TAINTS
This parameter defines the type of pods that are allowed to be placed in this node pool. When a pod is scheduled for deployment, the scheduler relies on this value to determine whether the pod belongs in this pool. If a pod has a toleration that is not compatible with this taint, the pod is rejected from the pool. The recommended values below specify that operator pods are allowed in the Operator node pool, AnzoGraph pods are allowed in the AnzoGraph node pool, and dynamic pods are allowed in the Dynamic node pool. The NoSchedule value means a toleration is required and pods without the appropriate toleration will not be allowed in the pool. In addition, the values specify that pods should not be placed on preemptible nodes.
Node Pool Type | Recommended GCLOUD_NODE_TAINTS Value |
---|---|
Operator | cambridgesemantics.com/dedicated=operator:NoSchedule, cloud.google.com/gke-preemptible="false":NoSchedule |
AnzoGraph | cambridgesemantics.com/dedicated=anzograph:NoSchedule, cloud.google.com/gke-preemptible="false":PreferNoSchedule |
Dynamic | cambridgesemantics.com/dedicated=dynamic:NoSchedule, cloud.google.com/gke-preemptible="false":NoSchedule |
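In standard Kubernetes terms, a pod is admitted to a tainted pool only if its spec carries a matching toleration. The fragment below is an illustrative pod-spec snippet for the AnzoGraph taint (this is a generic Kubernetes toleration shape, not taken from Anzo's deployment manifests):

```shell
# Sketch: write out the toleration a pod would need in order to match
# the AnzoGraph taint above (illustrative pod-spec fragment only).
cat > /tmp/anzograph_toleration.yaml <<'EOF'
tolerations:
- key: cambridgesemantics.com/dedicated
  operator: Equal
  value: anzograph
  effect: NoSchedule
EOF
cat /tmp/anzograph_toleration.yaml
```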
GCLOUD_PROJECT_ID
The Project ID for the node pool. This parameter maps to the gcloud-wide --project
option. The value should match the Project ID for the GKE cluster. For example, cloud-project-1592.
GKE_IMAGE_TYPE
The base operating system that the nodes in the node pool will run on. This parameter maps to the gcloud container cluster --image-type
option. This value must be cos_containerd.
GKE_NODE_VERSION
The Kubernetes version to use for nodes in the node pool. This parameter maps to the gcloud container cluster --node-version
option.
Cambridge Semantics recommends that you specify the same version as the GKE_MASTER_VERSION.
K8S_CLUSTER_NAME
The name of the GKE cluster to add the node pool to. For example, csi-k8s-cluster.
NODE_LABELS
A comma-separated list of key/value pairs that define the type of pods that can be placed on the nodes in this node pool. Labels are used to attract pods to nodes, while taints (GCLOUD_NODE_TAINTS) are used to repel other types of pods from being placed in this node pool.
For example, the following labels specify that the purpose of the nodes in each pool is to host operator, anzograph, or dynamic pods.
Node Pool Type | Recommended NODE_LABELS Value |
---|---|
Operator | cambridgesemantics.com/node-purpose=operator |
AnzoGraph | cambridgesemantics.com/node-purpose=anzograph |
Dynamic | cambridgesemantics.com/node-purpose=dynamic |
MACHINE_TYPES
A space-separated list of the machine types that can be used for the nodes in this node pool. This parameter maps to the gcloud container cluster --machine-type
option. If you list multiple machine types, the node pool creation script prompts you to create multiple node pools of the same KIND, one pool for each machine type.
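Because MACHINE_TYPES is space-separated, listing several types effectively expands into one pool per entry. The loop below sketches that expansion; the pool-naming scheme shown is assumed for illustration, not taken from the script:

```shell
# Sketch: a space-separated MACHINE_TYPES value expands into one
# node pool per machine type. The naming scheme is hypothetical.
MACHINE_TYPES="n1-standard-16 n1-standard-32 n1-standard-64"
DOMAIN="csi-anzograph"
for mtype in $MACHINE_TYPES; do
  echo "would create pool: ${DOMAIN}-${mtype}"
done
```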
Node Pool Type | Sample MACHINE_TYPES Value |
---|---|
Operator | n1-standard-1 |
AnzoGraph | n1-standard-16 n1-standard-32 n1-standard-64 |
Dynamic | n1-standard-4 |
For more guidance on determining the instance types to use for nodes in the required node pools, see Compute Resource Planning.
TAGS
A comma-separated list of strings to add to the instances in the node pool to classify the VMs. This parameter maps to the gcloud container cluster --tags
option. For example, csi-anzo.
METADATA
The compute engine metadata (in the format key=val,key=val) to make available to the guest operating system running on nodes in the node pool. This parameter maps to the gcloud container cluster --metadata
option.
Including disable-legacy-endpoints=true is required to ensure that legacy metadata APIs are disabled. For more information about the option, see Protecting Cluster Metadata in the GKE documentation.
MAX_PODS_PER_NODE
The maximum number of pods that can be hosted on a node in this node pool. This parameter maps to the gcloud container cluster --max-pods-per-node
option. In addition to Anzo application pods, this limit also needs to account for K8s service pods and helper pods. Cambridge Semantics recommends that you set this value to at least 16 for all node pool types.
MAX_NODES
The maximum number of nodes in the node pool. This parameter maps to the gcloud container cluster --max-nodes
option.
Node Pool Type | Sample MAX_NODES Value |
---|---|
Operator | 8 |
AnzoGraph | 64 |
Dynamic | 64 |
MIN_NODES
The minimum number of nodes in the node pool. This parameter maps to the gcloud container cluster --min-nodes
option. If you set the minimum nodes to 0 for each node pool type, nodes will not be provisioned unless the relevant type of pod is scheduled for deployment.
NUM_NODES
The number of nodes to deploy when the node pool is created. Typically this value must be set to at least 1: when you create the node pool, at least one node in the pool is deployed as well. However, if the GKE cluster autoscaler addon is enabled, the autoscaler deprovisions that node because it is not in use. Depending on the version of gcloud that you are using, you may be able to set NUM_NODES to 0, as recent versions of gcloud added support for creating node pools without deploying any nodes.
DISK_SIZE
The size of the boot disks on the nodes. This parameter maps to the gcloud container cluster --disk-size
option.
Node Pool Type | Sample DISK_SIZE Value |
---|---|
Operator | 50GB |
AnzoGraph | 200GB |
Dynamic | 100GB |
DISK_TYPE
The type of boot disk to use. This parameter maps to the gcloud container cluster --disk-type
option.
Node Pool Type | Sample DISK_TYPE Value |
---|---|
Operator | pd-standard |
AnzoGraph | pd-ssd |
Dynamic | pd-ssd |
Example Configuration Files
Example completed configuration files for each type of node pool are shown below.
Operator Node Pool
The example below shows a configured nodepool_operator.conf file.
DOMAIN="csi-operator"
KIND="operator"
GCLOUD_NODE_TAINTS="cambridgesemantics.com/dedicated=operator:NoSchedule,cloud.google.com/gke-preemptible="false":NoSchedule"
GCLOUD_CLUSTER_REGION=${GCLOUD_CLUSTER_REGION:-"us-central1"}
GKE_IMAGE_TYPE="cos_containerd"
GKE_NODE_VERSION="1.19.9-gke.1900"
GCLOUD_PROJECT_ID=${GCLOUD_PROJECT_ID:-"cloud-project-1592"}
K8S_CLUSTER_NAME=${K8S_CLUSTER_NAME:-"csi-k8s-cluster"}
NODE_LABELS="cambridgesemantics.com/node-purpose=operator,cambridgesemantics.com/description=k8snode"
MACHINE_TYPES="n1-standard-1"
TAGS="csi-anzo"
METADATA="disable-legacy-endpoints=true"
MAX_PODS_PER_NODE=16
MAX_NODES=8
MIN_NODES=0
NUM_NODES=1
DISK_SIZE="50GB"
DISK_TYPE="pd-standard"
AnzoGraph Node Pool
The example below shows a configured nodepool_anzograph.conf file.
DOMAIN="csi-anzograph"
KIND="anzograph"
GCLOUD_CLUSTER_REGION=${GCLOUD_CLUSTER_REGION:-"us-central1"}
GCLOUD_NODE_TAINTS="cambridgesemantics.com/dedicated=anzograph:NoSchedule,cloud.google.com/gke-preemptible="false":PreferNoSchedule"
GCLOUD_PROJECT_ID=${GCLOUD_PROJECT_ID:-"cloud-project-1592"}
GKE_IMAGE_TYPE="cos_containerd"
GKE_NODE_VERSION="1.19.9-gke.1900"
K8S_CLUSTER_NAME=${K8S_CLUSTER_NAME:-"csi-k8s-cluster"}
NODE_LABELS="cambridgesemantics.com/node-purpose=anzograph,cambridgesemantics.com/description=k8snode"
MACHINE_TYPES="n1-standard-16 n1-standard-32 n1-standard-64"
TAGS="csi-anzo"
METADATA="disable-legacy-endpoints=true"
MAX_PODS_PER_NODE=16
MAX_NODES=64
MIN_NODES=0
NUM_NODES=1
DISK_SIZE="200GB"
DISK_TYPE="pd-ssd"
Dynamic Node Pool
The example below shows a configured nodepool_dynamic.conf file.
DOMAIN="csi-dynamic"
KIND="dynamic"
GCLOUD_CLUSTER_REGION=${GCLOUD_CLUSTER_REGION:-"us-central1"}
GCLOUD_NODE_TAINTS="cambridgesemantics.com/dedicated=dynamic:NoSchedule,cloud.google.com/gke-preemptible="false":NoSchedule"
GCLOUD_PROJECT_ID=${GCLOUD_PROJECT_ID:-"cloud-project-1592"}
GKE_IMAGE_TYPE="cos_containerd"
GKE_NODE_VERSION="1.19.9-gke.1900"
K8S_CLUSTER_NAME=${K8S_CLUSTER_NAME:-"csi-k8s-cluster"}
NODE_LABELS="cambridgesemantics.com/node-purpose=dynamic,cambridgesemantics.com/description=k8snode"
MACHINE_TYPES="n1-standard-4"
TAGS="csi-anzo"
METADATA="disable-legacy-endpoints=true"
MAX_PODS_PER_NODE=16
MAX_NODES=64
MIN_NODES=0
NUM_NODES=1
DISK_SIZE="100GB"
DISK_TYPE="pd-ssd"
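Before running the create script, it can help to sanity-check that a conf file defines every variable the script expects. The sketch below checks a demo file for a subset of the parameters described above (the demo file and the variable subset are chosen for illustration):

```shell
# Sketch: verify that a node pool conf file sets the expected variables.
# A demo conf is written first so the example is self-contained.
cat > /tmp/nodepool_check.conf <<'EOF'
DOMAIN="csi-dynamic"
KIND="dynamic"
MACHINE_TYPES="n1-standard-4"
DISK_SIZE="100GB"
DISK_TYPE="pd-ssd"
EOF

missing=0
for var in DOMAIN KIND MACHINE_TYPES DISK_SIZE DISK_TYPE; do
  if ! grep -q "^${var}=" /tmp/nodepool_check.conf; then
    echo "missing: ${var}"
    missing=1
  fi
done
[ "$missing" -eq 0 ] && echo "conf looks complete"
```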
Create the Node Pools
After defining the requirements for the node pools, run the create_nodepools.sh script in the gcloud directory to create each type of node pool. Run the script once for each type of pool, using the following command. The arguments are described below.
./create_nodepools.sh -c <config_file_name> [ -d <config_file_directory> ] [ -f | --force ] [ -h | --help ]
-c <config_file_name>
This is a required argument that specifies the name of the configuration file (i.e., nodepool_operator.conf, nodepool_anzograph.conf, or nodepool_dynamic.conf) that supplies the node pool requirements. For example, -c nodepool_dynamic.conf.
-d <config_file_directory>
This is an optional argument that specifies the path and directory name for the configuration file specified for the -c argument. If you are using the original gcloud
directory file structure and the configuration file is in the conf.d
directory, you do not need to specify the -d argument. If you created a separate directory structure for different Anzo environments, include the -d option. For example, -d /gcloud/env1/conf.
-f | --force
This is an optional argument that controls whether the script prompts for confirmation before proceeding with each stage involved in creating the node pool. If -f (--force) is specified, the script assumes the answer is "yes" to all prompts and does not display them.
-h | --help
This argument is an optional flag that you can specify to display the help from the create_nodepools.sh script.
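The argument handling described above can be sketched with standard getopts. This is an illustration of the interface only, assuming short-option behavior; the real create_nodepools.sh implementation (including its long --force/--help forms) may differ:

```shell
# Illustrative sketch of the -c / -d / -f argument handling;
# not the actual create_nodepools.sh implementation.
parse_args() {
  CONF_FILE="" CONF_DIR="conf.d" FORCE=0
  while getopts "c:d:fh" opt; do
    case "$opt" in
      c) CONF_FILE="$OPTARG" ;;
      d) CONF_DIR="$OPTARG" ;;
      f) FORCE=1 ;;
      h) echo "usage: $0 -c <conf> [-d <dir>] [-f]"; return 1 ;;
    esac
  done
  [ -n "$CONF_FILE" ] || { echo "error: -c is required"; return 1; }
}

parse_args -c nodepool_dynamic.conf -d /gcloud/env1/conf -f
echo "conf=${CONF_DIR}/${CONF_FILE} force=${FORCE}"
# prints: conf=/gcloud/env1/conf/nodepool_dynamic.conf force=1
```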
For example, the following command runs the create_nodepools script, using nodepool_operator.conf as input to the script. Since nodepool_operator.conf is in the conf.d directory, the -d argument is excluded:
./create_nodepools.sh -c nodepool_operator.conf
The script validates that the required software packages are installed and that the versions are compatible with the deployment. It also displays an overview of the deployment details based on the values in the specified configuration file. For example:
Operating System : CentOS Linux
Google Cloud SDK: 322.0.0
  alpha: 2021.01.05
  beta: 2021.01.05
  bq: 2.0.64
  core: 2021.01.05
  gsutil: 4.57
kubectl cli version: Client Version: v1.19.12 valid
Deployment details:
  Project     : cloud-project-1592
  Region      : us-central1
  GKE Cluster : csi-k8s-cluster
The script then prompts you to proceed with deploying each component of the node pool. Type y and press Enter to proceed with the configuration.
When creating the AnzoGraph and Dynamic node pools, there is a stage when the script prompts, "Do you want to tune the nodepools?" It is important to answer y (yes) so that the kernel tuning and security policies from the related nodepool_*_tuner.yaml file are applied to the node pool configuration.
Once the Operator, AnzoGraph, and Dynamic node pools are created, the next step is to create a Cloud Location in Anzo so that Anzo can connect to the GKE cluster and deploy applications. See Connecting to a Cloud Location.