Creating the Required Node Groups

This topic provides instructions for creating the four types of required node groups:

  • The Common node group for running K8s services such as the Cluster Autoscaler and Load Balancers.
  • The Operator node group for running the AnzoGraph, Anzo Agent with Anzo Unstructured (AU), Elasticsearch, and Spark operator pods.
  • The AnzoGraph node group for running AnzoGraph application pods.
  • The Dynamic node group for running Anzo Agent with AU, Elasticsearch, and Spark application pods.

For more information about the node groups, see Node Pool Requirements.

Define the Node Group Requirements

Before creating the node groups, configure the infrastructure requirements for each type of group. The nodepool_*.yaml object files in the eksctl/conf.d directory are sample configuration files that you can use as templates, or you can edit the files directly:

  • nodepool_common.yaml defines the requirements for the Common node group.
  • nodepool_operator.yaml defines the requirements for the Operator node group.
  • nodepool_anzograph.yaml defines the requirements for the AnzoGraph node group.
  • nodepool_dynamic.yaml defines the requirements for the Dynamic node group.

Each type of node group configuration file contains the following parameters. Descriptions of the parameters and guidance on specifying the appropriate values for each type of node group are provided below.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: <eks-cluster-name>
  region: <cluster-region>
  tags: 
    <metadata-tag-name>: "<value>"
nodeGroups:
  - name: <node-prefix>
    amiFamily: <ami-type>
    labels: 
      <label-name>: '<value>' 
    instanceType: <instance-type>
    desiredCapacity: <desired-capacity>
    availabilityZones:
    - <zones>
    minSize: <min-size>
    maxSize: <max-size>
    volumeSize: <volume-size>
    maxPodsPerNode: <max-pods>
    iam:
      attachPolicyARNs:
      - <arns>
      withAddonPolicies:
        autoScaler: <auto-scaler>
        imageBuilder: <image-builder>
        efs: <efs>
        cloudWatch: <cloud-watch>
    volumeType: <volume-type>
    privateNetworking: <private-networking>
    securityGroups:
      withShared: <shared-security-group>
      withLocal: <local-security-group>
    ssh:
      allow: <allow-ssh>
      publicKeyName: <public-key-name>
    taints:
      '<taint-name>': '<taint-value>'
    tags:
      '<tag-name>': '<tag-value>'
    asgMetricsCollection:
      - granularity: <granularity>
        metrics:
        - <metric-name>

apiVersion

The version of the schema for this object.

kind

The type of object defined by this file. For eksctl node group configuration files, the value is ClusterConfig.

name

The name of the EKS cluster that hosts the node group. For example, csi-k8s-cluster.

region

The region that the EKS cluster is deployed in. For example, us-east-1.

tags

A list of any custom tags to add to the AWS resources that are created by eksctl.

name

The prefix to add to the names of the nodes that are deployed in this node group.

Node Group Type    Sample nodeGroups name Value
Common             common
Operator           operator
AnzoGraph          anzograph
Dynamic            dynamic

amiFamily

The EKS-optimized Amazon Machine Image (AMI) type to use when deploying nodes in the node group. Cambridge Semantics recommends that you specify AmazonLinux2.

labels

A space-separated list of key/value pairs that define the type of pods that can be placed on the nodes in this node group. Labels are used to attract pods to nodes, while taints (described in taints below) are used to repel other types of pods from this node group. For example, the following labels specify that the purpose of the nodes in each group is to host operator, anzograph, dynamic, or common pods.

Node Group Type    Recommended nodeGroups labels Value
Common             cambridgesemantics.com/node-purpose: 'common'
                   deploy-ca: 'true'
                   cluster-autoscaler-version: '<version>'
Operator           cambridgesemantics.com/node-purpose: 'operator'
AnzoGraph          cambridgesemantics.com/node-purpose: 'anzograph'
Dynamic            cambridgesemantics.com/node-purpose: 'dynamic'

The additional Common node group label deploy-ca: 'true' identifies this group as the node group to host the Cluster Autoscaler (CA) service. The related cluster-autoscaler-version label identifies the CA version. The version that you specify must have the same major and minor version as the Kubernetes version for the EKS cluster (CLUSTER_VERSION). For example, if the cluster version is 1.19, the CA version must be 1.19.n, where n is a valid CA patch release number, such as 1.19.1. To view the CA releases for your Kubernetes version, see Cluster Autoscaler Releases on GitHub.
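
If you are not sure which Kubernetes version the cluster is running, one way to look it up is with the AWS CLI. This is a minimal sketch that assumes the csi-k8s-cluster name used in the examples in this topic:

# Print the Kubernetes version of the EKS cluster.
aws eks describe-cluster --name csi-k8s-cluster \
  --query "cluster.version" --output text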

instanceType

The EC2 instance type to use for the nodes in the node group.

Node Group Type    Sample instanceType Value
Common             m5.large
Operator           m5.large
AnzoGraph          m5.8xlarge
Dynamic            m5.2xlarge

For more guidance on determining the instance types to use for nodes in the required node groups, see Compute Resource Planning.

desiredCapacity

The number of nodes to deploy when this node group is created. This value must be set to at least 1; when the node group is created, at least one node must be deployed with it. However, if minSize is 0 and the autoScaler addon is enabled, the Cluster Autoscaler deprovisions that initial node when it is not in use.
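
After the node group is created, one way to confirm that the expected nodes registered with the cluster is to filter the node list by the node-purpose label. This is a sketch; adjust the label value to match the group you created:

# List the nodes that belong to the Common node group.
kubectl get nodes -l cambridgesemantics.com/node-purpose=common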

availabilityZones

A list of the Availability Zones to make this node group available to.

minSize

The minimum number of nodes for the node group. If you set the minimum size to 0, nodes will not be provisioned unless a pod is scheduled for deployment in that group.

maxSize

The maximum number of nodes that can be deployed in the node group.

volumeSize

The size (in GB) of the EBS volume to add to the nodes in this node group.

maxPodsPerNode

The maximum number of pods that can be hosted on a node in this node group. In addition to Anzo application pods, this limit also needs to account for K8s service pods and helper pods. Cambridge Semantics recommends that you set this value to at least 16 for all node group types.
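
To see how close a node is to its pod limit, you can count the pods that are currently scheduled on it. This is a sketch; replace <node-name> with an actual node name from kubectl get nodes:

# Count all pods (in all namespaces) scheduled on a given node.
kubectl get pods --all-namespaces --field-selector spec.nodeName=<node-name> --no-headers | wc -l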

attachPolicyARNs

A list of the Amazon Resource Names (ARN) for the IAM policies to attach to the node group. These policies apply at the node level. Include the default node policies as well as any other policies that you want to add. For example:

attachPolicyARNs:
- arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
- arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
- arn:aws:iam::aws:policy/AmazonS3FullAccess

autoScaler

Indicates whether to add an autoscaler to this node group. Cambridge Semantics recommends that you set this value to true.

imageBuilder

Indicates whether to allow this node group to access the full Elastic Container Registry (ECR). Cambridge Semantics recommends that you set this value to true.

efs

Indicates whether to enable access to Elastic File System (EFS) persistent volumes.

cloudWatch

Indicates whether to enable the CloudWatch add-on policy, which gives the nodes in this node group permission to write logs and metrics to CloudWatch. Note that the eksctl schema expects the key cloudWatch with a lowercase "c".

volumeType

The type of EBS volume to use for the nodes in this node group.

privateNetworking

Indicates whether to isolate the node group from the public internet. Cambridge Semantics recommends that you set this value to true.

withShared

Indicates whether to create a shared security group for this node group to allow communication with the other node groups. Setting this value to true ensures that there is cluster-wide connectivity between all nodes in all node groups.

withLocal

Indicates whether to create a local security group for this node group. This security group controls access to the EKS cluster API. In addition, if SSH is allowed, port 22 will be opened in this security group. Cambridge Semantics recommends that you set this value to true.

allow

Indicates whether to allow SSH access to the nodes in this node group.

publicKeyName

The public key name in EC2 to add to the nodes in this node group. If allow is false, this value is ignored.
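
If you plan to allow SSH, you may want to confirm that the key pair exists in EC2 in the cluster's region before creating the node group. This is a sketch that uses the common-keypair name and us-east-1 region from the example files below:

# Verify that the key pair exists in the cluster's region.
aws ec2 describe-key-pairs --key-names common-keypair --region us-east-1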

taints

This parameter defines the type of pods that are allowed to be placed in this node group. When a pod is scheduled for deployment, the scheduler relies on this value to determine whether the pod belongs in the group: if the pod does not have a toleration that is compatible with the taint, it is rejected. The recommended values below specify that only operator pods can be deployed in the Operator node group, only anzograph pods in the AnzoGraph node group, and only dynamic pods in the Dynamic node group. The NoSchedule effect means a toleration is required; pods without a matching toleration are not allowed in the group. A sketch of a pod spec with a matching toleration follows the table.

Node Group Type    Recommended taints Value
Operator           'cambridgesemantics.com/dedicated': 'operator:NoSchedule'
AnzoGraph          'cambridgesemantics.com/dedicated': 'anzograph:NoSchedule'
Dynamic            'cambridgesemantics.com/dedicated': 'dynamic:NoSchedule'
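
To illustrate how the taint and the node-purpose label work together, the following hypothetical pod spec fragment shows the toleration and node selector that a pod would need in order to be scheduled on an AnzoGraph node. This is a sketch only; the pod name and image are placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: anzograph-example          # hypothetical pod name
spec:
  # Attract the pod to AnzoGraph nodes via the node-purpose label.
  nodeSelector:
    cambridgesemantics.com/node-purpose: anzograph
  # Tolerate the taint so the pod is not repelled from the group.
  tolerations:
  - key: cambridgesemantics.com/dedicated
    operator: Equal
    value: anzograph
    effect: NoSchedule
  containers:
  - name: app
    image: registry.example.com/app:latest    # placeholder image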

tags

The list of key:value pairs to add to the nodes in this node group. For autoscaling to work, the list of tags must include the namespaced version of the label and taint definitions.

Node Group    Recommended tags Value
Common        'k8s.io/cluster-autoscaler/node-template/label/cambridgesemantics.com/node-purpose': 'common'
Operator      'k8s.io/cluster-autoscaler/node-template/label/cambridgesemantics.com/node-purpose': 'operator'
              'k8s.io/cluster-autoscaler/node-template/taint/cambridgesemantics.com/dedicated': 'operator:NoSchedule'
              'cambridgesemantics.com/node-purpose': 'operator'
AnzoGraph     'k8s.io/cluster-autoscaler/node-template/label/cambridgesemantics.com/node-purpose': 'anzograph'
              'k8s.io/cluster-autoscaler/node-template/taint/cambridgesemantics.com/dedicated': 'anzograph:NoSchedule'
              'cambridgesemantics.com/node-purpose': 'anzograph'
Dynamic       'k8s.io/cluster-autoscaler/node-template/label/cambridgesemantics.com/node-purpose': 'dynamic'
              'k8s.io/cluster-autoscaler/node-template/taint/cambridgesemantics.com/dedicated': 'dynamic:NoSchedule'
              'cambridgesemantics.com/node-purpose': 'dynamic'

You can also augment the required tags with any custom tags that you want to include. For information about tagging, see Tagging your Amazon EKS Resources in the Amazon EKS documentation.
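
One way to confirm that the namespaced tags were applied is to query the Auto Scaling Group that eksctl created for the node group. This is a sketch; replace <asg-name> with the actual ASG name:

# List the tags on the node group's Auto Scaling Group.
aws autoscaling describe-tags --filters "Name=auto-scaling-group,Values=<asg-name>"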

asgMetricsCollection

If cloudWatch is enabled, this parameter configures the specific Auto Scaling Group (ASG) metrics to capture as well as the frequency with which to capture the metrics.

granularity

A required property that specifies the frequency with which Amazon EC2 Auto Scaling sends aggregated data to CloudWatch. The only valid value is 1Minute.

metrics

This property lists the specific group-level metrics to collect. If granularity is specified but metrics is omitted, all of the metrics are enabled. For more information and a list of valid values, see AutoScalingGroup MetricsCollection in the AWS CloudFormation documentation.
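
After the node group is created, you can check which group-level metrics are enabled on the Auto Scaling Group. This is a sketch; replace <asg-name> with the actual ASG name:

# Show the metrics that are enabled for the Auto Scaling Group.
aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names <asg-name> \
  --query "AutoScalingGroups[].EnabledMetrics"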

Example Configuration Files

Example completed configuration files for each type of node group are shown below.

Common Node Group

The example below shows a completed nodepool_common.yaml file.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: csi-k8s-cluster
  region: us-east-1
  tags:
    description: "K8s cluster Common node group"
nodeGroups:
  - name: common
    amiFamily: AmazonLinux2
    labels: 
      cambridgesemantics.com/node-purpose: 'common'
      deploy-ca: 'true'
      cluster-autoscaler-version: '1.19.1'
    instanceType: m5.large
    desiredCapacity: 1
    availabilityZones:
    - us-east-1a
    minSize: 0
    maxSize: 4
    volumeSize: 50
    maxPodsPerNode: 16
    iam:
      attachPolicyARNs:
      - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
      - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
      - arn:aws:iam::aws:policy/AmazonS3FullAccess
      withAddonPolicies:
        autoScaler: true
        imageBuilder: true
        efs: true
        cloudWatch: true
    volumeType: gp2
    privateNetworking: true
    securityGroups:
      withShared: true
      withLocal: true
    ssh:
      allow: true
      publicKeyName: common-keypair
    tags:
      'k8s.io/cluster-autoscaler/node-template/label/cambridgesemantics.com/node-purpose': 'common'
    asgMetricsCollection:
      - granularity: 1Minute
        metrics:
        - GroupPendingInstances
        - GroupInServiceInstances
        - GroupTerminatingInstances
        - GroupInServiceCapacity

Operator Node Group

The example below shows a completed nodepool_operator.yaml file.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: csi-k8s-cluster
  region: us-east-1
  tags:
    description: "K8s cluster Operator node group"
nodeGroups:
  - name: operator
    amiFamily: AmazonLinux2
    labels: 
      cambridgesemantics.com/node-purpose: 'operator'
    instanceType: m5.large
    desiredCapacity: 1
    availabilityZones:
    - us-east-1a
    minSize: 0
    maxSize: 5
    volumeSize: 50
    maxPodsPerNode: 16
    iam:
      attachPolicyARNs:
      - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
      - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
      - arn:aws:iam::aws:policy/AmazonS3FullAccess
      withAddonPolicies:
        autoScaler: true
        imageBuilder: true
        efs: true
        cloudWatch: true
    volumeType: gp2
    privateNetworking: true
    securityGroups:
      withShared: true
      withLocal: true
    ssh:
      allow: true
      publicKeyName: operator-keypair
    taints:
      'cambridgesemantics.com/dedicated': 'operator:NoSchedule'
    tags:
      'k8s.io/cluster-autoscaler/node-template/label/cambridgesemantics.com/node-purpose': 'operator'
      'k8s.io/cluster-autoscaler/node-template/taint/cambridgesemantics.com/dedicated': 'operator:NoSchedule'
      'cambridgesemantics.com/node-purpose': 'operator'
    asgMetricsCollection:
      - granularity: 1Minute
        metrics:
        - GroupPendingInstances
        - GroupInServiceInstances
        - GroupTerminatingInstances
        - GroupInServiceCapacity

AnzoGraph Node Group

The example below shows a completed nodepool_anzograph.yaml file.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: csi-k8s-cluster
  region: us-east-1
  tags:
    description: "K8s cluster AnzoGraph node group"
nodeGroups:
  - name: anzograph
    amiFamily: AmazonLinux2
    labels: 
      cambridgesemantics.com/node-purpose: 'anzograph'
    instanceType: m5.8xlarge
    desiredCapacity: 1
    availabilityZones:
    - us-east-1a
    minSize: 0
    maxSize: 12
    volumeSize: 100
    maxPodsPerNode: 16
    iam:
      attachPolicyARNs:
      - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
      - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
      - arn:aws:iam::aws:policy/AmazonS3FullAccess
      withAddonPolicies:
        autoScaler: true
        imageBuilder: true
        efs: true
        cloudWatch: true
    volumeType: gp2
    privateNetworking: true
    securityGroups:
      withShared: true
      withLocal: true
    ssh:
      allow: true
      publicKeyName: anzograph-keypair
    taints:
      'cambridgesemantics.com/dedicated': 'anzograph:NoSchedule'
    tags:
      'k8s.io/cluster-autoscaler/node-template/label/cambridgesemantics.com/node-purpose': 'anzograph'
      'k8s.io/cluster-autoscaler/node-template/taint/cambridgesemantics.com/dedicated': 'anzograph:NoSchedule'
      'cambridgesemantics.com/node-purpose': 'anzograph'
    asgMetricsCollection:
      - granularity: 1Minute
        metrics:
        - GroupPendingInstances
        - GroupInServiceInstances
        - GroupTerminatingInstances
        - GroupInServiceCapacity

Dynamic Node Group

The example below shows a completed nodepool_dynamic.yaml file.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: csi-k8s-cluster
  region: us-east-1
  tags:
    description: "K8s cluster Dynamic node group"
nodeGroups:
  - name: dynamic
    amiFamily: AmazonLinux2
    labels: 
      cambridgesemantics.com/node-purpose: 'dynamic'
    instanceType: m5.2xlarge
    desiredCapacity: 1
    availabilityZones:
    - us-east-1a
    minSize: 0
    maxSize: 12
    volumeSize: 100
    maxPodsPerNode: 16
    iam:
      attachPolicyARNs:
      - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
      - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
      - arn:aws:iam::aws:policy/AmazonS3FullAccess
      withAddonPolicies:
        autoScaler: true
        imageBuilder: true
        efs: true
        cloudWatch: true
    volumeType: gp2
    privateNetworking: true
    securityGroups:
      withShared: true
      withLocal: true
    ssh:
      allow: true
      publicKeyName: dynamic-keypair
    taints:
      'cambridgesemantics.com/dedicated': 'dynamic:NoSchedule'
    tags:
      'k8s.io/cluster-autoscaler/node-template/label/cambridgesemantics.com/node-purpose': 'dynamic'
      'k8s.io/cluster-autoscaler/node-template/taint/cambridgesemantics.com/dedicated': 'dynamic:NoSchedule'
      'cambridgesemantics.com/node-purpose': 'dynamic'
    asgMetricsCollection:
      - granularity: 1Minute
        metrics:
        - GroupPendingInstances
        - GroupInServiceInstances
        - GroupTerminatingInstances
        - GroupInServiceCapacity

Create the Node Groups

After defining the requirements for the node groups, run the create_nodepools.sh script in the eksctl directory to create each type of node group. Run the script once for each type of group.

The create_nodepools.sh script references the files in the eksctl/reference directory. If you customized the directory structure on the workstation, ensure that the reference directory is available at the same level as create_nodepools.sh before creating the node groups.

Run the script with the following command. The arguments are described below.

./create_nodepools.sh -c <config_file_name> [ -d <config_file_directory> ] [ -f | --force ] [ -h | --help ]

It is important to create the Common node group first. The Cluster Autoscaler and other core cluster services are dependent on the Common node group.

-c <config_file_name>

This is a required argument that specifies the name of the configuration file (i.e., nodepool_common.yaml, nodepool_operator.yaml, nodepool_anzograph.yaml, or nodepool_dynamic.yaml) that supplies the node group requirements. For example, -c nodepool_dynamic.yaml.

-d <config_file_directory>

This is an optional argument that specifies the path and directory name for the configuration file specified for the -c argument. If you are using the original eksctl directory file structure and the configuration file is in the conf.d directory, you do not need to specify the -d argument. If you created a separate directory structure for different Anzo environments, include the -d option. For example, -d /eksctl/env1/conf.

-f | --force

This is an optional argument that controls whether the script prompts for confirmation before proceeding with each stage involved in creating the node group. If -f (--force) is specified, the script assumes the answer is "yes" to all prompts and does not display them.

-h | --help

This is an optional flag that displays the help text for the create_nodepools.sh script.

For example, the following command runs the create_nodepools script, using nodepool_common.yaml as input to the script. Since nodepool_common.yaml is in the conf.d directory, the -d argument is excluded:

./create_nodepools.sh -c nodepool_common.yaml

The script validates that the required software packages, such as aws-cli, eksctl, and kubectl, are installed and that the versions are compatible with the script. It also displays an overview of the deployment details based on the values in the specified configuration file.

The script then prompts you to proceed with deploying each component of the node group. Type y and press Enter to proceed with the configuration.
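
When the script completes, one way to confirm that the node group was created is to list the cluster's node groups with eksctl. This is a sketch that assumes the csi-k8s-cluster name from the examples above:

# List the node groups in the cluster.
eksctl get nodegroup --cluster csi-k8s-cluster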

Once the Common, Operator, AnzoGraph, and Dynamic node groups are created, the next step is to create a Cloud Location in Anzo so that Anzo can connect to the EKS cluster and deploy applications. See Connecting to a Cloud Location.
