Creating the GKE Cluster

Follow the instructions below to define the GKE cluster resource requirements and then create the cluster based on your specifications.

Define the GKE Cluster Requirements

The first step in creating the K8s cluster is to define the infrastructure specifications. The configuration file to use for defining the specifications is called k8s_cluster.conf. Multiple sample k8s_cluster.conf files are included in the gcloud directory. Any of them can be copied and used as templates, or the files can be edited directly.

Sample k8s_cluster.conf Files

To help guide you in choosing the appropriate template for your use case, this section describes each of the sample files. Details about the parameters in the sample files are included in Cluster Parameters below.

gcloud/conf.d/k8s_cluster.conf

This file does not target a specific use case. It includes sample values for all of the available cluster parameters.

gcloud/sample_use_cases/1_usePrivateEndpoint_private_cluster/k8s_cluster.conf

This file includes sample values for a use case where:

  • The GKE cluster will be deployed in a new private subnet in an existing network. You specify the existing network name in the GCLOUD_NETWORK parameter.
  • A NAT gateway is deployed with a private endpoint (GKE_ENABLE_PRIVATE_ENDPOINT=true, GKE_PRIVATE_ACCESS=true). There is no client access to the public endpoint.
  • Secondary IP ranges are added to the NAT mapping along with the primary IP when NETWORK_NAT_ALLOW_SUBNET_SECONDARY_IPS=true. Outbound connectivity is allowed through the NAT gateway but restricted to the IP ranges specified in the GKE_MASTER_ACCESS_CIDRS parameter.

gcloud/sample_use_cases/2_public_cluster/k8s_cluster.conf

This file includes sample values for a use case where:

  • A new network with public and private subnetworks will be created and the GKE cluster will be deployed into it.
  • The cluster is public (GKE_PRIVATE_ACCESS=false).

gcloud/sample_use_cases/3_useAuthorizedNetworks/k8s_cluster.conf

This file includes sample values for a use case where:

  • The GKE cluster will be deployed in a new or existing network with public and private subnets.
  • The GKE_MASTER_ACCESS_CIDRS parameter is used to limit the access to the public endpoint.

gcloud/sample_use_cases/4_providePublicEndpointAccess/k8s_cluster.conf

This file includes sample values for a use case where:

  • The GKE cluster will be deployed as a private cluster with public endpoint access enabled (GKE_ENABLE_PRIVATE_ENDPOINT=false).
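
The sample use cases above differ mainly in how GKE_PRIVATE_ACCESS and GKE_ENABLE_PRIVATE_ENDPOINT are combined. The following is a minimal sketch that summarizes the combinations. The describe_access helper is hypothetical and is not part of the gcloud directory; the wording of each summary is an interpretation of the descriptions above.

```shell
# Hypothetical helper: summarize the access model implied by two flags.
# $1 = GKE_PRIVATE_ACCESS, $2 = GKE_ENABLE_PRIVATE_ENDPOINT
describe_access() {
  case "$1:$2" in
    true:true)  echo "private nodes, private endpoint only" ;;
    true:false) echo "private nodes, public endpoint enabled" ;;
    false:*)    echo "public cluster" ;;
    *)          echo "unrecognized combination" ;;
  esac
}

describe_access true true     # flags named in sample use case 1
describe_access false false   # use case 2 (second flag value is an assumption)
describe_access true false    # flags implied by sample use case 4
```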

Cluster Parameters

The contents of k8s_cluster.conf are shown below. Descriptions of the cluster parameters follow the contents.

NETWORK_BGP_ROUTING="<bgp-routing-mode>"
NETWORK_SUBNET_MODE="<subnet-mode>"
NETWORK_ROUTER_NAME="<router>"
NETWORK_ROUTER_MODE="<advertisement-mode>"
NETWORK_ROUTER_ASN=<asn>
NETWORK_ROUTER_DESC="<description>"
NETWORK_NAT_NAME="<nat-name>"
NETWORK_NAT_UDP_IDLE_TIMEOUT="<udp-idle-timeout>"
NETWORK_NAT_ICMP_IDLE_TIMEOUT="<icmp-idle-timeout>"
NETWORK_NAT_TCP_ESTABLISHED_IDLE_TIMEOUT="<tcp-established-idle-timeout>"
NETWORK_NAT_TCP_TRANSITORY_IDLE_TIMEOUT="<tcp-transitory-idle-timeout>"
NETWORK_NAT_ALLOW_SUBNET_SECONDARY_IPS=<allow-subnet-secondary-ips>
K8S_CLUSTER_NAME=${K8S_CLUSTER_NAME:-"<cluster-name>"}
K8S_CLUSTER_PODS_PER_NODE="<default-max-pods-per-node>"
K8S_CLUSTER_ADDONS="<addons>"
GKE_MASTER_VERSION="<cluster-version>"
GKE_PRIVATE_ACCESS=<enable-private-nodes>
GKE_MASTER_NODE_COUNT_PER_LOCATION=<num-nodes>
GKE_NODE_VERSION="<node-version>"
GKE_IMAGE_TYPE="<image-type>"
GKE_MAINTENANCE_WINDOW='<maintenance-window>'
GKE_ENABLE_PRIVATE_ENDPOINT=<enable-private-endpoint>
GKE_MASTER_ACCESS_CIDRS="<master-authorized-networks>"
K8S_PRIVATE_CIDR="<cluster-ipv4-cidr>"
K8S_SERVICES_CIDR="<services-ipv4-cidr>"
GCLOUD_NODES_CIDR="<create-subnetwork>"
K8S_API_CIDR="<master-ipv4-cidr>"
K8S_HOST_DISK_SIZE='<disk-size>'
K8S_HOST_DISK_TYPE="<disk-type>"
K8S_HOST_MIN_CPU_PLATFORM="<min-cpu-platform>"
K8S_POOL_HOSTS_MAX=<max-nodes-per-pool>
K8S_METADATA="<metadata>"
K8S_MIN_NODES=<min-nodes>
K8S_MAX_NODES=<max-nodes>
GCLOUD_RESOURCE_LABELS='<labels>'
GCLOUD_VM_LABELS=<node-labels>
GCLOUD_VM_TAGS="<tags>"
GCLOUD_VM_MACHINE_TYPE="<machine-type>"
GCLOUD_VM_SSD_COUNT=<local-ssd-count>
GCLOUD_PROJECT_ID=${GCLOUD_PROJECT_ID:-"<project>"}
GCLOUD_NETWORK=${GCLOUD_NETWORK:-"<network>"}
GCLOUD_NODES_SUBNET_SUFFIX="<suffix>"
GCLOUD_CLUSTER_REGION=${GCLOUD_CLUSTER_REGION:-"<region>"}
GCLOUD_NODE_LOCATIONS="<node-locations>"
GCLOUD_NODE_TAINTS='<node-taints>'
GCLOUD_NODE_SCOPE='<scopes>'
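
Several entries in the listing use the shell form ${NAME:-"default"}. Assuming the file is read by a POSIX-compatible shell, this form keeps a value that is already set in the environment and falls back to the quoted default otherwise. The sketch below illustrates the behavior; the override value env1-k8s-cluster is hypothetical.

```shell
# Case 1: the variable is not set, so the default is used.
unset K8S_CLUSTER_NAME
K8S_CLUSTER_NAME=${K8S_CLUSTER_NAME:-"csi-k8s-cluster"}
default_name=$K8S_CLUSTER_NAME
echo "$default_name"

# Case 2: the variable is already set, so the existing value is kept.
K8S_CLUSTER_NAME="env1-k8s-cluster"   # hypothetical override
K8S_CLUSTER_NAME=${K8S_CLUSTER_NAME:-"csi-k8s-cluster"}
echo "$K8S_CLUSTER_NAME"
```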

NETWORK_BGP_ROUTING

The mode the Cloud Router will use to advertise BGP routes when the network is created, i.e., whether dynamic routing is global or regional. This parameter maps to the gcloud VPC network --bgp-routing-mode option. The default value is regional.

NETWORK_SUBNET_MODE

The method to use when subnets are created. Valid values are auto and custom. This parameter maps to the gcloud VPC --subnet-mode option. The recommended value is custom.

NETWORK_ROUTER_NAME

The name to assign to the Cloud Router. For example, csi-cloudrouter.

NETWORK_ROUTER_MODE

The route advertisement mode for the Cloud Router. This parameter maps to the gcloud Cloud Router --advertisement-mode option. The recommended value is custom.

NETWORK_ROUTER_ASN

The Border Gateway Protocol (BGP) autonomous system number (ASN). When a router is created, it is assigned an ASN. This parameter maps to the gcloud Cloud Router --asn option. Coordinate with your network administrator to determine the number to specify.

NETWORK_ROUTER_DESC

A description of the Cloud Router. This parameter maps to the gcloud Cloud Router --description option. For example, Cloud router for K8S NAT.

NETWORK_NAT_NAME

The name to assign to the NAT gateway. For example, csi-natgw.

NETWORK_NAT_UDP_IDLE_TIMEOUT

The timeout value for UDP connections to the NAT gateway. This parameter maps to the gcloud NAT router --udp-idle-timeout option. The default value in k8s_cluster.conf is 60s (60 seconds). For information about duration formats, refer to gcloud topic datetimes in the Cloud SDK documentation.

NETWORK_NAT_ICMP_IDLE_TIMEOUT

The timeout value for ICMP connections to the NAT gateway. This parameter maps to the gcloud NAT router --icmp-idle-timeout option. The default value in k8s_cluster.conf is 60s (60 seconds).

NETWORK_NAT_TCP_ESTABLISHED_IDLE_TIMEOUT

The timeout value for TCP established connections to the NAT gateway. This parameter maps to the gcloud NAT router --tcp-established-idle-timeout option. The default value in k8s_cluster.conf is 60s (60 seconds).

NETWORK_NAT_TCP_TRANSITORY_IDLE_TIMEOUT

The timeout value to use for TCP transitory connections to the NAT gateway. This parameter maps to the gcloud NAT router --tcp-transitory-idle-timeout option. The default value in k8s_cluster.conf is 60s (60 seconds).
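
As an illustration of the mapping, the sketch below assembles a gcloud compute routers nats create command from the four timeout parameters. The command is only built and printed, not run, and it is deliberately partial: a real invocation needs further options, such as which subnet ranges and external IP addresses to use for NAT. This is an assumption about how the values could be passed, not the actual logic of the create_k8s.sh script.

```shell
# Sample values taken from this page.
NETWORK_ROUTER_NAME="csi-cloudrouter"
NETWORK_NAT_NAME="csi-natgw"
GCLOUD_CLUSTER_REGION="us-central1"
NETWORK_NAT_UDP_IDLE_TIMEOUT="60s"
NETWORK_NAT_ICMP_IDLE_TIMEOUT="60s"
NETWORK_NAT_TCP_ESTABLISHED_IDLE_TIMEOUT="60s"
NETWORK_NAT_TCP_TRANSITORY_IDLE_TIMEOUT="60s"

# Assemble (but do not execute) the partial command.
nat_cmd="gcloud compute routers nats create ${NETWORK_NAT_NAME}"
nat_cmd="${nat_cmd} --router=${NETWORK_ROUTER_NAME}"
nat_cmd="${nat_cmd} --region=${GCLOUD_CLUSTER_REGION}"
nat_cmd="${nat_cmd} --udp-idle-timeout=${NETWORK_NAT_UDP_IDLE_TIMEOUT}"
nat_cmd="${nat_cmd} --icmp-idle-timeout=${NETWORK_NAT_ICMP_IDLE_TIMEOUT}"
nat_cmd="${nat_cmd} --tcp-established-idle-timeout=${NETWORK_NAT_TCP_ESTABLISHED_IDLE_TIMEOUT}"
nat_cmd="${nat_cmd} --tcp-transitory-idle-timeout=${NETWORK_NAT_TCP_TRANSITORY_IDLE_TIMEOUT}"

echo "$nat_cmd"
```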

NETWORK_NAT_ALLOW_SUBNET_SECONDARY_IPS

Indicates whether to allow all secondary IP ranges for the GKE cluster to use the NAT gateway. If true, the secondary IP ranges for the subnets will have NAT gateway access.

K8S_CLUSTER_NAME

The name to give to the cluster. For example, csi-k8s-cluster.

K8S_CLUSTER_PODS_PER_NODE

The maximum number of pods that can be hosted on each compute instance. This parameter maps to the gcloud container cluster --default-max-pods-per-node option. This value also applies to the node pools in the cluster if the node pool configuration does not specify the maximum number of pods per node. Cambridge Semantics recommends that you set this value to 16.

K8S_CLUSTER_ADDONS

A comma-separated list of any additional Kubernetes cluster components to enable for the cluster. This parameter maps to the gcloud container cluster --addons option. By default, the k8s_cluster.conf file lists HttpLoadBalancing and HorizontalPodAutoscaling. Cambridge Semantics recommends that you include both of these components as a best practice.

GKE_MASTER_VERSION

The Kubernetes version to use for the GKE cluster. This parameter maps to the gcloud container cluster --cluster-version option.

GKE_PRIVATE_ACCESS

Indicates whether the cluster's nodes should have external IP addresses. When GKE_PRIVATE_ACCESS=true, the cluster remains private and nodes are not assigned external IP addresses. This parameter maps to the GKE --enable-private-nodes option.

GKE_MASTER_NODE_COUNT_PER_LOCATION

The number of nodes to create for running the K8s services in the default node pool in each of the cluster's zones. This value must be at least 1. For high availability, Cambridge Semantics recommends setting this value to 3. This parameter maps to the gcloud container cluster --num-nodes option.

GKE_NODE_VERSION

The Kubernetes version to use for nodes in the node pools. This parameter maps to the gcloud container cluster --node-version option.

Cambridge Semantics recommends that you specify the same version as the GKE_MASTER_VERSION.

GKE_IMAGE_TYPE

The base operating system that the nodes in the cluster will run on. This parameter maps to the gcloud container cluster --image-type option. This value must be COS.

GKE_MAINTENANCE_WINDOW

The time of day to start maintenance on this cluster. This parameter maps to the gcloud container cluster --maintenance-window option. The time corresponds to the UTC time zone and must be in HH:MM format. The default value in k8s_cluster.conf is 06:00 (6:00 am).
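
A minimal sketch of a check for the HH:MM format described above. The is_valid_window function is a hypothetical helper and is not part of the provided scripts.

```shell
# Hypothetical helper: succeed only for a 24-hour HH:MM value (00:00 to 23:59).
is_valid_window() {
  case "$1" in
    [01][0-9]:[0-5][0-9] | 2[0-3]:[0-5][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

GKE_MAINTENANCE_WINDOW='06:00'
if is_valid_window "$GKE_MAINTENANCE_WINDOW"; then
  echo "Maintenance window format is valid"
else
  echo "Maintenance window must be in HH:MM format"
fi
```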

GKE_ENABLE_PRIVATE_ENDPOINT

Indicates whether to use a private or public IP address for the master API endpoint. When GKE_ENABLE_PRIVATE_ENDPOINT=true, the IP address for the API endpoint is private. This parameter maps to the GKE --enable-private-endpoint option.

GKE_MASTER_ACCESS_CIDRS

The list of CIDR blocks (up to 50) that are allowed to connect to the GKE cluster over HTTPS. This value should include the Anzo subnet CIDR so that Anzo has access to the GKE cluster. This parameter maps to the gcloud container cluster --master-authorized-networks option. For example, 10.128.0.0/9.
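
Because the list is limited to 50 blocks, it can be useful to count the entries before running the script. The sketch below assumes the blocks are separated by commas; count_cidrs is a hypothetical helper, and the second CIDR block is an illustrative value only.

```shell
# Hypothetical helper: count the entries in a comma-separated list.
count_cidrs() {
  old_ifs=$IFS
  IFS=,
  set -- $1        # split the list on commas into positional parameters
  IFS=$old_ifs
  echo "$#"
}

GKE_MASTER_ACCESS_CIDRS="10.128.0.0/9,192.168.0.0/20"
total=$(count_cidrs "$GKE_MASTER_ACCESS_CIDRS")
if [ "$total" -le 50 ]; then
  echo "$total CIDR blocks listed (within the limit of 50)"
else
  echo "$total CIDR blocks listed (exceeds the limit of 50)"
fi
```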

K8S_PRIVATE_CIDR

The IP address range (in CIDR notation) for the pods in this cluster. This parameter maps to the gcloud container cluster --cluster-ipv4-cidr option. For example, 172.16.0.0/20.

K8S_SERVICES_CIDR

The IP address range for the cluster services. This parameter maps to the gcloud container cluster --services-ipv4-cidr option. For example, 172.17.0.0/20.

GCLOUD_NODES_CIDR

The CIDR for the new subnet that will be created for the K8s cluster. This parameter maps to the gcloud container cluster --create-subnetwork option. For example, 192.168.0.0/20.

K8S_API_CIDR

The IPv4 CIDR range to use for the master network. The range should have a subnet mask of /28. This parameter maps to the gcloud container cluster --master-ipv4-cidr option. For example, 192.171.0.0/28.
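
To make the CIDR sizes concrete: a /28 range contains 16 addresses, while the /20 ranges used in the other examples contain 4096. The sketch below extracts the prefix length and computes the size of an IPv4 range. Both helper functions are hypothetical.

```shell
# Hypothetical helpers for IPv4 CIDR notation (address/prefix).
cidr_prefix_len() {
  echo "${1##*/}"                      # text after the last "/"
}

cidr_size() {
  prefix=$(cidr_prefix_len "$1")
  echo $(( 1 << (32 - $prefix) ))      # number of addresses in the range
}

K8S_API_CIDR="192.171.0.0/28"
if [ "$(cidr_prefix_len "$K8S_API_CIDR")" = "28" ]; then
  echo "K8S_API_CIDR uses a /28 mask ($(cidr_size "$K8S_API_CIDR") addresses)"
else
  echo "K8S_API_CIDR should use a /28 mask"
fi
```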

K8S_HOST_DISK_SIZE

The size of the boot disks on the cluster compute instances. This parameter maps to the gcloud container cluster --disk-size option. For example, 50GB.

K8S_HOST_DISK_TYPE

The type of boot disk to use. This parameter maps to the gcloud container cluster --disk-type option. For example, pd-standard.

K8S_HOST_MIN_CPU_PLATFORM

The minimum CPU platform to use. This parameter maps to the gcloud container cluster --min-cpu-platform option. This value is left blank in the k8s_cluster.conf file.

K8S_POOL_HOSTS_MAX

The maximum number of nodes to allocate for the default initial node pool. This parameter maps to the gcloud container cluster --max-nodes-per-pool option. The default value is 1000, but it can be set as low as 100 for the initial creation.

K8S_METADATA

The compute engine metadata (in the format key=val,key=val) to make available to the guest operating system running on nodes in the node pools. This parameter maps to the gcloud container cluster --metadata option.

Including disable-legacy-endpoints=true is required to ensure that legacy metadata APIs are disabled. For more information about the option, see Protecting Cluster Metadata in the GKE documentation.
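
A minimal sketch of a check that the required entry is present in a key=val,key=val list. The has_metadata_entry function is a hypothetical helper, not part of the provided scripts.

```shell
# Hypothetical helper: succeed if the comma-separated list in $1
# contains the exact key=val entry given in $2.
has_metadata_entry() {
  case ",$1," in
    *",$2,"*) return 0 ;;
    *) return 1 ;;
  esac
}

K8S_METADATA="disable-legacy-endpoints=true"
if has_metadata_entry "$K8S_METADATA" "disable-legacy-endpoints=true"; then
  echo "Legacy metadata endpoints are disabled"
else
  echo "Add disable-legacy-endpoints=true to K8S_METADATA"
fi
```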

K8S_MIN_NODES

The minimum number of nodes in the default node pool. This parameter maps to the gcloud container cluster --min-nodes option. For example, 1.

K8S_MAX_NODES

The maximum number of nodes in the default node pool. This parameter maps to the gcloud container cluster --max-nodes option. For example, 3.
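
Since K8S_MIN_NODES and K8S_MAX_NODES define a range, the minimum should not exceed the maximum. The following sketch checks this before the script is run; nodes_range_ok is a hypothetical helper.

```shell
# Hypothetical helper: succeed if the minimum ($1) does not exceed the maximum ($2).
nodes_range_ok() {
  [ "$1" -le "$2" ]
}

K8S_MIN_NODES=1
K8S_MAX_NODES=3
if nodes_range_ok "$K8S_MIN_NODES" "$K8S_MAX_NODES"; then
  echo "Node range ${K8S_MIN_NODES}-${K8S_MAX_NODES} is consistent"
else
  echo "K8S_MIN_NODES must not be greater than K8S_MAX_NODES"
fi
```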

GCLOUD_RESOURCE_LABELS

A comma-separated list of any labels that you want to apply to the Google Cloud resources in use by the GKE cluster (unrelated to Kubernetes labels).

GCLOUD_VM_LABELS

A comma-separated list of any Kubernetes labels to apply to nodes in the default node pool. This parameter maps to the gcloud container cluster --node-labels option.

GCLOUD_VM_TAGS

A comma-separated list of strings to add to the instances in the cluster to classify the VMs. This parameter maps to the gcloud container cluster --tags option.

GCLOUD_VM_MACHINE_TYPE

The machine type to use for the GKE cluster nodes. This parameter maps to the gcloud container cluster --machine-type option. For example, n1-standard-1.

GCLOUD_VM_SSD_COUNT

The number of local SSD disks to add to each node. This parameter maps to the gcloud container cluster --local-ssd-count option. For example, specify 0 if you do not want to add SSDs to the nodes.

GCLOUD_PROJECT_ID

The Project ID for the GKE cluster. This parameter maps to the gcloud-wide --project option. For example, cloud-project-1592.

GCLOUD_NETWORK

The network to provision the GKE cluster in. This value should match the name of the network that Anzo is deployed in. This parameter maps to the gcloud container cluster --network option. For example, devel-network.

If you want gcloud to create a new network, you can leave this value blank. However, after deploying the GKE cluster, you must configure the new network so that it is routable from the Anzo network.

GCLOUD_NODES_SUBNET_SUFFIX

The suffix to add to the subnetworks. For example, nodes.

GCLOUD_CLUSTER_REGION

The compute region for the GKE cluster. This value should match the name of the region that Anzo is deployed in. This parameter maps to the gcloud container cluster --region option. For example, us-central1.

GCLOUD_NODE_LOCATIONS

A comma-separated list of any zones to replicate the nodes in. This parameter maps to the gcloud container cluster --node-locations option. For example, us-central1-f.

GCLOUD_NODE_TAINTS

A comma-separated list of the Kubernetes taints for the nodes in the default node pool. When a pod is scheduled for deployment, the scheduler relies on this information to find the node pool that the pod belongs in. A pod has a toleration that identifies whether it is compatible with a node taint. This parameter maps to the gcloud container cluster --node-taints option. For more information, see Controlling Scheduling with Node Taints in the GKE documentation.
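
Each taint takes the form key=value:effect, where the effect is one of the Kubernetes taint effects NoSchedule, PreferNoSchedule, or NoExecute. The sketch below checks a comma-separated list against that form. The taints_ok function is a hypothetical helper and assumes the values contain no shell wildcard characters.

```shell
# Hypothetical helper: succeed if every entry looks like key=value:effect.
taints_ok() {
  old_ifs=$IFS
  IFS=,
  for taint in $1; do          # unquoted on purpose: split on commas
    case "$taint" in
      ?*=*:NoSchedule | ?*=*:PreferNoSchedule | ?*=*:NoExecute) ;;
      *) IFS=$old_ifs; return 1 ;;
    esac
  done
  IFS=$old_ifs
  return 0
}

GCLOUD_NODE_TAINTS='key1=val1:NoSchedule,key2=val2:PreferNoSchedule'
if taints_ok "$GCLOUD_NODE_TAINTS"; then
  echo "Taint list is well formed"
else
  echo "Taint list contains an invalid entry"
fi
```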

GCLOUD_NODE_SCOPE

A comma-separated list of the access scopes the nodes should have. This parameter maps to the gcloud container cluster --scopes option. For example, gke-default.

Example Configuration File

An example completed k8s_cluster.conf file is shown below.

NETWORK_BGP_ROUTING="regional"
NETWORK_SUBNET_MODE="custom"
NETWORK_ROUTER_NAME="csi-cloudrouter"
NETWORK_ROUTER_MODE="custom"
NETWORK_ROUTER_ASN=64512
NETWORK_ROUTER_DESC="Cloud router for K8S NAT."
NETWORK_NAT_NAME="csi-natgw"
NETWORK_NAT_UDP_IDLE_TIMEOUT="60s"
NETWORK_NAT_ICMP_IDLE_TIMEOUT="60s"
NETWORK_NAT_TCP_ESTABLISHED_IDLE_TIMEOUT="60s"
NETWORK_NAT_TCP_TRANSITORY_IDLE_TIMEOUT="60s"
NETWORK_NAT_ALLOW_SUBNET_SECONDARY_IPS=false
K8S_CLUSTER_NAME=${K8S_CLUSTER_NAME:-"csi-k8s-cluster"}
K8S_CLUSTER_PODS_PER_NODE="16"
K8S_CLUSTER_ADDONS="HttpLoadBalancing,HorizontalPodAutoscaling"
GKE_MASTER_VERSION="1.19.9-gke.1900"
GKE_PRIVATE_ACCESS=true
GKE_MASTER_NODE_COUNT_PER_LOCATION=1
GKE_NODE_VERSION="1.19.9-gke.1900"
GKE_IMAGE_TYPE="COS"
GKE_MAINTENANCE_WINDOW='06:00'
GKE_ENABLE_PRIVATE_ENDPOINT=true
GKE_MASTER_ACCESS_CIDRS="10.128.0.0/9"
K8S_PRIVATE_CIDR="172.16.0.0/20"
K8S_SERVICES_CIDR="172.17.0.0/20"
GCLOUD_NODES_CIDR="192.168.0.0/20"
K8S_API_CIDR="192.171.0.0/28"
K8S_HOST_DISK_SIZE='50GB'
K8S_HOST_DISK_TYPE="pd-standard"
K8S_HOST_MIN_CPU_PLATFORM=""
K8S_POOL_HOSTS_MAX=1000
K8S_METADATA="disable-legacy-endpoints=true"
K8S_MIN_NODES=1
K8S_MAX_NODES=3
GCLOUD_RESOURCE_LABELS='deleteafter=false,owner=user'
GCLOUD_VM_LABELS=description=k8s_cluster
GCLOUD_VM_TAGS="cluster-vm"
GCLOUD_VM_MACHINE_TYPE="n1-standard-1"
GCLOUD_VM_SSD_COUNT=0
GCLOUD_PROJECT_ID=${GCLOUD_PROJECT_ID:-"cloud-project-1592"}
GCLOUD_NETWORK=${GCLOUD_NETWORK:-"devel-network"}
GCLOUD_NODES_SUBNET_SUFFIX="nodes"
GCLOUD_CLUSTER_REGION=${GCLOUD_CLUSTER_REGION:-"us-central1"}
GCLOUD_NODE_LOCATIONS="us-central1-f"
GCLOUD_NODE_TAINTS='key1=val1:NoSchedule,key2=val2:PreferNoSchedule'
GCLOUD_NODE_SCOPE='gke-default'

Create the GKE Cluster

After defining the cluster requirements, run the create_k8s.sh script in the gcloud directory to create the cluster. Use the following syntax. The arguments are described below.

./create_k8s.sh -c <config_file_name> [ -d <config_file_directory> ] [ -f | --force ] [ -h | --help ]

-c <config_file_name>

This is a required argument that specifies the name of the configuration file that supplies the cluster requirements. For example, -c k8s_cluster.conf.

-d <config_file_directory>

This is an optional argument that specifies the path and directory name for the configuration file specified for the -c argument. If you are using the original gcloud directory file structure and the configuration file is in the conf.d directory, you do not need to specify the -d argument. If you created a separate directory structure for different Anzo environments, include the -d option. For example, -d /gcloud/env1/conf.

-f | --force

This is an optional argument that controls whether the script prompts for confirmation before proceeding with each stage involved in creating the cluster. If -f (--force) is specified, the script assumes the answer is "yes" to all prompts and does not display them.

-h | --help

This is an optional argument that displays the help for the create_k8s.sh script.

For example, the following command runs the create_k8s.sh script, using k8s_cluster.conf as input to the script. Since k8s_cluster.conf is in the conf.d directory, the -d argument is excluded:

./create_k8s.sh -c k8s_cluster.conf
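
As a variation, the sketch below assembles the command for a configuration file kept in a separate environment directory, with confirmation prompts suppressed. The directory reuses the /gcloud/env1/conf example from the -d description. The command is printed rather than run so that it can be reviewed first.

```shell
CONF_NAME="k8s_cluster.conf"
CONF_DIR="/gcloud/env1/conf"     # example directory from the -d description

# Assemble the command; --force answers "yes" to every prompt.
create_cmd="./create_k8s.sh -c ${CONF_NAME} -d ${CONF_DIR} --force"
echo "$create_cmd"
```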

The script validates that the required software packages, such as the Google Cloud SDK and kubectl, are installed and that the versions are compatible with the deployment. It also displays an overview of the deployment details based on the values in the specified configuration file. For example:

Operating System   : CentOS Linux
- Google Cloud SDK: 322.0.0
  alpha: 2021.01.05
  beta: 2021.01.05
  bq: 2.0.64
  core: 2021.01.05
  gsutil: 4.57
  kubectl cli version: Client Version: v1.19.12
  valid

Deployment details:
    Project            : cloud-project-1592
    Region             : us-central1
    GKE Cluster        : cloud-k8s-cluster
    GKE Master version : 1.19.9-gke.1900
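
If you want to confirm that the command-line tools are on the PATH before running the script, a check along the following lines can be used. This is a minimal sketch with a hypothetical check_tool helper. It does not reproduce the script's own validation and does not check version compatibility.

```shell
# Hypothetical helper: succeed if the named command is available on the PATH.
check_tool() {
  command -v "$1" >/dev/null 2>&1
}

for tool in gcloud kubectl; do
  if check_tool "$tool"; then
    echo "$tool: found"
  else
    echo "$tool: not found"
  fi
done
```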

The script then prompts you to proceed with deploying each component of the GKE cluster infrastructure. Type y and press Enter to proceed with creating the specified network, cluster, cloud router, and NAT gateway components. All components are created according to the specifications in the configuration file.

When cluster creation is complete, proceed to Creating the Required Node Pools to add the required node pools to the cluster.
