Creating the GKE Cluster
Follow the instructions below to define the GKE cluster resource requirements and then create the cluster based on your specifications.
Define the GKE Cluster Requirements
The first step in creating the K8s cluster is to define the infrastructure specifications. The configuration file used to define the specifications is called k8s_cluster.conf. Several sample k8s_cluster.conf files are included in the gcloud directory. You can copy any of them to use as a template, or edit the files directly.
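For example, to start from one of the sample use-case files described below, you might copy it over the default configuration (paths follow the directory layout shown in this section; adjust them to your installation):

# Start from a sample use case by copying it over the default template.
cp gcloud/sample_use_cases/2_public_cluster/k8s_cluster.conf \
   gcloud/conf.d/k8s_cluster.conf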
Sample k8s_cluster.conf Files
To help guide you in choosing the appropriate template for your use case, this section describes each of the sample files. Details about the parameters in the sample files are included in Cluster Parameters below.
gcloud/conf.d/k8s_cluster.conf
This file is not tailored to a specific use case. It includes sample values for all of the available cluster parameters.
gcloud/sample_use_cases/1_usePrivateEndpoint_private_cluster/k8s_cluster.conf
This file includes sample values for a use case where:
- The GKE cluster will be deployed in a new private subnet in an existing network. You specify the existing network name in the GCLOUD_NETWORK parameter.
- A NAT gateway is deployed with a private endpoint (GKE_ENABLE_PRIVATE_ENDPOINT=true, GKE_PRIVATE_ACCESS=true). There is no client access to the public endpoint.
- Secondary IP ranges are added to the NAT mapping along with the primary IP when NETWORK_NAT_ALLOW_SUBNET_SECONDARY_IPS=true. Outbound connectivity is allowed through the NAT gateway but restricted to the IP ranges specified in the GKE_MASTER_ACCESS_CIDRS parameter. A configuration sketch for this use case is shown below.
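Expressed as configuration settings, this use case corresponds to values like the following (the network name is taken from the example configuration later in this section and is a placeholder):

# Private cluster with a private endpoint behind a NAT gateway.
GCLOUD_NETWORK=${GCLOUD_NETWORK:-"devel-network"}   # existing network name
GKE_PRIVATE_ACCESS=true
GKE_ENABLE_PRIVATE_ENDPOINT=true
NETWORK_NAT_ALLOW_SUBNET_SECONDARY_IPS=true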
gcloud/sample_use_cases/2_public_cluster/k8s_cluster.conf
This file includes sample values for a use case where:
- A new network with public and private subnetworks will be created and the GKE cluster will be deployed into it.
- The cluster is public (GKE_PRIVATE_ACCESS=false).
gcloud/sample_use_cases/3_useAuthorizedNetworks/k8s_cluster.conf
This file includes sample values for a use case where:
- The GKE cluster will be deployed in a new or existing network with public and private subnets.
- The GKE_MASTER_ACCESS_CIDRS parameter is used to limit access to the public endpoint.
gcloud/sample_use_cases/4_providePublicEndpointAccess/k8s_cluster.conf
This file includes sample values for a use case where:
- The GKE cluster will be deployed as a private cluster with public endpoint access enabled (GKE_ENABLE_PRIVATE_ENDPOINT=false).
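The endpoint-related settings that distinguish the last three use cases can be summarized as follows (the CIDR value is a hypothetical administrator range, shown for illustration):

# Use case 2: fully public cluster
GKE_PRIVATE_ACCESS=false

# Use case 3: public endpoint restricted to authorized networks
GKE_MASTER_ACCESS_CIDRS="203.0.113.0/24"   # hypothetical admin CIDR

# Use case 4: private nodes, but the master keeps a public endpoint
GKE_PRIVATE_ACCESS=true
GKE_ENABLE_PRIVATE_ENDPOINT=false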
Cluster Parameters
The contents of k8s_cluster.conf are shown below. Descriptions of the cluster parameters follow the contents.
NETWORK_BGP_ROUTING="<bgp-routing-mode>"
NETWORK_SUBNET_MODE="<subnet-mode>"
NETWORK_ROUTER_NAME="<router>"
NETWORK_ROUTER_MODE="<advertisement-mode>"
NETWORK_ROUTER_ASN=<asn>
NETWORK_ROUTER_DESC="<description>"
NETWORK_NAT_NAME="<nat-name>"
NETWORK_NAT_UDP_IDLE_TIMEOUT="<udp-idle-timeout>"
NETWORK_NAT_ICMP_IDLE_TIMEOUT="<icmp-idle-timeout>"
NETWORK_NAT_TCP_ESTABLISHED_IDLE_TIMEOUT="<tcp-established-idle-timeout>"
NETWORK_NAT_TCP_TRANSITORY_IDLE_TIMEOUT="<tcp-transitory-idle-timeout>"
NETWORK_NAT_ALLOW_SUBNET_SECONDARY_IPS=<allow-subnet-secondary-ips>
K8S_CLUSTER_NAME=${K8S_CLUSTER_NAME:-"<cluster-name>"}
K8S_CLUSTER_PODS_PER_NODE="<default-max-pods-per-node>"
K8S_CLUSTER_ADDONS="<addons>"
GKE_MASTER_VERSION="<cluster-version>"
GKE_PRIVATE_ACCESS=<enable-private-nodes>
GKE_MASTER_NODE_COUNT_PER_LOCATION=<num-nodes>
GKE_NODE_VERSION="<node-version>"
GKE_IMAGE_TYPE="<image-type>"
GKE_MAINTENANCE_WINDOW='<maintenance-window>'
GKE_ENABLE_PRIVATE_ENDPOINT=<enable-private-endpoint>
GKE_MASTER_ACCESS_CIDRS="<master-authorized-networks>"
K8S_PRIVATE_CIDR="<cluster-ipv4-cidr>"
K8S_SERVICES_CIDR="<services-ipv4-cidr>"
GCLOUD_NODES_CIDR="<create-subnetwork>"
K8S_API_CIDR="<master-ipv4-cidr>"
K8S_HOST_DISK_SIZE='<disk-size>'
K8S_HOST_DISK_TYPE="<disk-type>"
K8S_HOST_MIN_CPU_PLATFORM="<min-cpu-platform>"
K8S_POOL_HOSTS_MAX=<max-nodes-per-pool>
K8S_METADATA="<metadata>"
K8S_MIN_NODES=<min-nodes>
K8S_MAX_NODES=<max-nodes>
GCLOUD_RESOURCE_LABELS='<labels>'
GCLOUD_VM_LABELS=<node-labels>
GCLOUD_VM_TAGS="<tags>"
GCLOUD_VM_MACHINE_TYPE="<machine-type>"
GCLOUD_VM_SSD_COUNT=<local-ssd-count>
GCLOUD_PROJECT_ID=${GCLOUD_PROJECT_ID:-"<project>"}
GCLOUD_NETWORK=${GCLOUD_NETWORK:-"<network>"}
GCLOUD_NODES_SUBNET_SUFFIX="<suffix>"
GCLOUD_CLUSTER_REGION=${GCLOUD_CLUSTER_REGION:-"<region>"}
GCLOUD_NODE_LOCATIONS="<node-locations>"
GCLOUD_NODE_TAINTS='<node-taints>'
GCLOUD_NODE_SCOPE='<scopes>'
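Note that a few parameters (K8S_CLUSTER_NAME, GCLOUD_PROJECT_ID, GCLOUD_NETWORK, and GCLOUD_CLUSTER_REGION) use the shell's ${VAR:-default} expansion, so a value exported in your environment takes precedence over the value in the file. A minimal sketch of that behavior:

# ${VAR:-default} keeps an already-exported value and falls back to the
# file's value otherwise.
export K8S_CLUSTER_NAME="override-cluster"
source conf.d/k8s_cluster.conf
echo "${K8S_CLUSTER_NAME}"    # prints "override-cluster", not the file value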
Parameter | Description |
---|---|
NETWORK_BGP_ROUTING | The mode the Cloud Router will use to advertise BGP routes when the network is created, i.e., whether routes are advertised regionally or globally. This parameter maps to the gcloud network --bgp-routing-mode option. The default value is regional. |
NETWORK_SUBNET_MODE | The method to use when subnets are created. Valid values are "auto" or "custom." This parameter maps to the gcloud VPC --subnet-mode option. The recommended value is custom. |
NETWORK_ROUTER_NAME | The name to assign to the Cloud Router. For example, csi-cloudrouter. |
NETWORK_ROUTER_MODE | The route advertisement mode for the Cloud Router. This parameter maps to the gcloud Cloud Router --advertisement-mode option. The recommended value is custom. |
NETWORK_ROUTER_ASN | The Border Gateway Protocol (BGP) autonomous system number (ASN). When a router is created, it is assigned an ASN. This parameter maps to the gcloud Cloud Router --asn option. Coordinate with your network administrator to determine the number to specify. |
NETWORK_ROUTER_DESC | A description of the Cloud Router. This parameter maps to the gcloud Cloud Router --description option. For example, Cloud router for K8S NAT. |
NETWORK_NAT_NAME | The name to assign to the NAT gateway. For example, csi-natgw. |
NETWORK_NAT_UDP_IDLE_TIMEOUT | The timeout value for UDP connections to the NAT gateway. This parameter maps to the gcloud NAT router --udp-idle-timeout option. The default value in k8s_cluster.conf is 60s (60 seconds). For information about duration formats, refer to gcloud topic datetimes in the Cloud SDK documentation. |
NETWORK_NAT_ICMP_IDLE_TIMEOUT | The timeout value for ICMP connections to the NAT gateway. This parameter maps to the gcloud NAT router --icmp-idle-timeout option. The default value in k8s_cluster.conf is 60s (60 seconds). |
NETWORK_NAT_TCP_ESTABLISHED_IDLE_TIMEOUT | The timeout value for TCP established connections to the NAT gateway. This parameter maps to the gcloud NAT router --tcp-established-idle-timeout option. The default value in k8s_cluster.conf is 60s (60 seconds). |
NETWORK_NAT_TCP_TRANSITORY_IDLE_TIMEOUT | The timeout value to use for TCP transitory connections to the NAT gateway. This parameter maps to the gcloud NAT router --tcp-transitory-idle-timeout option. The default value in k8s_cluster.conf is 60s (60 seconds). |
NETWORK_NAT_ALLOW_SUBNET_SECONDARY_IPS | Indicates whether to allow all secondary IP ranges for the GKE cluster to use the NAT gateway. If true, the secondary IP ranges for the subnets will have NAT gateway access. |
K8S_CLUSTER_NAME | The name to give to the cluster. For example, csi-k8s-cluster. |
K8S_CLUSTER_PODS_PER_NODE | The maximum number of pods that can be hosted on each compute instance. This parameter maps to the gcloud container cluster --default-max-pods-per-node option. This value also applies to the node pools in the cluster if the node pool configuration does not specify the maximum number of pods per node. Cambridge Semantics recommends that you set this value to 16. |
K8S_CLUSTER_ADDONS | A comma-separated list of any additional Kubernetes cluster components to enable for the cluster. This parameter maps to the gcloud container cluster --addons option. By default, the k8s_cluster.conf file lists HttpLoadBalancing and HorizontalPodAutoscaling. Cambridge Semantics recommends that you include both of these components as a best practice. |
GKE_MASTER_VERSION | The Kubernetes version to use for the GKE cluster. This parameter maps to the gcloud container cluster --cluster-version option. |
GKE_PRIVATE_ACCESS | Indicates whether the cluster's nodes should have external IP addresses. When GKE_PRIVATE_ACCESS=true, the cluster remains private and nodes are not assigned external IP addresses. This parameter maps to the GKE --enable-private-nodes option. |
GKE_MASTER_NODE_COUNT_PER_LOCATION | The number of nodes to create for running the K8s services in the default node pool in each of the cluster's zones. This value must be at least 1. For high availability, Cambridge Semantics recommends setting this value to 3. This parameter maps to the gcloud container cluster --num-nodes option. |
GKE_NODE_VERSION | The Kubernetes version to use for nodes in the node pools. This parameter maps to the gcloud container cluster --node-version option. Cambridge Semantics recommends that you specify the same version as the GKE_MASTER_VERSION. |
GKE_IMAGE_TYPE | The base operating system that the nodes in the cluster will run on. This parameter maps to the gcloud container cluster --image-type option. This value must be COS. |
GKE_MAINTENANCE_WINDOW | The time of day to start maintenance on this cluster. This parameter maps to the gcloud container cluster --maintenance-window option. The time corresponds to the UTC time zone and must be in HH:MM format. The default value in k8s_cluster.conf is 06:00 (6:00 am). |
GKE_ENABLE_PRIVATE_ENDPOINT | Indicates whether to use a private or public IP address for the master API endpoint. When GKE_ENABLE_PRIVATE_ENDPOINT=true, the IP address for the API endpoint is private. This parameter maps to the GKE --enable-private-endpoint option. |
GKE_MASTER_ACCESS_CIDRS | The list of CIDR blocks (up to 50) that are allowed to connect to the GKE cluster over HTTPS. This value should include the Anzo subnet CIDR so that Anzo has access to the GKE cluster. This parameter maps to the gcloud container cluster --master-authorized-networks option. For example, 10.128.0.0/9. |
K8S_PRIVATE_CIDR | The IP address range (in CIDR notation) for the pods in this cluster. This parameter maps to the gcloud container cluster --cluster-ipv4-cidr option. For example, 172.16.0.0/20. |
K8S_SERVICES_CIDR | The IP address range for the cluster services. This parameter maps to the gcloud container cluster --services-ipv4-cidr option. For example: 172.17.0.0/20. |
GCLOUD_NODES_CIDR | The CIDR for the new subnet that will be created for the K8s cluster. This parameter maps to the gcloud container cluster --create-subnetwork option. For example, 192.168.0.0/20. |
K8S_API_CIDR | The IPv4 CIDR range to use for the master network. The range should have a subnet mask of /28. This parameter maps to the gcloud container cluster --master-ipv4-cidr option. For example, 192.171.0.0/28. |
K8S_HOST_DISK_SIZE | The size of the boot disks on the cluster compute instances. This parameter maps to the gcloud container cluster --disk-size option. For example, 50GB. |
K8S_HOST_DISK_TYPE | The type of boot disk to use. This parameter maps to the gcloud container cluster --disk-type option. For example, pd-standard. |
K8S_HOST_MIN_CPU_PLATFORM | The minimum CPU platform to use. This parameter maps to the gcloud container cluster --min-cpu-platform option. This value is left blank in the k8s_cluster.conf file. |
K8S_POOL_HOSTS_MAX | The maximum number of nodes to allocate for the default initial node pool. This parameter maps to the gcloud container cluster --max-nodes-per-pool option. The default value is 1000, but it can be set as low as 100 for the initial creation. |
K8S_METADATA | The compute engine metadata (in the format key=val,key=val) to make available to the guest operating system running on nodes in the node pools. This parameter maps to the gcloud container cluster --metadata option. Including disable-legacy-endpoints=true is required to ensure that legacy metadata APIs are disabled. For more information about the option, see Protecting Cluster Metadata in the GKE documentation. |
K8S_MIN_NODES | The minimum number of nodes in the default node pool. This parameter maps to the gcloud container cluster --min-nodes option. For example, 1. |
K8S_MAX_NODES | The maximum number of nodes in the default node pool. This parameter maps to the gcloud container cluster --max-nodes option. For example, 3. |
GCLOUD_RESOURCE_LABELS | A comma-separated list of any labels that you want to apply to the Google Cloud resources in use by the GKE cluster (unrelated to Kubernetes labels). |
GCLOUD_VM_LABELS | A comma-separated list of any Kubernetes labels to apply to nodes in the default node pool. This parameter maps to the gcloud container cluster --node-labels option. |
GCLOUD_VM_TAGS | A comma-separated list of strings to add to the instances in the cluster to classify the VMs. This parameter maps to the gcloud container cluster --tags option. |
GCLOUD_VM_MACHINE_TYPE | The machine type to use for the GKE cluster nodes. This parameter maps to the gcloud container cluster --machine-type option. For example, n1-standard-1. |
GCLOUD_VM_SSD_COUNT | The number of local SSD disks to add to each node. This parameter maps to the gcloud container cluster --local-ssd-count option. For example, specify 0 if you do not want to add SSDs to the nodes. |
GCLOUD_PROJECT_ID | The Project ID for the GKE cluster. This parameter maps to the gcloud-wide --project option. For example, cloud-project-1592. |
GCLOUD_NETWORK | The network to provision the GKE cluster in. This value should match the name of the network that Anzo is deployed in. This parameter maps to the gcloud container cluster --network option. For example, devel-network. If you want gcloud to create a new network, you can leave this value blank. However, after deploying the GKE cluster, you must configure the new network so that it is routable from the Anzo network. |
GCLOUD_NODES_SUBNET_SUFFIX | The suffix to add to the subnetworks. For example, nodes. |
GCLOUD_CLUSTER_REGION | The compute region for the GKE cluster. This value should match the name of the region that Anzo is deployed in. This parameter maps to the gcloud container cluster --region option. For example, us-central1. |
GCLOUD_NODE_LOCATIONS | A comma-separated list of any zones to replicate the nodes in. This parameter maps to the gcloud container cluster --node-locations option. For example, us-central1-f. |
GCLOUD_NODE_TAINTS | A comma-separated list of the Kubernetes taints for the nodes in the default node pool. When a pod is scheduled for deployment, the scheduler relies on this information to find the node pool that the pod belongs in. A pod has a toleration that identifies whether it is compatible with a node taint. This parameter maps to the gcloud container cluster --node-taints option. For more information, see Controlling Scheduling with Node Taints in the GKE documentation. |
GCLOUD_NODE_SCOPE | A comma-separated list of the access scopes the nodes should have. This parameter maps to the gcloud container cluster --scopes option. For example, gke-default. |
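To make the mapping concrete, the sketch below shows roughly how a subset of these parameters would be passed to gcloud container clusters create. It is an illustration of the option mapping only, not the exact invocation that create_k8s.sh assembles:

# Illustrative only: how several parameters map onto gcloud options.
gcloud container clusters create "${K8S_CLUSTER_NAME}" \
    --project "${GCLOUD_PROJECT_ID}" \
    --region "${GCLOUD_CLUSTER_REGION}" \
    --network "${GCLOUD_NETWORK}" \
    --cluster-version "${GKE_MASTER_VERSION}" \
    --num-nodes "${GKE_MASTER_NODE_COUNT_PER_LOCATION}" \
    --image-type "${GKE_IMAGE_TYPE}" \
    --machine-type "${GCLOUD_VM_MACHINE_TYPE}" \
    --disk-size "${K8S_HOST_DISK_SIZE}" \
    --disk-type "${K8S_HOST_DISK_TYPE}" \
    --default-max-pods-per-node "${K8S_CLUSTER_PODS_PER_NODE}" \
    --addons "${K8S_CLUSTER_ADDONS}" \
    --cluster-ipv4-cidr "${K8S_PRIVATE_CIDR}" \
    --services-ipv4-cidr "${K8S_SERVICES_CIDR}" \
    --enable-private-nodes \
    --master-ipv4-cidr "${K8S_API_CIDR}" \
    --enable-master-authorized-networks \
    --master-authorized-networks "${GKE_MASTER_ACCESS_CIDRS}"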
Example Configuration File
An example completed k8s_cluster.conf file is shown below.
NETWORK_BGP_ROUTING="regional" NETWORK_SUBNET_MODE="custom" NETWORK_ROUTER_NAME="csi-cloudrouter" NETWORK_ROUTER_MODE="custom" NETWORK_ROUTER_ASN=64512 NETWORK_ROUTER_DESC="Cloud router for K8S NAT." NETWORK_NAT_NAME="csi-natgw" NETWORK_NAT_UDP_IDLE_TIMEOUT="60s" NETWORK_NAT_ICMP_IDLE_TIMEOUT="60s" NETWORK_NAT_TCP_ESTABLISHED_IDLE_TIMEOUT="60s" NETWORK_NAT_TCP_TRANSITORY_IDLE_TIMEOUT="60s" NETWORK_NAT_ALLOW_SUBNET_SECONDARY_IPS=false K8S_CLUSTER_NAME=${K8S_CLUSTER_NAME:-"csi-k8s-cluster"} K8S_CLUSTER_PODS_PER_NODE="16" K8S_CLUSTER_ADDONS="HttpLoadBalancing,HorizontalPodAutoscaling" GKE_MASTER_VERSION="1.19.9-gke.1900" GKE_PRIVATE_ACCESS=true GKE_MASTER_NODE_COUNT_PER_LOCATION=1 GKE_NODE_VERSION="1.19.9-gke.1900" GKE_IMAGE_TYPE="COS" GKE_MAINTENANCE_WINDOW='06:00' GKE_ENABLE_PRIVATE_ENDPOINT=true GKE_MASTER_ACCESS_CIDRS="10.128.0.0/9" K8S_PRIVATE_CIDR="172.16.0.0/20" K8S_SERVICES_CIDR="172.17.0.0/20" GCLOUD_NODES_CIDR="192.168.0.0/20" K8S_API_CIDR="192.171.0.0/28" K8S_HOST_DISK_SIZE='50GB' K8S_HOST_DISK_TYPE="pd-standard" K8S_HOST_MIN_CPU_PLATFORM="" K8S_POOL_HOSTS_MAX=1000 K8S_METADATA="disable-legacy-endpoints=true" K8S_MIN_NODES=1 K8S_MAX_NODES=3 GCLOUD_RESOURCE_LABELS='deleteafter=false,owner=user' GCLOUD_VM_LABELS=description=k8s_cluster GCLOUD_VM_TAGS="cluster-vm" GCLOUD_VM_MACHINE_TYPE="n1-standard-1" GCLOUD_VM_SSD_COUNT=0 GCLOUD_PROJECT_ID=${GCLOUD_PROJECT_ID:-"cloud-project-1592"} GCLOUD_NETWORK=${GCLOUD_NETWORK:-"devel-network"} GCLOUD_NODES_SUBNET_SUFFIX="nodes" GCLOUD_CLUSTER_REGION=${GCLOUD_CLUSTER_REGION:-"us-central1"} GCLOUD_NODE_LOCATIONS="us-central1-f" GCLOUD_NODE_TAINTS='key1=val1:NoSchedule,key2=val2:PreferNoSchedule' GCLOUD_NODE_SCOPE='gke-default'
Create the GKE Cluster
After defining the cluster requirements, run the create_k8s.sh script in the gcloud directory to create the cluster. Run the script with the following command. The arguments are described below.
./create_k8s.sh -c <config_file_name> [ -d <config_file_directory> ] [ -f | --force ] [ -h | --help ]
For example, the following command runs the create_k8s.sh script using k8s_cluster.conf as input. Since k8s_cluster.conf is in the default conf.d directory, the -d argument is omitted:
./create_k8s.sh -c k8s_cluster.conf
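To run the script against one of the sample use-case files instead, point the -d argument at the directory that contains the file. The path below assumes it is given relative to the gcloud directory, as with conf.d:

./create_k8s.sh -c k8s_cluster.conf -d sample_use_cases/1_usePrivateEndpoint_private_cluster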
The script validates that the required software packages, such as the Google Cloud SDK (gcloud) and kubectl, are installed and that their versions are compatible with the deployment. It also displays an overview of the deployment details based on the values in the specified configuration file. For example:
Operating System : CentOS Linux
- Google Cloud SDK: 322.0.0
  alpha: 2021.01.05
  beta: 2021.01.05
  bq: 2.0.64
  core: 2021.01.05
  gsutil: 4.57
kubectl cli version: Client Version: v1.19.12
valid

Deployment details:
  Project            : cloud-project-1592
  Region             : us-central1
  GKE Cluster        : cloud-k8s-cluster
  GKE Master version : 1.19.9-gke.1900
The script then prompts you to proceed with deploying each component of the GKE cluster infrastructure. Type y and press Enter to proceed with creating the specified network, cluster, cloud router, and NAT gateway components. All components are created according to the specifications in the configuration file.
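As an optional sanity check once the script finishes, you can fetch credentials for the new cluster and confirm that the nodes are reachable. The cluster and region names below come from the example configuration above; substitute your own values:

# Optional post-creation check.
gcloud container clusters get-credentials csi-k8s-cluster --region us-central1
kubectl get nodes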
When cluster creation is complete, proceed to Creating the Required Node Pools to add the required node pools to the cluster.