Provisioning an AKS Cluster using Terraform within an Azure DevOps Pipeline
With a name derived from the Greek word for “helmsman” or “pilot” (one who steers or navigates), and first engineered by Google almost 10 years ago to run container workloads at global scale, Kubernetes seems to be the new hotness in DevOps and cloud architectures in recent times. As the premier container orchestration technology on the market, Kubernetes takes the other big current buzzword in IT infrastructure teams, containers (e.g. Docker), and harnesses their flexibility and power on a massive scale for both public and private cloud installations.
In order to learn a bit more about Kubernetes, I undertook a nearly 30-hour Udemy video course in January 2022. This course, entitled “Azure Kubernetes Service with Azure DevOps and Terraform” by Kalyan Reddy Daida, offered an excellent theoretical and hands-on learning experience, with the course’s end goal being to provision an Azure Kubernetes Service (AKS) cluster using Terraform commands through an Azure DevOps pipeline. The course also appealed to me as a way of gaining some experience with Terraform, which is one of the most common cloud-agnostic provisioning tools nowadays (previously the only IAC language I had used was ARM templates).
In the first half of this blog post I will talk about Kubernetes and the technologies that enable it, while making the case for Azure’s implementation of it in AKS and for the relative ease with which a cluster can be provisioned using Terraform and Azure DevOps. In the second half, I will document the steps from the course that I took to provision and deploy to an AKS cluster with Terraform commands on an Azure DevOps pipeline, as evidence of gaining some experience with these exciting technologies.
Fundamentals of Docker and Kubernetes
Before talking about Kubernetes in detail, one must first discuss containers. Containers offer a way to package applications (or even parts of an application, in a micro-service architecture) along with their dependencies and runtimes into self-contained units that can be run across different computing environments while still producing the same results (avoiding the common “runs on my machine” phenomenon in programming). They differ from traditional virtual machines (VMs) in that they are far more portable and lightweight: containers share the host’s operating system kernel rather than each carrying a full operating system of its own, while still keeping applications isolated from one another. With VMs, a hypervisor provisions the different VMs running on a physical machine, whereas with containers the provisioning is done by the container runtime/engine. The diagram to the right offers a better understanding of this. Like VMs, containers can be packaged up into images for quick redeployment.
With this background on containers out of the way, we can turn to the tools themselves. Docker has become the de facto industry standard for containers in recent years, with Kubernetes being a similar industry standard complementing it for orchestration purposes (management and control of Docker containers). Kubernetes, first developed by Google, is now an open source platform for automating the deployment, scaling, and operation of application containers across clusters of hosts, providing a container-centric infrastructure. It has built-in capabilities for self-healing, load balancing and service discovery across these clusters. The nodes in these clusters are VMs that can be auto-scaled to meet requirements on the fly. The smallest deployable units that Kubernetes schedules onto these nodes are called “pods”, each of which wraps one or more containers. Commands can be issued to a Kubernetes cluster from the command line using kubectl. The next section will focus on the specific architecture and implementation of Azure’s version of Kubernetes.
Azure’s Implementation of Kubernetes
Using a managed service for Kubernetes deployments helps take some of the complexity out of things. In Azure we have exactly that in the form of the Azure Kubernetes Service (AKS), where Azure takes care of the underlying work of setting up, managing and upgrading Kubernetes clusters.
An AKS cluster comprises one control plane, where management tasks are conducted, and multiple node pools, where the actual workload is performed (these are typically Linux or Windows VMs running Docker containers). The control plane is free with AKS, so you only pay for the node pools.
To elaborate further on the AKS architecture diagram on the right:
kube-apiserver - Exposes the Kubernetes API for use with kubectl commands.
etcd - A key-value store that maintains the state of the cluster.
kube-scheduler - Intelligently selects nodes for pods/applications to be run on.
kube-controller-manager - Oversees smaller operations such as replicating pods and managing node operations.
kubelet - Runs on each node in the node pools, receiving instructions from the control plane and carrying them out on that node.
kube-proxy - Runs on each node and provides networking features so that workloads on different nodes can communicate with each other and not just with the control plane.
Other Azure services that complement AKS include the Azure Container Registry (ACR). This is Azure’s own version of Docker Hub, for storing and accessing Docker images. There is also Azure Container Instances (ACI), which is good for running a single Docker image in a container when you don’t need the added complexity and scalability of handling multiple Docker containers in an AKS cluster.
Using Terraform as an IAC tool
One motivation for doing this particular Udemy course was also to get some experience with using Terraform manifest files as an Infrastructure as Code (IAC) tool. Like ARM templates, Terraform is declarative: you specify the end result rather than each step, as you would in an imperative language. Terraform manifest files use their own syntax, the HashiCorp Configuration Language (HCL), which looks quite similar to JSON. Unlike ARM templates, Terraform is cloud agnostic and can be used to provision resources on all the major cloud providers. The three basic commands of Terraform are “plan, apply and destroy”: plan previews the changes needed to reach the declared state, apply makes those changes, and destroy removes the provisioned resources. With these commands, desired state configuration and management of our provisioned resources can be achieved. Terraform has many other subcommands for a wide variety of administrative actions, but these three basic provisioning tasks are the core of Terraform.
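To make the declarative idea a little more concrete, here is a toy sketch in Python of what “plan” and “apply” do conceptually: compare the desired state with the current state, then make only the changes needed to close the gap. This is purely an illustration and not how Terraform is actually implemented; the resource names and the plan and apply functions are made up for the example.

```python
from typing import Dict, List, Tuple

Resource = Dict[str, object]
State = Dict[str, Resource]


def plan(desired: State, current: State) -> List[Tuple[str, str]]:
    """Return the (action, resource name) pairs needed to reach the desired state."""
    actions = []
    for name in sorted(desired):
        if name not in current:
            actions.append(("create", name))
        elif desired[name] != current[name]:
            actions.append(("update", name))
    for name in sorted(current):
        if name not in desired:
            actions.append(("destroy", name))
    return actions


def apply(desired: State, current: State) -> State:
    """Carry out the plan in place so that the current state matches the desired state."""
    for action, name in plan(desired, current):
        if action == "destroy":
            del current[name]
        else:  # "create" and "update" both copy the desired definition across
            current[name] = dict(desired[name])
    return current


# Hypothetical resources, for illustration only.
desired = {
    "resource_group": {"location": "westeurope"},
    "aks_cluster": {"node_count": 3},
}
current = {
    "resource_group": {"location": "westeurope"},
    "aks_cluster": {"node_count": 1},
    "old_storage": {"tier": "Standard"},
}

print(plan(desired, current))
# [('update', 'aks_cluster'), ('destroy', 'old_storage')]
```

Running apply and then plan again returns an empty list, which is the point of the declarative approach: once the real state matches the declared one, there is nothing left to do.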
For the course that was followed, the Terraform files were hosted in a repo on GitHub before being run through Azure DevOps build and release pipelines, each with its own custom YAML definition, the details of which follow in the next section.
Practical Guide to Provisioning an AKS Cluster
With the theory concerning Kubernetes and Terraform out of the way, we will now document the steps undertaken to provision an AKS cluster. Referring to the diagram on the right, a sample AKS cluster comprising a single control plane and three node pools (two Linux and one Windows VM based) will be created. The creation of this will be handled by our Terraform manifest files, built and then deployed from an Azure DevOps pipeline. An admin user will have access to the cluster (with privileges controlled by Azure AD) by issuing kubectl commands from a command line. These commands first pass through an Azure Load Balancer with a public IP address, which acts as a single point of access onto the VNET containing the AKS cluster. The VNET is further subdivided into two separate subnets, with the node pools subnet allowing for up to 65,536 addresses (the size of a /16 address range).
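As a quick sanity check on that 65,536 figure, the arithmetic can be sketched with Python’s standard ipaddress module. The address ranges below are hypothetical, since the course’s actual values are not reproduced in this post. Note also that Azure reserves five addresses in every subnet, so slightly fewer are available for nodes and pods in practice.

```python
import ipaddress

# Hypothetical ranges for illustration; not the values used in the course.
vnet = ipaddress.ip_network("10.0.0.0/8")
node_subnet = ipaddress.ip_network("10.240.0.0/16")

AZURE_RESERVED_PER_SUBNET = 5  # addresses Azure keeps back in each subnet


def prefix_length_for(address_count: int) -> int:
    """Smallest IPv4 prefix length whose range holds at least address_count addresses."""
    host_bits = (address_count - 1).bit_length()
    return 32 - host_bits


print(node_subnet.num_addresses)                              # 65536
print(node_subnet.num_addresses - AZURE_RESERVED_PER_SUBNET)  # 65531
print(prefix_length_for(65536))                               # 16
print(node_subnet.subnet_of(vnet))                            # True
```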
The steps taken were as follows with numbered screenshots of each step at the bottom of this post:
Installed this Terraform plugin into the Azure DevOps organisation. This plugin enables a new task in your pipelines that allows Terraform manifest files to be used.
Created a repository on GitHub. Downloaded the various Terraform files provided by the course into a local folder. Then used git commands in this order (init, add, commit and then push) to add the local files to the GitHub repo.
Created a new project in Azure DevOps. One additional step that needed to be done here was to create a new ARM service connection from this project to our Azure subscription. This connection grants the permissions needed for Terraform commands to be executed.
Added Azure AD permissions to the Azure DevOps project. This is to allow users with admin permissions to interact with the AKS cluster after creation. A managed service principal was created with access to the Microsoft Graph API.
SSH public and private keys were created locally. These keys allow SSH access to the Linux VMs in the node pools. A simple script was run to create the keys, and the public key was then uploaded to the Azure DevOps project, with the pipeline granted permission to use it.
Created an Azure storage account with a single container. This storage container simply holds the Terraform state file, which is auto-generated in the next step and contains essential state configuration information for the soon-to-be-provisioned AKS infrastructure.
Created a new build pipeline in the Azure DevOps project, using a YAML file provided by the course. This YAML file consists of one build job made up of several tasks, which are further divided into individual steps. In order, these first validate the Terraform manifest files, then call the Terraform plugin we installed in the first step to begin the build, and finally publish the finished artifact to a specified location. Lastly, a Terraform state file is generated in the Azure storage container we previously set up.
Created a new release pipeline in the Azure DevOps project. As in the previous step, a YAML file was provided by the course. This YAML file splits the common Terraform operations of init, plan and apply into separate tasks. The state file created in the last step was used in the init task, while the SSH public key created in step 5 was used during the plan task.
With the AKS cluster now successfully deployed from the release pipeline we just created, the final step was to verify it. Azure CLI commands were first used to connect to the cluster, and simple kubectl commands were then run to verify that the cluster and nodes created were in line with what was specified in the Terraform manifest files.
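For anyone wanting to script that last verification step rather than eyeball it, below is a minimal Python sketch. It assumes you have already connected to the cluster (for example with az aks get-credentials) and saved the output of kubectl get nodes -o json. The summarise_nodes function and the sample node names are hypothetical and only there for illustration; the real output contains many more fields and, most likely, more nodes.

```python
import json
from collections import Counter


def summarise_nodes(kubectl_json: str) -> dict:
    """Count Ready nodes per operating system from `kubectl get nodes -o json` output."""
    nodes = json.loads(kubectl_json)["items"]
    ready_by_os = Counter()
    for node in nodes:
        os_name = node["metadata"]["labels"].get("kubernetes.io/os", "unknown")
        conditions = node["status"]["conditions"]
        # A node is healthy when its "Ready" condition has the status "True".
        if any(c["type"] == "Ready" and c["status"] == "True" for c in conditions):
            ready_by_os[os_name] += 1
    return dict(ready_by_os)


def make_node(name: str, os_name: str, ready: bool = True) -> dict:
    """Build a cut-down node record with only the fields the summary needs."""
    return {
        "metadata": {"name": name, "labels": {"kubernetes.io/os": os_name}},
        "status": {"conditions": [{"type": "Ready", "status": "True" if ready else "False"}]},
    }


# Made-up sample: one node from each of the three node pools.
sample_output = json.dumps({
    "items": [
        make_node("example-linux-a-0", "linux"),
        make_node("example-linux-b-0", "linux"),
        make_node("example-windows-0", "windows"),
    ]
})

print(summarise_nodes(sample_output))
# {'linux': 2, 'windows': 1}
```

Comparing this summary with the node pools declared in the Terraform manifest files gives a quick yes or no on whether the deployment matches what was asked for.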