Link to the GitHub code repo here
Prerequisites:
- AWS Account: You must have an active AWS account. If you don't have one, you can sign up on the AWS website here
- IAM User or Role: Create an IAM (Identity and Access Management) user or role in your AWS account with the necessary permissions to create and manage EKS clusters. At a minimum, the user or role should have permissions to create EKS clusters, EC2 instances, VPCs, and related resources.
- AWS CLI: Install and configure the AWS Command Line Interface (CLI) on your local machine. You'll use the AWS CLI to interact with your AWS account and configure your AWS credentials. You can download it here
- Terraform Installed: Install Terraform on your local machine. You can download Terraform from the official Terraform website and follow the installation instructions for your operating system here
What is Karpenter?
Karpenter is an open-source project that provides automated node provisioning and scaling for Kubernetes clusters. When using Fargate with Amazon EKS, the concept of traditional autoscaling groups of worker nodes does not apply, since Fargate is a serverless compute engine where you don't manage the underlying nodes. However, there are still scenarios where you need capacity beyond what you want to run on Fargate, for example workloads that are a better fit for EC2 instances.
In this article the Karpenter controller itself runs on Fargate. It watches for pods that cannot be scheduled, launches right-sized EC2 nodes to run them, and removes those nodes again when they are no longer needed, so the cluster's capacity follows the demand of your workloads.
Terraform code
As always, we will use our favourite Terraform module from here
You can find the full Terraform code in our repo. We will build on the code from our previous article: as an example, we will use Fargate nodes and configure Karpenter to add EC2 nodes to the cluster when it needs to scale.
providers.tf
Since we will use Helm to install Karpenter and kubectl to create the Karpenter node pool, we need to declare the following providers:
provider "aws" {
region = var.aws_region
profile = var.aws_profile
}
provider "kubernetes" {
host = module.eks.cluster_endpoint
cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
args = ["eks", "get-token", "--cluster-name", module.eks.cluster_name]
}
}
provider "helm" {
kubernetes {
host = module.eks.cluster_endpoint
cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
args = ["eks", "get-token", "--cluster-name", module.eks.cluster_name]
}
}
}
provider "kubectl" {
apply_retry_count = 5
host = module.eks.cluster_endpoint
cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
load_config_file = false
exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
args = ["eks", "get-token", "--cluster-name", module.eks.cluster_name]
}
}
data "aws_availability_zones" "available" {}
data "aws_ecrpublic_authorization_token" "token" {}
Please note that the AWS CLI needs to be installed locally, since the exec blocks above call aws eks get-token to authenticate to the cluster.
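One more caveat: ECR Public (which hosts the Karpenter Helm chart at public.ecr.aws) only issues authorization tokens in us-east-1. With the default region used in this article that is already the case, but if you deploy to another region, a common pattern (shown here as a sketch, not part of the original code) is to pin an aliased AWS provider to us-east-1 and reference it from the token data source:

provider "aws" {
  # ECR Public authorization tokens can only be requested from us-east-1
  alias  = "virginia"
  region = "us-east-1"
}

# replaces the plain data source above
data "aws_ecrpublic_authorization_token" "token" {
  provider = aws.virginia
}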
versions.tf
Pinning provider versions is good practice: it prevents unintended changes caused by automatic upgrades to the latest provider version and keeps behavior consistent across deployments.
terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 4.57"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = ">= 2.10"
    }
    helm = {
      source  = "hashicorp/helm"
      version = ">= 2.7"
    }
    kubectl = {
      source  = "gavinbunney/kubectl"
      version = ">= 1.14"
    }
    null = {
      source  = "hashicorp/null"
      version = ">= 3.0"
    }
  }
}
variables.tf
We will update variables.tf and start using locals to store values and computed results that would otherwise be repeated in several places in the module, reducing redundancy:
variable "aws_profile" {
description = "Set this variable if you use another profile besides the default awscli profile called 'default'."
type = string
default = "default"
}
variable "aws_region" {
description = "Set this variable if you use another aws region."
type = string
default = "us-east-1"
}
locals {
name = "test"
cluster_version = "1.27"
region = "us-east-1"
vpc_cidr = "10.0.0.0/16"
azs = slice(data.aws_availability_zones.available.names, 0, 3)
tags = {
Example = local.name
}
}
main.tf
In main.tf we create the VPC and the EKS cluster, preparing everything needed to deploy Karpenter:
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 5.0"
name = local.name
cidr = local.vpc_cidr
azs = local.azs
private_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 4, k)]
public_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 8, k + 48)]
intra_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 8, k + 52)]
enable_nat_gateway = true
single_nat_gateway = true
public_subnet_tags = {
"kubernetes.io/role/elb" = 1
}
private_subnet_tags = {
"kubernetes.io/role/internal-elb" = 1
"karpenter.sh/discovery" = local.name
}
tags = local.tags
}
module "eks" {
source = "terraform-aws-modules/eks/aws"
cluster_name = local.name
cluster_version = local.cluster_version
cluster_endpoint_public_access = true
cluster_addons = {
kube-proxy = {}
vpc-cni = {}
coredns = {
configuration_values = jsonencode({
computeType = "Fargate"
resources = {
limits = {
cpu = "0.25"
memory = "256M"
}
requests = {
cpu = "0.25"
memory = "256M"
}
}
})
}
}
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
control_plane_subnet_ids = module.vpc.intra_subnets
create_cluster_security_group = false
create_node_security_group = false
manage_aws_auth_configmap = true
aws_auth_roles = [
{
rolearn = module.karpenter.role_arn
username = "system:node:{{EC2PrivateDNSName}}"
groups = [
"system:bootstrappers",
"system:nodes",
]
},
]
fargate_profiles = {
karpenter = {
selectors = [
{ namespace = "karpenter"
labels = {
"k8s-app" = "karpenter"
}
}
]
}
kube-system = {
selectors = [
{ namespace = "kube-system" }
]
}
}
tags = merge(local.tags, {
"karpenter.sh/discovery" = local.name
})
}
Let's discuss the main changes in the eks module and why we need them:
- cluster_addons.coredns: Fargate adds 256 MB to each pod's memory reservation for the required Kubernetes components (kubelet, kube-proxy, and containerd), then rounds up to the compute configuration that most closely matches the sum of vCPU and memory requests, to ensure pods always have the resources they need to run. We are targeting the smallest task size of 0.25 vCPU / 512 MB, so we subtract 256 MB from the request/limit to ensure CoreDNS fits within that task
- manage_aws_auth_configmap: set to true because we need to add the Karpenter node IAM role to the aws-auth ConfigMap so that nodes launched by Karpenter can join the cluster
- fargate_profiles: feel free to add more profiles if needed so the scheduler can place other workloads on Fargate. We add a label selector to the karpenter profile to ensure its pods are scheduled onto Fargate
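The Testing section below refers to connect_to_eks and endpoint outputs. They are not strictly required, but a minimal outputs.tf that produces them could look like this (a sketch; the output names and command are inferred from the sample output shown later):

output "endpoint" {
  description = "EKS cluster endpoint"
  value       = module.eks.cluster_endpoint
}

output "connect_to_eks" {
  description = "Command that generates a kubeconfig for this cluster"
  value       = "aws eks --region ${local.region} update-kubeconfig --name ${module.eks.cluster_name} --profile ${var.aws_profile}"
}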
karpenter.tf
Here we define the resources that deploy and configure Karpenter in our cluster:
module "karpenter" {
source = "terraform-aws-modules/eks/aws//modules/karpenter"
cluster_name = module.eks.cluster_name
irsa_oidc_provider_arn = module.eks.oidc_provider_arn
enable_karpenter_instance_profile_creation = true
iam_role_additional_policies = {
AmazonSSMManagedInstanceCore = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}
tags = local.tags
}
resource "helm_release" "karpenter" {
namespace = "karpenter"
create_namespace = true
name = "karpenter"
repository = "oci://public.ecr.aws/karpenter"
repository_username = data.aws_ecrpublic_authorization_token.token.user_name
repository_password = data.aws_ecrpublic_authorization_token.token.password
chart = "karpenter"
version = "v0.32.1"
values = [
<<-EOT
settings:
clusterName: ${module.eks.cluster_name}
clusterEndpoint: ${module.eks.cluster_endpoint}
interruptionQueueName: ${module.karpenter.queue_name}
serviceAccount:
annotations:
eks.amazonaws.com/role-arn: ${module.karpenter.irsa_arn}
controller:
resources:
requests:
cpu: 1
memory: 1Gi
limits:
cpu: 1
memory: 1Gi
podLabels:
k8s-app: karpenter
EOT
]
depends_on = [
module.eks
]
}
resource "kubectl_manifest" "karpenter_node_class" {
yaml_body = <<-YAML
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
name: default
spec:
amiFamily: AL2
role: ${module.karpenter.role_name}
subnetSelectorTerms:
- tags:
karpenter.sh/discovery: ${module.eks.cluster_name}
securityGroupSelectorTerms:
- tags:
karpenter.sh/discovery: ${module.eks.cluster_name}
tags:
karpenter.sh/discovery: ${module.eks.cluster_name}
YAML
depends_on = [
helm_release.karpenter
]
}
resource "kubectl_manifest" "karpenter_node_pool" {
yaml_body = <<-YAML
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
name: default
spec:
template:
spec:
nodeClassRef:
name: default
requirements:
- key: "karpenter.k8s.aws/instance-category"
operator: In
values: ["c", "m", "r"]
- key: "karpenter.k8s.aws/instance-cpu"
operator: In
values: ["4", "8", "16", "32"]
- key: "karpenter.k8s.aws/instance-hypervisor"
operator: In
values: ["nitro"]
- key: "karpenter.k8s.aws/instance-generation"
operator: Gt
values: ["2"]
limits:
cpu: 1000
disruption:
consolidationPolicy: WhenEmpty
consolidateAfter: 30s
YAML
depends_on = [
kubectl_manifest.karpenter_node_class
]
}
resource "kubectl_manifest" "nginx_deployment" {
yaml_body = <<-YAML
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx
spec:
replicas: 0
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
terminationGracePeriodSeconds: 0
containers:
- name: nginx
image: nginx:latest
resources:
requests:
cpu: 1
YAML
depends_on = [
helm_release.karpenter
]
}
- module "karpenter": is used to create IAM instance profile whith additional policy for the Karpenter node IAM role
- resource "helm_release" "karpenter": deploying Karpenter itself, depends on block delays this resource creation until module eks succeeded, otherwise you will get "deadline exceeded" error
- resource "kubectl_manifest" "karpenter_node_class": here we define a node class for karpenter and specify subnets in which they should be proviosioned
- resource "kubectl_manifest" "karpenter_node_pool": here is where we define requirements for instances karpenter can use in a default nodepool such as:
karpenter.k8s.aws/instance-category
: Specifies the allowed instance categories as "c" Instances (Compute-Optimized), "m" Instances (General Purpose), and "r" Instances (Memory-Optimized)karpenter.k8s.aws/instance-cpu
: Specifies allowed CPU values as "4," "8," "16," and "32."karpenter.k8s.aws/instance-hypervisor
: Specifies that the hypervisor type must be "nitro."karpenter.k8s.aws/instance-generation
: Specifies that the instance generation must be greater than "2."limits
: Sets a CPU limit of 1000 for the nodes.disruption
: Configures how Karpenter should handle node disruptions. - resource "kubectl_manifest" "nginx_deployment": here we would create sample nginx deployment with 0 replicas.
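The requirements list accepts any well-known node label. For example, to additionally run a pool restricted to Spot capacity and the amd64 architecture, you could add something like the following (a sketch with assumed names and limits, using the standard karpenter.sh/capacity-type and kubernetes.io/arch labels; adjust it to your workloads):

resource "kubectl_manifest" "karpenter_node_pool_spot" {
  yaml_body = <<-YAML
    apiVersion: karpenter.sh/v1beta1
    kind: NodePool
    metadata:
      name: spot
    spec:
      template:
        spec:
          nodeClassRef:
            name: default
          requirements:
            # only launch Spot capacity
            - key: "karpenter.sh/capacity-type"
              operator: In
              values: ["spot"]
            # only x86_64 instances
            - key: "kubernetes.io/arch"
              operator: In
              values: ["amd64"]
      limits:
        cpu: 100
  YAML

  depends_on = [
    kubectl_manifest.karpenter_node_class
  ]
}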
Deployment
To initialize Terraform and download the modules, run:
`terraform init`
You can also check which resources Terraform plans to create by running:
terraform plan
To provision resources run:
terraform apply
Testing
After Terraform has finished applying, you should see output similar to the following:
Apply complete! Resources: 78 added, 0 changed, 0 destroyed.
Outputs:
connect_to_eks = "aws eks --region <YOUR_REGION> update-kubeconfig --name <CLUSTER_NAME> --profile default"
endpoint = "<CLUSTER_ENDPOINT>"
Execute the command from the connect_to_eks output to generate a kubeconfig file:
aws eks --region <YOUR_REGION> update-kubeconfig --name <CLUSTER_NAME> --profile default
Verify connectivity to the cluster with kubectl:
kubectl get no
You should see a list of nodes:
NAME STATUS ROLES AGE VERSION
fargate-ip-10-0-12-3.ec2.internal Ready <none> 4m10s v1.27.7-eks-4f4795d
fargate-ip-10-0-14-219.ec2.internal Ready <none> 4m15s v1.27.7-eks-4f4795d
fargate-ip-10-0-25-226.ec2.internal Ready <none> 5m46s v1.27.7-eks-4f4795d
fargate-ip-10-0-28-159.ec2.internal Ready <none> 5m45s v1.27.7-eks-4f4795d
As you can see, all of these are Fargate nodes. Since we also have the nginx deployment installed, let's scale it up and see whether Karpenter provisions a new node:
kubectl scale deployment nginx --replicas 2
You should see the following output:
deployment.apps/nginx scaled
After a while, we can check the nodes again:
kubectl get no
The output should be similar to this:
NAME STATUS ROLES AGE VERSION
fargate-ip-10-0-12-3.ec2.internal Ready <none> 4m10s v1.27.7-eks-4f4795d
fargate-ip-10-0-14-219.ec2.internal Ready <none> 4m15s v1.27.7-eks-4f4795d
fargate-ip-10-0-25-226.ec2.internal Ready <none> 5m46s v1.27.7-eks-4f4795d
fargate-ip-10-0-28-159.ec2.internal Ready <none> 5m45s v1.27.7-eks-4f4795d
ip-10-0-2-65.ec2.internal Ready <none> 29s v1.27.9-eks-5e0fdde
As you can see, one more node has been added, but this time it is an EC2 node. Let's check the Karpenter logs:
kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller
You should see that Karpenter provisioned the new node.
Clean up
Since our Terraform code does not manage the EC2 nodes provisioned by Karpenter, it cannot delete them, and therefore it will also fail to delete the VPC and the other resources those instances depend on. We first need to delete the deployment so that Karpenter terminates the instance for us. Run:
kubectl delete deployment nginx
You will see that the deployment is deleted. Once Karpenter has removed the EC2 node, we can run:
terraform destroy
You can find the source code in our GitHub repo.