Creating a private Amazon Elastic Kubernetes Service (EKS) cluster with a bastion host is an exceptionally secure way to manage and interact with your Kubernetes cluster's API. In this article, we'll explore how to set up this robust infrastructure, ensuring that your EKS environment remains isolated from the public internet while maintaining convenient, controlled access through the bastion host.
Link to the GitHub code repo here
Prerequisites:
- AWS Account: You must have an active AWS account. If you don't have one, you can sign up on the AWS website here
- IAM User or Role: Create an IAM (Identity and Access Management) user or role in your AWS account with the necessary permissions to create and manage EKS clusters. At a minimum, the user or role should have permissions to create EKS clusters, EC2 instances, VPCs, and related resources.
- AWS CLI: Install and configure the AWS Command Line Interface (CLI) on your local machine. You'll use the AWS CLI to interact with your AWS account and configure your AWS credentials. You can download it here
- Terraform Installed: Install Terraform on your local machine. You can download Terraform from the official Terraform website and follow the installation instructions for your operating system here
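Before moving on, it's worth sanity-checking that the CLI tools are installed and your credentials work. A quick check could look like this (output will vary with your versions and account):
# Verify the AWS CLI is installed and credentials are configured
aws --version
aws sts get-caller-identity
# Verify Terraform is installed
terraform version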
Creating Bastion Host with Terraform
First, as always, let's create our provider.tf file in our project folder:
provider "aws" {
region = var.aws_region
profile = var.aws_profile
}
data "aws_eks_cluster" "this" {
name = local.cluster_name
depends_on = [
module.eks.eks_managed_node_groups,
]
}
data "aws_eks_cluster_auth" "this" {
name = local.cluster_name
depends_on = [
module.eks.eks_managed_node_groups,
]
}
provider "kubernetes" {
host = data.aws_eks_cluster.this.endpoint
token = data.aws_eks_cluster_auth.this.token
cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority.0.data)
}
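Since this configuration pulls in several providers (aws, kubernetes, tls, local, random, external), it is a good idea to pin them in a versions.tf file. The constraints below are only a suggested sketch; check the VPC and EKS module documentation for the exact minimum versions they require:
terraform {
  required_version = ">= 1.3"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = ">= 2.20"
    }
    tls = {
      source  = "hashicorp/tls"
      version = ">= 4.0"
    }
    local = {
      source  = "hashicorp/local"
      version = ">= 2.4"
    }
    random = {
      source  = "hashicorp/random"
      version = ">= 3.5"
    }
    external = {
      source  = "hashicorp/external"
      version = ">= 2.3"
    }
  }
}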
VPC and Security Group
Here we create a VPC using the community VPC module, as shown in this article.
Then we create the main.tf file. First we define the VPC and a custom security group for our resources:
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "5.1.2"
name = var.vpc_name
cidr = var.cidr
azs = var.aws_availability_zones
private_subnets = var.private_subnets
public_subnets = var.public_subnets
enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true
public_subnet_tags = {
"kubernetes.io/cluster/${local.cluster_name}" = "shared"
"kubernetes.io/role/elb" = 1
}
private_subnet_tags = {
"kubernetes.io/cluster/${local.cluster_name}" = "shared"
"kubernetes.io/role/internal-elb" = 1
}
}
resource "aws_security_group" "ssh" {
name = "allow_ssh"
description = "Allow ssh inbound traffic"
vpc_id = module.vpc.vpc_id
ingress {
description = "TLS from VPC"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
Our security group allows ingress traffic only on port 22, while allowing all egress traffic. Keep in mind that 0.0.0.0/0 opens SSH to the whole internet, so in a real environment you will probably want to restrict it.
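For example, a minimal sketch that limits SSH to a single trusted CIDR could look like this (the allowed_ssh_cidr variable is hypothetical and not part of the original variables.tf):
variable "allowed_ssh_cidr" {
  description = "CIDR allowed to reach the bastion over SSH (hypothetical, add to variables.tf)"
  type        = string
  default     = "203.0.113.10/32" # replace with your own IP
}

  # Replace the ingress block above with this tighter version
  ingress {
    description = "SSH from a trusted CIDR only"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.allowed_ssh_cidr]
  }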
EC2 Bastion Instance
We will create an SSH key for our instance, plus an EC2 instance to use as the Bastion Host. Add the following:
resource "tls_private_key" "ssh_key" {
algorithm = "RSA"
rsa_bits = 4096
}
resource "local_file" "private_key" {
content = tls_private_key.ssh_key.private_key_pem
filename = "${var.path}/${var.private_key_name}"
file_permission = "0600"
}
resource "aws_key_pair" "public_key" {
key_name = "public_bastion_key"
public_key = tls_private_key.ssh_key.public_key_openssh
}
resource "aws_iam_instance_profile" "this" {
name = "instance_profile"
role = aws_iam_role.role.id
}
module "ec2-instance" {
source = "terraform-aws-modules/ec2-instance/aws"
version = "5.5.0"
name = "Bastion-instance"
instance_type = "t3.small"
subnet_id = module.vpc.public_subnets[0]
associate_public_ip_address = true
user_data = <<EOF
#!/bin/bash
# Install AWS CLI v2
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
aws --version
# Install Helm v3
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
helm version
# Install kubectl
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
kubectl version --client
# Install aws-iam-authenticator
curl -o aws-iam-authenticator https://amazon-eks.s3.us-west-2.amazonaws.com/1.21.2/2021-07-05/bin/linux/amd64/aws-iam-authenticator
chmod +x ./aws-iam-authenticator
mkdir -p $HOME/bin && cp ./aws-iam-authenticator $HOME/bin/aws-iam-authenticator && export PATH=$PATH:$HOME/bin
echo 'export PATH=$PATH:$HOME/bin' >> ~/.bashrc
# Update Kubectl
aws eks update-kubeconfig --name ${local.cluster_name}"
EOF
key_name = aws_key_pair.public_key.key_name
vpc_security_group_ids = [aws_security_group.ssh.id]
iam_instance_profile = aws_iam_instance_profile.this.id
depends_on = [ aws_iam_instance_profile.this,aws_iam_role.role ]
}
- Resources tls_private_key, local_file and aws_key_pair are used to create the SSH key for the instance. You can find more info here.
- module "ec2-instance": as always we use a module; you can find the documentation here. Let's go over which arguments we use, and why, in order to create our Bastion Host:
- source and version declare which registry repository the module is downloaded from and which version to use
- name - declares a name for our instance
- instance_type - the EC2 instance type to create
- subnet_id - specifies which subnet our instance should be placed in
- associate_public_ip_address - since we want to SSH to the instance, it needs a public IP
- user_data - specifies our custom script, which runs on the instance right after it is provisioned
- key_name - provides the SSH key we created to the machine
- vpc_security_group_ids - specifies the security groups for the EC2 instance
- iam_instance_profile - attaches the instance profile we create below
- depends_on - delays module creation until its dependencies are created
Using an instance profile for a bastion instance when connecting to an EKS cluster enhances security, access control, and compliance while minimizing the risks associated with broad permissions.
IAM Role for Instance Profile
resource "aws_iam_role" "role" {
name = "testrole"
path = "/"
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "ec2.amazonaws.com"
},
"Effect": "Allow",
"Sid": ""
},
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"AWS": "*"
},
"Action": "sts:AssumeRole"
}
]
}
EOF
}
resource "aws_iam_role_policy" "this" {
name = "web_iam_role_policy"
role = "${aws_iam_role.role.id}"
policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["eks:*"],
"Resource": ["*"]
},
{
"Effect": "Allow",
"Action": ["s3:*"],
"Resource": ["*"]
}
]
}
EOF
}
This part creates an AWS Identity and Access Management (IAM) role with associated policies for an instance profile. Here's a short description of what it does:
- IAM Role Creation: this code defines an IAM role named testrole and specifies the role's path in the IAM hierarchy as the root ("/"). The role's assume role policy allows AWS EC2 instances to assume this role.
- Assume Role Policy: the assume role policy grants EC2 instances permission to assume the role. It specifies that the service "ec2.amazonaws.com" is allowed to assume the role using the "sts:AssumeRole" action. Additionally, it allows any AWS entity ("AWS": "*") to assume the role, which you may want to tighten outside of a demo.
- IAM Role Policy: the code also attaches a policy named "web_iam_role_policy" to the IAM role. The policy grants permissions to perform actions related to Amazon EKS (Elastic Kubernetes Service) using "eks:*" actions, and broad permissions to interact with Amazon S3 using "s3:*" actions. The resource "Resource": ["*"] allows these actions to be performed on any resource.
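If you'd rather not grant eks:* and s3:* on everything, a narrower policy is usually enough for a bastion that only needs to run aws eks update-kubeconfig and talk to the cluster. This is a minimal sketch (the resource name is our own, not part of the original code); adjust the actions to your needs:
resource "aws_iam_role_policy" "bastion_least_privilege" {
  name = "bastion_least_privilege"
  role = aws_iam_role.role.id

  # Only the EKS read actions needed to build a kubeconfig on the bastion
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["eks:DescribeCluster", "eks:ListClusters"]
        Resource = ["*"]
      }
    ]
  })
}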
EKS Module
In this article we will simplify our cluster. For additional details please check our article on Creating EKS using Terraform.
Add the following content to your main.tf file:
locals {
  cluster_name = "${var.env}-eks-${random_string.suffix.result}"
}

resource "random_string" "suffix" {
  length  = 8
  special = false
}

data "external" "current_ipv4" {
  program = ["bash", "-c", "curl ipinfo.io"]
}

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "19.16.0"

  cluster_name    = local.cluster_name
  cluster_version = var.cluster_version

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  cluster_endpoint_public_access       = true
  cluster_endpoint_private_access      = true
  cluster_endpoint_public_access_cidrs = ["${module.ec2-instance.public_ip}/32", "${data.external.current_ipv4.result["ip"]}/32"]

  eks_managed_node_group_defaults = {
    ami_type = var.ami_type
  }

  manage_aws_auth_configmap = true
  aws_auth_roles = [
    {
      rolearn  = aws_iam_role.role.arn
      username = "testrole"
      groups = [
        "system:bootstrappers",
        "system:nodes",
        "system:masters"
      ]
    }
  ]
}
data "external" "current_ipv4"
: this Terraform data source is designed to execute an external program to fetch the public IPv4 address of the system running Terraform
cluster_endpoint_public_access_cidrs
: allows public access to our cluster only from specified cidrs, in our case we allow our bastion host and the system running Terraform
manage_aws_auth_configmap
: The ConfigMap is a Kubernetes resource that defines how AWS IAM roles map to Kubernetes users and groups, allowing you to control access and authorization in your cluster based on IAM roles.
aws_auth_roles
: is configuring the AWS authentication for Kubernetes, allowing our testrole to access the EKS cluster and assigning it to specific Kubernetes groups.
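Later, once the cluster is reachable from the bastion host, you can inspect the mapping that manage_aws_auth_configmap and aws_auth_roles produce (the exact contents depend on your account):
# Show which identity the bastion is using
aws sts get-caller-identity
# Inspect the aws-auth ConfigMap that maps IAM roles to Kubernetes groups
kubectl -n kube-system get configmap aws-auth -o yaml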
Now add this inside your EKS module block:
  # Extend node-to-node security group rules
  node_security_group_additional_rules = {
    ingress_self_all = {
      description = "Node to node all ports/protocols"
      protocol    = "-1"
      from_port   = 0
      to_port     = 0
      type        = "ingress"
      self        = true
    }
    egress_all = {
      description = "Node all egress"
      protocol    = "-1"
      from_port   = 0
      to_port     = 0
      type        = "egress"
      cidr_blocks = ["0.0.0.0/0"]
    }
    ## Enable access from bastion host to Nodes
    ingress_bastion = {
      description              = "Allow access from Bastion Host"
      type                     = "ingress"
      from_port                = 443
      to_port                  = 443
      protocol                 = "tcp"
      source_security_group_id = aws_security_group.ssh.id
    }
  }

  ## Enable access from bastion host to EKS endpoint
  cluster_security_group_additional_rules = {
    ingress_bastion = {
      description              = "Allow access from Bastion Host"
      type                     = "ingress"
      from_port                = 443
      to_port                  = 443
      protocol                 = "tcp"
      source_security_group_id = aws_security_group.ssh.id
    }
  }

  eks_managed_node_groups = {
    on_demand_1 = {
      min_size     = 1
      max_size     = 3
      desired_size = 1

      instance_types = ["t3.small"]
      capacity_type  = "ON_DEMAND"
    }
  }
We are specifying some additional security group rules as well as an EKS Managed Node Group for our cluster.
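Since eks_managed_node_groups is a map, you can declare more than one group. As an illustration only (not part of this article's setup), a hypothetical additional spot-backed group could sit next to on_demand_1 like this:
  eks_managed_node_groups = {
    on_demand_1 = {
      min_size       = 1
      max_size       = 3
      desired_size   = 1
      instance_types = ["t3.small"]
      capacity_type  = "ON_DEMAND"
    }

    # Hypothetical extra group running on spot capacity
    spot_1 = {
      min_size       = 0
      max_size       = 2
      desired_size   = 1
      instance_types = ["t3.small", "t3a.small"]
      capacity_type  = "SPOT"
    }
  }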
Variables and Outputs:
Since we are using variables in our code, don't forget to declare them. Create a variables.tf file:
variable "aws_profile" {
description = "Set this variable if you use another profile besides the default awscli profile called 'default'."
type = string
default = "default"
}
variable "aws_region" {
description = "Set this variable if you use another aws region."
type = string
default = "us-east-1"
}
variable "vpc_name" {
description = "Vpc name that would be created for your cluster"
type = string
default = "EKS_vpc"
}
variable "aws_availability_zones" {
description = "AWS availability zones"
default = ["us-east-1a", "us-east-1b", "us-east-1c"]
}
variable "cidr" {
description = "Cird block for your VPC"
type = string
default = "10.0.0.0/16"
}
variable "env" {
description = "it would be a prefix for you cluster name created, typically specified as dev or test"
type = string
default = "dev"
}
variable "private_subnets" {
description = "private subnets to create, need to have 1 for each AZ"
default = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
}
variable "public_subnets" {
description = "public subnets to create, need to have 1 for each AZ"
default = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
}
variable "cluster_version" {
description = "kubernetes cluster version"
type = string
default = "1.27"
}
variable "ami_type" {
description = "Ami Type for Ec2 instances created in Cluster"
type = string
default = "AL2_x86_64"
}
variable "private_key_name" {
description = "Full path to you ssh folder"
type = string
default = "bastion_key.pem"
}
variable "path" {
description = "Full path to you ssh folder"
type = string
default = "ssh"
}
Pay attention to the path variable; change this value to the folder where your SSH keys are stored.
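If you prefer not to edit variables.tf directly, you can override the defaults in a terraform.tfvars file next to your code; the values below are placeholders, so adjust them to your environment:
aws_region       = "us-east-1"
aws_profile      = "default"
env              = "dev"
path             = "/home/youruser/.ssh" # folder where the bastion key will be written
private_key_name = "bastion_key.pem"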
Let's also create an outputs.tf file, so the needed values will be printed at the end of terraform apply:
output "Connect_to_instance" {
description = "The public IP address assigned to the instance"
value = "ssh -i ${var.path}/${var.private_key_name} ec2-user@${module.ec2-instance.public_ip}"
}
output "cluster_name" {
description = "Name of the Cluster created"
value = module.eks.cluster_name
}
output "EC2_public_ip" {
description = "The public IP address assigned to the instance"
value = module.ec2-instance.public_ip
}
output "current_ipv4_json" {
value = data.external.current_ipv4.result["ip"]
}
- Connect_to_instance: provides a ready-to-use command to connect to your Bastion Host
- cluster_name: name of the cluster created
- EC2_public_ip: public IP of the Bastion Host
- current_ipv4_json: IP of the system running Terraform
You can also print them with the following command once your resources are provisioned:
terraform output
Deploying and Verifying
When our code is ready, we can initialize it. Run:
terraform init
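Optionally, you can preview what will be created before making any changes:
terraform plan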
Then we can run:
terraform apply
After the apply is done, you will see that Terraform created a bastion_key.pem file in our path folder.
Then we can connect to our Bastion Host using the command provided by the output:
ssh -i ~/.ssh/bastion_key.pem ec2-user@<YOUR_INSTANCE_PUBLIC_IP>
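Alternatively, since the full SSH command is exposed as an output, you can run it directly (this assumes Terraform 0.14 or newer for the -raw flag):
eval "$(terraform output -raw Connect_to_instance)"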
Since we ran the user data script, everything is installed and configured to connect to our cluster. After connecting to the Bastion Host, we can verify connectivity between the Bastion Host and the EKS cluster. Run:
kubectl get no
You will see nodes running in your newly created cluster!
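A couple of extra checks from the bastion can confirm that both the control plane and workloads are reachable:
kubectl cluster-info
kubectl get pods -A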
You can find the entire code here