Introduction to EKS Cluster Autoscaling
Amazon Elastic Kubernetes Service (EKS) is a powerful tool for managing Kubernetes clusters in the cloud. Paired with the Kubernetes Cluster Autoscaler, it can dynamically adjust the number of nodes in a cluster based on the resource demands of your workloads. For organizations aiming to minimize costs, particularly with intermittent workloads, configuring Cluster Autoscaler to scale self-managed nodegroups to zero is a game-changer.
In this guide, we’ll explore how to leverage EKS Cluster Autoscaler for self-managed node groups, enabling zero-node scaling to save costs when workloads are idle.
Understanding Self-Managed Nodegroups
Unlike managed nodegroups, self-managed nodegroups offer greater control over instance configurations, operating systems, and scaling policies. However, this flexibility comes with the responsibility of configuring and maintaining the underlying Auto Scaling Groups (ASGs).
Critical Differences Between Managed and Self-Managed Nodegroups:
- Control: Self-managed nodegroups allow custom AMIs and configurations.
- Scaling Behavior: You must explicitly configure the ASGs and permissions for scaling to zero.
- Cost Optimization: Scaling self-managed nodegroups to zero ensures no compute costs during idle periods.
Why Scale to Zero?
Scaling to zero means shutting down all nodes in an ASG when no workloads require resources. This strategy:
- Eliminates costs for idle compute resources.
- Optimizes cloud expenditures for dynamic and bursty workloads.
- Enables better resource utilization for production and development environments.
Setting Up the Environment
Prerequisites
- AWS CLI and eksctl installed and configured.
- An AWS account with necessary IAM permissions.
- kubectl installed for cluster management.
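Before proceeding, you can quickly verify the tooling is in place:
# Confirm the required CLIs are on your PATH and report their versions
aws --version
eksctl version
kubectl version --client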
Creating the EKS Cluster with Managed and Self-Managed Nodegroups
Use eksctl to create the cluster with a managed nodegroup, then add the self-managed nodegroup in a second step (a single eksctl create cluster invocation accepts flags for only one nodegroup):
eksctl create cluster \
  --name zero-scaling-cluster \
  --version 1.27 \
  --region us-west-2 \
  --nodegroup-name managed-ng \
  --nodes 2 \
  --nodes-min 1 \
  --nodes-max 3 \
  --node-type t3.medium \
  --asg-access

eksctl create nodegroup \
  --cluster zero-scaling-cluster \
  --region us-west-2 \
  --name self-managed-ng \
  --managed=false \
  --nodes 0 \
  --nodes-min 0 \
  --nodes-max 5 \
  --node-type t3.micro \
  --asg-access
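Equivalently, both nodegroups can be declared in a single ClusterConfig file and kept in version control. A minimal sketch using the same names as the commands above (withAddonPolicies.autoScaler grants the nodes the same ASG permissions as --asg-access):
# cluster.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: zero-scaling-cluster
  region: us-west-2
  version: "1.27"
managedNodeGroups:
  - name: managed-ng
    instanceType: t3.medium
    desiredCapacity: 2
    minSize: 1
    maxSize: 3
    iam:
      withAddonPolicies:
        autoScaler: true
nodeGroups: # self-managed nodegroups
  - name: self-managed-ng
    instanceType: t3.micro
    desiredCapacity: 0
    minSize: 0
    maxSize: 5
    iam:
      withAddonPolicies:
        autoScaler: true
Create everything from the file with eksctl create cluster -f cluster.yaml.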
Tagging for Auto Scaling Group Discovery
Cluster Autoscaler relies on specific tags to discover which ASGs it is allowed to scale. Add the following tags to your self-managed nodegroup's ASG, replacing <CLUSTER_NAME> with your cluster's name:
- Key: k8s.io/cluster-autoscaler/enabled, Value: true
- Key: k8s.io/cluster-autoscaler/<CLUSTER_NAME>, Value: owned
aws autoscaling create-or-update-tags --tags \
  "ResourceId=<ASG_NAME>,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=true" \
  "ResourceId=<ASG_NAME>,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/<CLUSTER_NAME>,Value=owned,PropagateAtLaunch=true"
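When the ASG is at zero there is no live node for Cluster Autoscaler to inspect, so pods that rely on a nodeSelector may never trigger a scale-up. You can describe the node shape up front with node-template tags on the ASG. A sketch, where the nodegroup=self-managed-ng label is only an illustration; use whatever label your workloads actually select on:
aws autoscaling create-or-update-tags --tags \
  "ResourceId=<ASG_NAME>,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/node-template/label/nodegroup,Value=self-managed-ng,PropagateAtLaunch=true"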
Configuring Permissions for Cluster Autoscaler
The IAM role used by the Cluster Autoscaler pod (commonly the worker node role unless you use IRSA) must include permissions for scaling actions. The launch configuration and launch template read permissions matter specifically for scale-from-zero: with no running node to inspect, Cluster Autoscaler reconstructs a prospective node's capacity from the ASG's launch template or launch configuration. Attach the following policy to the worker node role:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "ec2:DescribeInstanceTypes",
        "ec2:DescribeLaunchTemplateVersions",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup"
      ],
      "Resource": "*"
    }
  ]
}
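One way to attach the policy, assuming it is saved as cluster-autoscaler-policy.json and that <NODE_ROLE_NAME> stands in for your worker node role (both names are placeholders):
aws iam put-role-policy \
  --role-name <NODE_ROLE_NAME> \
  --policy-name ClusterAutoscalerPolicy \
  --policy-document file://cluster-autoscaler-policy.json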
Deploying Cluster Autoscaler
- Install Cluster Autoscaler via Helm:
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm repo update
helm install cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system \
  --set cloudProvider=aws \
  --set awsRegion=us-west-2 \
  --set autoDiscovery.clusterName=zero-scaling-cluster
- Make sure the --scale-down-enabled=true and --balance-similar-node-groups flags are set. Rather than editing the deployment by hand, you can pass them through the chart's extraArgs values, as shown below.
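A sketch of setting those flags via extraArgs, the chart's mechanism for passing arbitrary autoscaler flags (assuming the release name used above):
helm upgrade cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system \
  --reuse-values \
  --set extraArgs.scale-down-enabled=true \
  --set extraArgs.balance-similar-node-groups=true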
Testing Scaling to and from Zero
Test Scenarios:
- Scale to zero:
- Deploy a short-lived workload, then let it finish or delete it.
- Confirm the ASG scales down to zero once no pods need its nodes (after the scale-down delay, 10 minutes by default).
- Scale from zero:
- Deploy a workload whose resource requests exceed the current node capacity; see the example Job after this list.
- Verify the ASG scales up and the pending pods are scheduled.
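A test workload along these lines covers both scenarios. This is a sketch; the image, pod count, and resource requests are arbitrary, sized so the pods cannot all fit on the existing nodes:
# scale-test-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: scale-test
spec:
  parallelism: 4
  completions: 4
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: busy
          image: busybox:1.36
          # Run briefly, then exit so the cluster can scale back down
          command: ["sh", "-c", "sleep 120"]
          resources:
            requests:
              cpu: "500m"
              memory: "256Mi"
Apply it with kubectl apply -f scale-test-job.yaml, watch the Pending pods trigger a scale-up, and confirm the ASG returns to zero a while after the Job completes.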
Monitor scaling behavior using the Kubernetes dashboard or kubectl commands:
kubectl get nodes
kubectl describe pod <POD_NAME>
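The autoscaler's own logs show each scale-up and scale-down decision. Assuming the Helm release name used above, its pod can be found via the chart's standard Helm instance label:
kubectl -n kube-system logs -l app.kubernetes.io/instance=cluster-autoscaler --tail=50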
Cleanup and Conclusion
Cleanup
- Delete the deployed Cluster Autoscaler:
helm uninstall cluster-autoscaler --namespace kube-system
- Delete the EKS cluster:
eksctl delete cluster --name zero-scaling-cluster
Conclusion
Configuring EKS Cluster Autoscaler for self-managed node groups allows you to achieve cost savings without compromising resource availability. This approach is especially beneficial for dynamic workloads, enabling you to scale efficiently and minimize idle resource costs.