As modern applications grow more complex, ensuring they can handle varying workloads efficiently becomes crucial. With its Horizontal Pod Autoscaler (HPA), Kubernetes has been the go-to solution for many organizations to scale applications based on CPU and memory usage. However, as demands evolve, so does the need for more sophisticated autoscaling capabilities, especially when custom metrics come into play. This is where Kubernetes Event-Driven Autoscaling (KEDA) shines. In this guide, we’ll explore the limitations of HPA, the advantages of KEDA, and how to get started with KEDA to scale your Kubernetes deployments effectively.

Introduction: HPA Limitations and the Need for KEDA

The Horizontal Pod Autoscaler (HPA) is a powerful tool in Kubernetes, enabling automatic pod scaling based on CPU and memory metrics. However, in today’s cloud-native world, where microservices and event-driven architectures are prevalent, relying solely on these metrics is often insufficient. Many modern applications require scaling based on custom metrics like message queue lengths, HTTP request counts, or cloud service metrics like AWS CloudWatch alarms.

HPA’s limitations become apparent when scaling based on these non-traditional metrics. Without native support for these metrics, implementing custom solutions becomes cumbersome and error-prone. This gap in HPA’s capabilities led to the development of KEDA, a Kubernetes-native solution that seamlessly integrates with external metric sources to trigger autoscaling.

KEDA: A Kubernetes-Native Solution for Custom Metric Autoscaling

KEDA (Kubernetes Event-Driven Autoscaling) extends Kubernetes’ capabilities by enabling event-driven autoscaling based on custom metrics. It’s designed to work alongside the existing HPA, allowing you to scale pods not only on CPU and memory but also on external metrics such as message queue depth, database activity, and cloud service metrics.

KEDA introduces a new resource called ScaledObject, which defines how a Kubernetes deployment should scale based on specific metrics. These ScaledObjects can monitor various metric sources, and when a certain threshold is met, KEDA triggers the autoscaling process. This approach provides a more flexible and powerful autoscaling mechanism, allowing applications to scale dynamically based on real-world demands.
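
To make this concrete, here is the general shape of a minimal ScaledObject. Every name below is a placeholder, and complete, working examples follow later in this guide:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-scaledobject
spec:
  scaleTargetRef:
    name: example-deployment   # the Deployment to scale
  minReplicaCount: 1           # floor for replicas
  maxReplicaCount: 10          # ceiling for replicas
  triggers:
    - type: aws-cloudwatch     # any supported scaler type
      metadata: {}             # scaler-specific settings (see examples below)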

Installing KEDA: Step-by-Step Deployment Instructions

Installing KEDA on your Kubernetes cluster is straightforward. Below are the steps to get KEDA up and running:

  1. Add the KEDA Helm Repository:

    helm repo add kedacore https://kedacore.github.io/charts
    helm repo update

  2. Install KEDA using Helm:

    helm install keda kedacore/keda --namespace keda --create-namespace
  3. Verify the Installation: Check that KEDA components are running:

    kubectl get pods -n keda
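
You can also confirm that the KEDA custom resource definitions (ScaledObject, ScaledJob, TriggerAuthentication) were registered in the cluster:

    kubectl get crd | grep keda.sh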

Once installed, KEDA is ready to manage the autoscaling of your Kubernetes deployments based on custom metrics.

Configuring IAM Roles for Secure Kubernetes Service Accounts

Security is paramount when dealing with cloud-native applications. KEDA interacts with external services, such as AWS CloudWatch, which requires proper IAM role configuration. To securely integrate KEDA with AWS, you must configure IAM roles for your Kubernetes service accounts.

  1. Create an IAM Role with Required Policies: Create an IAM role with the necessary permissions to access the AWS services you scale on; a minimal example policy is sketched after these steps. Attach the role to your Kubernetes service account.
  2. Associate the IAM Role with the Service Account: Use the following annotation on your Kubernetes service account:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: keda-service-account
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<account-id>:role/<role-name>

  3. Deploy the Service Account: Apply the service account configuration to your cluster:

    kubectl apply -f service-account.yaml

This setup ensures your KEDA instance has the necessary permissions to interact with AWS securely.
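
For reference, a minimal permissions policy for the CloudWatch-based triggers used later in this guide might look like the sketch below; treat the action list as an assumption and tighten it to the specific triggers and resources you actually use:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cloudwatch:GetMetricData",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:ListMetrics"
      ],
      "Resource": "*"
    }
  ]
}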

Defining ScaledObjects: Triggering Autoscaling with Custom Metrics

A ScaledObject in KEDA defines the source of the metric and the criteria for scaling your Kubernetes deployment. Below is an example of how to define a ScaledObject for autoscaling based on AWS CloudWatch metrics:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cloudwatch-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    kind: Deployment
    name: my-deployment
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: aws-cloudwatch
      metadata:
        namespace: "AWS/Lambda"
        dimensionName: "FunctionName"
        dimensionValue: "<function-name>"
        metricName: "Invocations"
        targetMetricValue: "100"
        minMetricValue: "0"
        awsRegion: "us-west-2"

This configuration tells KEDA to monitor AWS CloudWatch for Lambda invocation metrics and scale the deployment between 1 and 10 replicas based on the number of invocations.
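
Assuming the manifest above is saved as cloudwatch-scaledobject.yaml (the filename is arbitrary), you can apply it and inspect the result:

    kubectl apply -f cloudwatch-scaledobject.yaml
    kubectl get scaledobject cloudwatch-scaledobject -n default

Behind the scenes, KEDA creates and manages a regular HPA for the target deployment, so kubectl get hpa -n default will show the generated autoscaler.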

Example ScaledObject Configuration: Autoscaling with AWS CloudWatch

Let’s explore a practical example of using KEDA with AWS CloudWatch. Suppose you have a microservice that processes tasks from an SQS queue. You want to scale the service based on the queue’s length.

  1. IAM Role Configuration: Ensure the IAM role attached to KEDA has permission to access the SQS and CloudWatch services.
  2. ScaledObject Definition: Here’s an example ScaledObject for scaling based on SQS queue length:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sqs-queue-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    kind: Deployment
    name: sqs-consumer
  minReplicaCount: 1
  maxReplicaCount: 5
  triggers:
    - type: aws-cloudwatch
      metadata:
        namespace: "AWS/SQS"
        dimensionName: "QueueName"
        dimensionValue: "<queue-name>"   # CloudWatch identifies the queue by name, not URL
        metricName: "ApproximateNumberOfMessagesVisible"
        targetMetricValue: "10"
        minMetricValue: "0"
        awsRegion: "us-east-1"

This setup will automatically scale the number of sqs-consumer pods based on the visible messages in the SQS queue.
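
To have the trigger authenticate through the IRSA-annotated service account from earlier rather than static credentials, KEDA provides a TriggerAuthentication resource with pod identity. A minimal sketch, assuming the names used earlier in this guide (depending on your KEDA version, the provider is aws-eks or the newer aws):

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-aws-credentials
  namespace: default
spec:
  podIdentity:
    provider: aws-eks   # use the IAM role bound to the pod's service account

Each trigger in a ScaledObject can then reference this resource by adding an authenticationRef with the TriggerAuthentication’s name alongside the trigger’s type and metadata.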

Testing KEDA Autoscaling: Generating Load and Monitoring Results

Testing the autoscaling functionality is essential to ensure your configuration works as expected. Follow these steps to simulate load and observe KEDA in action:

  1. Generate Load: Push a large batch of messages to the SQS queue, or simulate a burst of Lambda invocations, to trigger the scaling process; a simple load-generation loop is sketched after these steps.
  2. Monitor Autoscaling: Use Kubernetes tools like kubectl to monitor the scaling behavior:

    kubectl get pods -w

  3. Check CloudWatch Metrics: Verify that the CloudWatch metrics align with the scaling behavior observed in Kubernetes.
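
For the SQS example, a simple way to generate load from a shell, assuming the AWS CLI is configured and <queue-url> is replaced with your queue’s URL:

    for i in $(seq 1 200); do
      aws sqs send-message --queue-url "<queue-url>" --message-body "test-message-$i"
    done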

This testing phase will help you confirm that KEDA effectively manages your Kubernetes deployments based on custom metrics.

Conclusion

KEDA is a powerful tool that extends Kubernetes autoscaling capabilities beyond CPU and memory, allowing you to scale deployments based on custom metrics from a wide range of sources. By integrating KEDA into your Kubernetes environment, you can achieve more responsive, efficient, and cost-effective scaling for cloud-native applications.
