Kubernetes Mastery

Develop and Deploy Cloud Native Applications at Scale

Horizontal Pod Autoscaler

Resources Folder Commands

  1. Mac zsh command:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml && kubectl patch deployment metrics-server -n kube-system --type='json' -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value":"--kubelet-insecure-tls"}]'
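  2. Windows PowerShell command (the && operator requires PowerShell 7 or later; in Windows PowerShell 5.1, run the two kubectl commands one at a time):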
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml && kubectl patch deployment metrics-server -n kube-system --type='json' -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value":"--kubelet-insecure-tls"}]'
  3. Windows CMD commands:
If you are using Windows CMD, you will need to enter these two commands separately. The first command installs all the necessary components for the Metrics Server:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml


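The second command patches the metrics-server Deployment, appending the --kubelet-insecure-tls flag to the container's arguments. This flag tells the Metrics Server not to verify the kubelets' TLS certificates, which is usually needed on local or test clusters that use self-signed certificates: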

kubectl patch deployment metrics-server -n kube-system --type="json" -p="[{\"op\": \"add\", \"path\": \"/spec/template/spec/containers/0/args/-\", \"value\":\"--kubelet-insecure-tls\"}]"

Key Takeaways

The Horizontal Pod Autoscaler (HPA) automatically scales the number of pods in a Deployment based on observed metrics, most commonly CPU utilization.

Key Components of an HPA Configuration

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: grade-submission-portal-hpa
  namespace: grade-submission
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: grade-submission-portal
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

Key HPA Features:

  • Autoscaling Behavior: The HPA increases or decreases the number of replicas to keep average CPU utilization close to the target.
  • Scaling Range: The replica count always stays between minReplicas (1) and maxReplicas (10), whatever the CPU utilization.
  • Resource Metrics: While this example uses CPU, HPAs can also use memory or custom metrics.
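  • Target Utilization: Utilization is measured as a percentage of the CPU each pod requests, so the target containers must declare a CPU request. A 50% target is a common starting point and can be adjusted to the application's needs.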
  • Namespace Scoping: An HPA is a namespaced resource and must be created in the same namespace as the Deployment it targets (here, grade-submission), which allows isolated scaling policies across different parts of your application.
  • Scaling Algorithm: Kubernetes runs a control loop (every 15 seconds by default) that compares the observed metrics with the target and adjusts the number of replicas accordingly.
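
The scaling decision made by the control loop can be sketched in a few lines of Python. This is a simplified illustration, not the controller's actual code: the function name desired_replicas and its parameters are hypothetical, while the formula and the 10% tolerance follow the defaults described in the Kubernetes documentation. The real controller also handles stabilization windows, pods that are not ready, and missing metrics, all of which are omitted here.

```python
import math


def desired_replicas(current_replicas, current_utilization,
                     target_utilization=50, min_replicas=1,
                     max_replicas=10, tolerance=0.1):
    """Estimate the replica count an HPA would aim for (simplified sketch)."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count unchanged.
        desired = current_replicas
    else:
        # Documented formula: ceil(currentReplicas * current / target).
        desired = math.ceil(current_replicas * current_utilization
                            / target_utilization)
    # Clamp to the minReplicas..maxReplicas range from the manifest.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    print(desired_replicas(2, 90))   # 4: scale up, ceil(2 * 90 / 50)
    print(desired_replicas(4, 20))   # 2: scale down, ceil(4 * 20 / 50)
    print(desired_replicas(8, 100))  # 10: capped at maxReplicas
    print(desired_replicas(3, 52))   # 3: within tolerance, no change
```

The defaults mirror the manifest above (50% target, 1 to 10 replicas), so two pods averaging 90% CPU would be scaled to four.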

Understanding and effectively configuring HPAs is crucial for building scalable and efficient applications in Kubernetes, ensuring optimal resource utilization and performance under varying loads.