High Availability and Autoscaling

By default, a Gateway is deployed with a single replica, which makes it a single point of failure: any node drain, eviction, or crash can take the Gateway offline.

This guide shows how to run Airlock Microgateway with multiple replicas, spread replicas across nodes, protect them during voluntary disruptions, and (optionally) scale automatically under load.

Prerequisite

Configuration

Run more than one replica

  1. Set the replica count in your GatewayParameters.

    Example
    apiVersion: microgateway.airlock.com/v1alpha1
    kind: GatewayParameters
    metadata:
      name: <your-gateway-parameters-name>
      namespace: <your-gateway-namespace>
    spec:
      kubernetes:
        deployment:
          replicas: 2
  2. Ensure your Gateway references these parameters under spec.infrastructure.parametersRef.

    Example
    apiVersion: gateway.networking.k8s.io/v1
    kind: Gateway
    metadata:
      name: <your-gateway-name>
      namespace: <your-gateway-namespace>
    spec:
      infrastructure:
        parametersRef:
          group: microgateway.airlock.com
          kind: GatewayParameters
          name: <your-gateway-parameters-name>

Spread replicas across nodes

Add a topology spread constraint so replicas are distributed across nodes by hostname.

 
Example
spec:
  kubernetes:
    deployment:
      replicas: 2
      placement:
        topologySpreadConstraints:
          - maxSkew: 1
            topologyKey: kubernetes.io/hostname
            whenUnsatisfiable: ScheduleAnyway
  • kubernetes.io/hostname treats each node as a separate topology domain.
  • maxSkew: 1 allows at most a difference of one pod between any two nodes.
  • whenUnsatisfiable: ScheduleAnyway favors availability: if the constraint can’t be perfectly met, pods are still scheduled instead of staying pending.
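
If strict separation matters more than getting every pod scheduled, the standard Kubernetes alternative is DoNotSchedule. The fragment below is a sketch of that variant. It assumes that GatewayParameters passes whenUnsatisfiable through to the pod spec unchanged, which the example above suggests but which you should verify against the CR reference.

```yaml
spec:
  kubernetes:
    deployment:
      replicas: 2
      placement:
        topologySpreadConstraints:
          - maxSkew: 1
            topologyKey: kubernetes.io/hostname
            # Stricter than ScheduleAnyway: a pod that would violate
            # maxSkew stays Pending instead of being placed anyway.
            whenUnsatisfiable: DoNotSchedule
```

The trade-off is the reverse of ScheduleAnyway: placement is guaranteed to respect the constraint, but a replica may remain unscheduled until a suitable node is available.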

Protect against disruptions

Create a PodDisruptionBudget (PDB) to ensure a minimum number of Gateway pods remain available during voluntary disruptions (e.g., node drains or cluster upgrades).

 
Example
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: <your-gateway-pdb-name>
  namespace: <your-gateway-namespace>
spec:
  minAvailable: 1
  selector:
    matchLabels:
      gateway.networking.k8s.io/gateway-name: <your-gateway-name>
  • minAvailable: 1 ensures at least one Gateway pod remains running throughout a disruption.
  • The selector targets Gateway pods by the gateway.networking.k8s.io/gateway-name label (the value must match your Gateway name).
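
When the replica count can grow well beyond two (for example with the optional HPA), minAvailable: 1 would still permit all but one pod to be evicted at once. A possible alternative, sketched here with the same placeholders, is to limit evictions with maxUnavailable, the other field a PodDisruptionBudget accepts. A PDB may set either minAvailable or maxUnavailable, not both.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: <your-gateway-pdb-name>
  namespace: <your-gateway-namespace>
spec:
  # At most one Gateway pod may be voluntarily evicted at a time,
  # regardless of the current replica count.
  maxUnavailable: 1
  selector:
    matchLabels:
      gateway.networking.k8s.io/gateway-name: <your-gateway-name>
```

With exactly two replicas, both variants behave the same; they differ only at higher replica counts.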

Optional: Scale automatically with a HorizontalPodAutoscaler

Use a HorizontalPodAutoscaler (HPA) to scale the Gateway deployment under load.

 
Example
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: <your-gateway-hpa-name>
  namespace: <your-gateway-namespace>
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: <your-gateway-deployment-name>
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

This scales the deployment between 2 and 10 replicas, adding pods when average CPU utilization rises above the 70% target and removing them when load drops. Utilization is measured relative to the CPU requests of the pods. Keep minReplicas at 2 or higher to preserve availability.
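
If the replica count fluctuates too quickly under bursty traffic, the autoscaling/v2 API offers an optional behavior section to dampen scale-down. The following is a minimal sketch, not a recommendation from the Microgateway documentation; the window and policy values are illustrative assumptions that you would tune for your workload.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: <your-gateway-hpa-name>
  namespace: <your-gateway-namespace>
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: <your-gateway-deployment-name>
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      # Illustrative value: act on the highest recommendation of the
      # last 5 minutes before removing pods.
      stabilizationWindowSeconds: 300
      policies:
        # Illustrative value: remove at most one pod per minute.
        - type: Pods
          value: 1
          periodSeconds: 60
```

Scaling up is left at the Kubernetes defaults here, so the Gateway can still react quickly to rising load.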

Further considerations

A highly available Gateway is only as resilient as its dependencies. If you use session handling, sessions are stored in an external Redis database configured through a RedisProvider. A single Redis instance reintroduces a single point of failure.

When session handling is configured, choose a highly available Redis mode via RedisProvider.spec.mode:

  • sentinel (failover for a single non-sharded master), or
  • cluster (sharding plus failover),

rather than standalone, which has no replication and is only suitable for development.

Validation

  1. Verify the Gateway deployment runs with at least two replicas.
  2. Verify replicas are spread across nodes (not all Gateway pods on the same node).
  3. During a voluntary disruption (e.g., node drain), verify at least one replica stays available.
  4. If HPA is enabled, generate load and verify the replica count scales within the configured min/max range.

Limitations

If you configure the optional HPA step, the following limitations apply:

  • If GatewayParameters.spec.kubernetes.deployment.replicas is set while an HPA is enabled, the two conflict: the Microgateway Operator and the HPA repeatedly overwrite each other’s desired replica count, so autoscaling does not work reliably.
    • When you use an HPA, unset spec.kubernetes.deployment.replicas in GatewayParameters.
  • The HPA scales the Gateway horizontally by adding or removing pod replicas. This is the only scaling mode supported by Airlock Microgateway.
    • In contrast, a Vertical Pod Autoscaler (VPA) scales vertically by adjusting the CPU and memory requests of individual pods. VPA-based scaling is not supported with Airlock Microgateway and may interfere with the Gateway’s resource management, resulting in unexpected pod restarts. Do not use a VPA with Airlock Microgateway.
  • HPA is not supported under an Evaluation License, because the license enforces a replica restriction that is incompatible with autoscaling.
    • While evaluating, use a fixed replica count, as described under "Run more than one replica".
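
To illustrate the first limitation above, a GatewayParameters resource intended for use with an HPA could look like the sketch below. It reuses only fields shown earlier in this guide and leaves out replicas, so that the HPA is the only component setting the replica count.

```yaml
apiVersion: microgateway.airlock.com/v1alpha1
kind: GatewayParameters
metadata:
  name: <your-gateway-parameters-name>
  namespace: <your-gateway-namespace>
spec:
  kubernetes:
    deployment:
      # replicas is intentionally not set: the HPA manages the
      # replica count between its minReplicas and maxReplicas.
      placement:
        topologySpreadConstraints:
          - maxSkew: 1
            topologyKey: kubernetes.io/hostname
            whenUnsatisfiable: ScheduleAnyway
```

The lower bound for availability then comes from minReplicas in the HPA rather than from GatewayParameters.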

CR reference documentation