High Availability and Autoscaling

By default, a Gateway is deployed with a single replica, which makes it a single point of failure: any node drain, eviction, or crash can take the Gateway offline.

This guide shows how to run Airlock Microgateway with multiple replicas, spread replicas across nodes, protect them during voluntary disruptions, and (optionally) scale automatically under load.

Prerequisite

A Gateway deployed with Airlock Microgateway.

Configuration

Run more than one replica

Set the replica count in your GatewayParameters.

Example

apiVersion: microgateway.airlock.com/v1alpha1
kind: GatewayParameters
metadata:
  name: <your-gateway-parameters-name>
  namespace: <your-gateway-namespace>
spec:
  kubernetes:
    deployment:
      replicas: 2

Ensure your Gateway references these parameters under spec.infrastructure.parametersRef.

Example

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: <your-gateway-name>
  namespace: <your-gateway-namespace>
spec:
  infrastructure:
    parametersRef:
      group: microgateway.airlock.com
      kind: GatewayParameters
      name: <your-gateway-parameters-name>

Spread replicas across nodes

Add a topology spread constraint so replicas are distributed across nodes by hostname.

Example

spec:
  kubernetes:
    deployment:
      replicas: 2
      placement:
        topologySpreadConstraints:
          - maxSkew: 1
            topologyKey: kubernetes.io/hostname
            whenUnsatisfiable: ScheduleAnyway

kubernetes.io/hostname treats each node as a separate topology domain.
maxSkew: 1 allows at most a difference of one pod between any two nodes.
whenUnsatisfiable: ScheduleAnyway favors availability: if the constraint can’t be perfectly met, pods are still scheduled instead of staying pending.
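
If the cluster spans several availability zones, the same mechanism can also express a preference for zone-level spreading. The following is a minimal sketch, not a verified configuration. It assumes that your nodes carry the well-known topology.kubernetes.io/zone label and that the topologySpreadConstraints list accepts more than one entry in the standard Kubernetes form.

```yaml
spec:
  kubernetes:
    deployment:
      replicas: 2
      placement:
        topologySpreadConstraints:
          # Spread across nodes, as in the example above.
          - maxSkew: 1
            topologyKey: kubernetes.io/hostname
            whenUnsatisfiable: ScheduleAnyway
          # Assumption: nodes are labeled with topology.kubernetes.io/zone.
          # Additionally prefer an even distribution across zones.
          - maxSkew: 1
            topologyKey: topology.kubernetes.io/zone
            whenUnsatisfiable: ScheduleAnyway
```

Both constraints use ScheduleAnyway, so they remain soft preferences and do not leave pods pending in a single-zone cluster.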

Protect against disruptions

Create a PodDisruptionBudget (PDB) to ensure a minimum number of Gateway pods remain available during voluntary disruptions (e.g., node drains or cluster upgrades).

Example

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: <your-gateway-pdb-name>
  namespace: <your-gateway-namespace>
spec:
  minAvailable: 1
  selector:
    matchLabels:
      gateway.networking.k8s.io/gateway-name: <your-gateway-name>

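minAvailable: 1 ensures that at least one Gateway pod remains available throughout a voluntary disruption; evictions that would drop the count below one are refused until another replica is ready.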
The selector targets gateway pods by the gateway.networking.k8s.io/gateway-name label (value must match your Gateway name).
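
As an alternative to the PDB above, the budget can be expressed in terms of how many pods may be unavailable at once. This can be easier to reason about when the replica count changes over time, for example under an HPA. The sketch below uses the standard maxUnavailable field of policy/v1; the value 1 is an illustrative choice, not a recommendation from the product documentation.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: <your-gateway-pdb-name>
  namespace: <your-gateway-namespace>
spec:
  # Used instead of minAvailable; a PDB accepts only one of the two fields.
  # Illustrative value: allow at most one Gateway pod to be evicted at a time.
  maxUnavailable: 1
  selector:
    matchLabels:
      gateway.networking.k8s.io/gateway-name: <your-gateway-name>
```

With two replicas, this variant behaves the same as minAvailable: 1. With more replicas, it keeps all but one pod available during a drain.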

Optional: Scale automatically with a HorizontalPodAutoscaler

Use a HorizontalPodAutoscaler (HPA) to scale the Gateway deployment under load.

Example

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: <your-gateway-hpa-name>
  namespace: <your-gateway-namespace>
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: <your-gateway-deployment-name>
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

This scales the deployment between 2 and 10 replicas. The HPA adds pods when the average CPU utilization of the Gateway pods, measured relative to their CPU requests, exceeds 70%, and removes pods when load drops. Keep minReplicas at 2 or higher to preserve availability.
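
If the replica count fluctuates under bursty traffic, the behavior field of autoscaling/v2 can slow down scale-in. The following sketch extends the HPA above. The window and policy values are illustrative assumptions and should be tuned to your traffic pattern.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: <your-gateway-hpa-name>
  namespace: <your-gateway-namespace>
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: <your-gateway-deployment-name>
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      # Illustrative: 300 seconds is also the Kubernetes default window.
      stabilizationWindowSeconds: 300
      policies:
        # Illustrative: remove at most one pod per minute.
        - type: Pods
          value: 1
          periodSeconds: 60
```

Scale-out is left at the Kubernetes defaults here, so the Gateway can still react quickly to rising load.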

Further considerations

A highly available Gateway is only as resilient as its dependencies. If you use session handling, sessions are stored in an external Redis database configured through a RedisProvider. A single Redis instance reintroduces a single point of failure.

When session handling is configured, choose a highly available Redis mode via RedisProvider.spec.mode:

sentinel (failover for a single non-sharded master), or
cluster (sharding plus failover),

rather than standalone, which has no replication and is only suitable for development.

Validation

Verify the Gateway deployment runs with at least two replicas.
Verify replicas are spread across nodes (not all Gateway pods on the same node).
During a voluntary disruption (e.g., node drain), verify at least one replica stays available.
If HPA is enabled, generate load and verify the replica count scales within the configured min/max range.

Limitations

If you configure the optional HPA step, the following limitations apply:

If GatewayParameters.spec.kubernetes.deployment.replicas is set while an HPA is enabled, the two conflict: the Microgateway Operator and the HPA repeatedly overwrite each other’s desired replica count, so autoscaling does not work reliably.
- When you use an HPA, unset spec.kubernetes.deployment.replicas in GatewayParameters.
The HPA scales the Gateway horizontally by adding or removing pod replicas. This is the only scaling mode supported by Airlock Microgateway.
- In contrast, a Vertical Pod Autoscaler (VPA) scales vertically by adjusting the CPU and memory requests of individual pods. VPA-based scaling is not supported with Airlock Microgateway and may interfere with the Gateway’s resource management, resulting in unexpected pod restarts. Do not use a VPA with Airlock Microgateway.
HPA is not supported under an Evaluation License, because that license enforces a replica restriction that is incompatible with autoscaling.
- While evaluating, use a fixed replica count, as described under "Run more than one replica".
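
To illustrate the first limitation above, a GatewayParameters resource prepared for use with an HPA could look like the following sketch. It reuses the fields shown earlier in this guide and leaves out replicas, so that the HPA alone determines the replica count.

```yaml
apiVersion: microgateway.airlock.com/v1alpha1
kind: GatewayParameters
metadata:
  name: <your-gateway-parameters-name>
  namespace: <your-gateway-namespace>
spec:
  kubernetes:
    deployment:
      # replicas is intentionally omitted: the HPA manages the replica count.
      placement:
        topologySpreadConstraints:
          - maxSkew: 1
            topologyKey: kubernetes.io/hostname
            whenUnsatisfiable: ScheduleAnyway
```

The lower bound on availability then comes from minReplicas in the HPA rather than from GatewayParameters.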