High Availability and Autoscaling
By default, a Gateway is deployed with a single replica, which makes it a single point of failure: any node drain, eviction, or crash can take the Gateway offline.
This guide shows how to run Airlock Microgateway with multiple replicas, spread replicas across nodes, protect them during voluntary disruptions, and (optionally) scale automatically under load.
Prerequisite
Configuration
Run more than one replica
- Set the replica count in your GatewayParameters.
- Ensure your Gateway references these parameters under spec.infrastructure.parametersRef.
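The two steps above might look like the following sketch. Only the paths spec.kubernetes.deployment.replicas and spec.infrastructure.parametersRef are taken from this guide. The GatewayParameters apiVersion, the gatewayClassName, the listener, and the names example-gateway and example-gateway-params are assumptions for illustration and should be checked against the CRD reference for your Microgateway version.

```yaml
# Sketch only: apiVersion, class name, and all resource names are assumptions.
apiVersion: microgateway.airlock.com/v1alpha1   # assumed API group/version
kind: GatewayParameters
metadata:
  name: example-gateway-params                  # hypothetical name
spec:
  kubernetes:
    deployment:
      replicas: 2                               # more than one replica for availability
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: example-gateway                         # hypothetical name
spec:
  gatewayClassName: airlock-microgateway        # assumed class name
  infrastructure:
    parametersRef:                              # links the Gateway to the parameters above
      group: microgateway.airlock.com           # assumed API group
      kind: GatewayParameters
      name: example-gateway-params
  listeners:
    - name: http                                # illustrative listener
      protocol: HTTP
      port: 80
```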
Spread replicas across nodes
Add a topology spread constraint so replicas are distributed across nodes by hostname.
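A constraint of this kind, written with the standard Kubernetes pod-spec fields, might look like the sketch below. Where this fragment nests inside GatewayParameters is not shown here and should be taken from the GatewayParameters reference. The Gateway name example-gateway in the label selector is hypothetical.

```yaml
# Fragment in standard Kubernetes topologySpreadConstraints syntax.
# Its parent path inside GatewayParameters is intentionally omitted (not given in this guide).
topologySpreadConstraints:
  - topologyKey: kubernetes.io/hostname          # one topology domain per node
    maxSkew: 1                                   # pod counts may differ by at most one
    whenUnsatisfiable: ScheduleAnyway            # prefer scheduling over staying pending
    labelSelector:
      matchLabels:
        gateway.networking.k8s.io/gateway-name: example-gateway   # hypothetical Gateway name
```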
- kubernetes.io/hostname treats each node as a separate topology domain.
- maxSkew: 1 allows at most a difference of one pod between any two nodes.
- whenUnsatisfiable: ScheduleAnyway favors availability: if the constraint can’t be perfectly met, pods are still scheduled instead of staying pending.
Protect against disruptions
Create a PodDisruptionBudget (PDB) to ensure a minimum number of Gateway pods remain available during voluntary disruptions (e.g., node drains or cluster upgrades).
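A minimal PDB for this purpose could look like the following sketch. It uses the standard policy/v1 API. The names example-gateway-pdb and example-gateway are hypothetical, and the PDB is assumed to live in the same namespace as the Gateway pods.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-gateway-pdb                      # hypothetical name
spec:
  minAvailable: 1                                # keep at least one Gateway pod available
  selector:
    matchLabels:
      gateway.networking.k8s.io/gateway-name: example-gateway   # must match your Gateway name
```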
- minAvailable: 1 ensures at least one Gateway pod remains running throughout a disruption.
- The selector targets gateway pods by the gateway.networking.k8s.io/gateway-name label (value must match your Gateway name).
Optional: Scale automatically with a HorizontalPodAutoscaler
Use a HorizontalPodAutoscaler (HPA) to scale the Gateway deployment under load.
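An HPA for the Gateway might look like the following sketch, based on the standard autoscaling/v2 API. The HPA name and the target Deployment name example-gateway are assumptions; use the name of the Deployment that was actually created for your Gateway. Note that spec.kubernetes.deployment.replicas in GatewayParameters should be unset when an HPA manages the replica count.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-gateway-hpa                      # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-gateway                        # assumed Deployment name; verify in your cluster
  minReplicas: 2                                 # never drop below two replicas
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70                 # target average CPU utilization in percent
```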
This scales the deployment between 2 and 10 replicas, adding pods when average CPU utilization exceeds 70% and removing them when load drops. Keep minReplicas at 2 to preserve availability.
Further considerations
A highly available Gateway is only as resilient as its dependencies. If you use session handling, sessions are stored in an external Redis database configured through a RedisProvider. A single Redis instance reintroduces a single point of failure.
When session handling is configured, choose a highly available Redis mode via RedisProvider.spec.mode:
- sentinel (failover for a single non-sharded master), or
- cluster (sharding plus failover),
rather than standalone, which has no replication and is only suitable for development.
Validation
- Verify the Gateway deployment runs with at least two replicas.
- Verify replicas are spread across nodes (not all Gateway pods on the same node).
- During a voluntary disruption (e.g., node drain), verify at least one replica stays available.
- If HPA is enabled, generate load and verify the replica count scales within the configured min/max range.
Limitations
If you configure the optional HPA step, the following limitations apply:
- If GatewayParameters.spec.kubernetes.deployment.replicas sets a replica count while an HPA is enabled, the two conflict: the Microgateway Operator and the HPA repeatedly overwrite each other’s desired replica count, so autoscaling does not work reliably.
- When you use an HPA, unset spec.kubernetes.deployment.replicas in GatewayParameters.
- The HPA scales the Gateway horizontally by adding or removing pod replicas. This is the only scaling mode supported by Airlock Microgateway.
- In contrast, a Vertical Pod Autoscaler (VPA) scales vertically by adjusting the CPU and memory requests of individual pods. VPA-based scaling is not supported with Airlock Microgateway and may interfere with the Gateway’s resource management, resulting in unexpected pod restarts. Do not use a VPA with Airlock Microgateway.
- HPA is not supported under an Evaluation License because the Evaluation License enforces a replica restriction that is incompatible with autoscaling. Autoscaling with an HPA cannot be used while evaluating.
- While evaluating, use a fixed replica count, as described in the mandatory configuration steps.