Performance tuning and scaling best practices
Various aspects influence the performance and scalability of an Airlock IAM system, depending on the precise usage scenario. This article highlights the most common bottlenecks.
Audit log
In Airlock IAM, audit log signing is disabled by default. The signature ensures that manipulations of log messages are detectable. The generation of log sequence numbers and digital signatures limits scaling with the number of CPU cores. Disabling audit log signing can speed up the whole system significantly. With signing disabled, audit log messages are still written but are no longer signed.
To change audit log signing:
- Open the instance configuration (e.g. instances/auth/instance.properties) in an editor.
- To disable, set the property “iam.audit-log.signing.enabled” to false.
- To enable, set the property “iam.audit-log.signing.enabled” to true.
- Restart the IAM instance to activate the changes.
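The property change described above can also be scripted. The following is a minimal sketch, assuming the instance configuration is a plain key=value properties file. The helper name set_property is hypothetical and not part of Airlock IAM. It works on the file content as a string, so it can be tried without touching a real configuration.

```python
import re


def set_property(text: str, key: str, value: str) -> str:
    """Return properties text with `key` set to `value`.

    An existing, uncommented entry is replaced in place; otherwise the
    entry is appended. Assumes a simple key=value properties format.
    """
    pattern = re.compile(
        r"^[ \t]*" + re.escape(key) + r"[ \t]*[=:].*$", re.MULTILINE
    )
    line = f"{key}={value}"
    if pattern.search(text):
        # A callable replacement avoids backslash interpretation in `line`.
        return pattern.sub(lambda _match: line, text)
    separator = "" if text == "" or text.endswith("\n") else "\n"
    return text + separator + line + "\n"


# Example with in-memory content (the surrounding lines are made up).
example = "# instance config\niam.audit-log.signing.enabled=true\nother=1\n"
print(set_property(example, "iam.audit-log.signing.enabled", "false"))
```

To apply it to a real instance, read the instance configuration file, pass its content through the helper, write the result back, and then restart the IAM instance as described above.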
Database selection
The Data sources (databases, directories) section lists the supported databases.
Choose your database according to your requirements:
- Some of them are known to scale well with the number of users or tokens.
- Others, like the H2 database, are fine for smaller scenarios only.
The supported H2 database has limitations. We therefore recommend using it only for testing purposes and for small production scenarios with a few hundred users.
Limitations of the H2 database:
- H2 uses only one index per logical table at a time in each query.
- The H2 cluster feature imposes additional limitations on performance (result sets are completely loaded into memory).
Database monitoring
The chosen database, as well as the database host, has to be monitored to detect unexpectedly increasing system load, a high proportion of I/O wait, or inefficient queries due to missing indexes.
Database indices
To scale well, a database needs indices that are based on the most frequent use cases. Since Airlock IAM is highly configurable regarding database fields, the indices have to be set with the knowledge of the field configuration and usage.
IAM defines indices and primary keys in the database schema setup scripts. The database schemas are not updated automatically when updating the Airlock IAM software. Changes to the database schemas have to be performed manually.
The schema files and the schema update files per database can be found here:
Data sources (databases, directories)
Database statistics
DB systems use optimizers to tune queries on their own. The decisions of the query optimizer are based on database usage statistics. The statistics have to be up-to-date to result in the best decisions.
We strongly recommend configuring the DB system to create new statistics regularly.
Database connection pool
Connections from Airlock IAM to the DB are handled by connection pools.
The idea is to run only as many parallel requests in the database as it can handle at a time:
- Using too few DB connections will result in a bottleneck (wait events in IAM).
- Using too many DB connections can result in DB wait events and lower the overall performance as well.
Each configured database connection has its own pool size. Each module (Loginapp, Adminapp, SOAP Interface, Service Container, Transaction Approval) uses a separate connection pool for each connection pool configuration (e.g. pool size). The Loginapp typically needs more database connections than the Adminapp.
As a rule of thumb, the developers of the “Hikari” connection pool recommend limiting the number of connections to twice the number of CPU cores of the database server plus the number of disk spindles.
See https://github.com/brettwooldridge/HikariCP/wiki/About-Pool-Sizing for more information about connection pool sizing.
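The rule of thumb above can be written as a small calculation. This is a sketch only: the function name and the example numbers are illustrative assumptions, and the result is a starting point for load testing rather than a fixed setting. Because each module uses its own pool, the sketch also compares the sum of some hypothetical per-module pool sizes with the suggested limit.

```python
def suggested_max_connections(db_cpu_cores: int, disk_spindles: int) -> int:
    """Rule of thumb: twice the DB server's CPU cores plus its disk spindles."""
    if db_cpu_cores < 1 or disk_spindles < 0:
        raise ValueError("need at least one core and a non-negative spindle count")
    return 2 * db_cpu_cores + disk_spindles


# Hypothetical database server with 8 cores and 1 spindle.
limit = suggested_max_connections(db_cpu_cores=8, disk_spindles=1)

# Hypothetical pool sizes per module; the values are made up for illustration.
pool_sizes = {"Loginapp": 10, "Adminapp": 4}
total = sum(pool_sizes.values())

print(f"suggested limit: {limit}, configured total: {total}")
if total > limit:
    print("configured pools exceed the rule-of-thumb limit")
```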
Current JDBC driver
Database vendors provide JDBC drivers for specific versions of their databases. JDBC drivers are not bundled with Airlock IAM and must be downloaded and deployed separately. It is essential to use the latest JDBC driver that matches the database version currently in use. Unexpected behavior including degraded performance can result from an inappropriate driver version.
Persistency plugins
The choice of persistency plugins influences performance:
- Whenever possible, favor User Store plugins over User Persister plugins. This is especially true for the user management in the Adminapp.
- If combining multiple persistence systems (e.g. the IAM database and an external system via a custom plugin), minimize calls to potentially slow external systems. Typically, not all calls in all plugins need to involve external persistence systems.
Hash algorithms
Some hash algorithms, such as scrypt, are expensive by design in order to make brute-force attacks on hash databases infeasible, i.e., to increase the security of the hashed data. On the other hand, choosing an expensive hash algorithm limits the number of requests that a system is able to process per time unit. Other hash algorithms, such as SHA-256, are fast and therefore easier to brute-force.
Airlock IAM uses hashes for various use cases. Depending on the use case a different hash algorithm is suitable.
We recommend:
- Argon2id or scrypt for passwords, secret questions, and IAK letters.
- SHA-256 for OAuth, OpenID Connect, and matrix cards.
See Selection and parametrization of hash functions for further information and tuning options.
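The cost difference between a fast and a deliberately expensive hash can be observed with Python's standard hashlib module. This is an illustrative sketch and not how Airlock IAM computes hashes: the scrypt parameters (n, r, p) and the sample password are assumptions chosen for the demonstration, and absolute timings depend on the hardware. It requires a Python build with OpenSSL scrypt support.

```python
import hashlib
import os
import time


def best_time(fn, repeats: int = 3) -> float:
    """Return the shortest wall-clock duration of `fn` over a few runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


password = b"example-password"  # sample input, for illustration only
salt = os.urandom(16)

# Fast general-purpose hash.
sha_seconds = best_time(lambda: hashlib.sha256(salt + password).digest())

# Expensive, memory-hard hash; parameters are assumptions for this demo.
scrypt_seconds = best_time(
    lambda: hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=32)
)

print(f"SHA-256: {sha_seconds * 1000:.4f} ms per hash")
print(f"scrypt:  {scrypt_seconds * 1000:.4f} ms per hash")
```

The scrypt timing is typically several orders of magnitude higher, which is the property that slows down brute-force attacks and, at the same time, limits the number of password checks per second.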
Keep your IAM release version up to date
New Airlock IAM versions often bring additional performance optimizations. We recommend checking for newer versions and updating/upgrading on a regular basis.
Adjust file descriptor limits in systemd service files
For Docker deployments, this information does not apply. Please refer to the Docker documentation or the documentation of your orchestration software (Kubernetes, OpenShift, etc.).
Under rare circumstances, such as during SAML performance and integration testing, a large peak in the number of requests can cause “Too many open files” exceptions in the IAM log files. In this case, the currently configured file descriptor limit has been exceeded and can be raised for the corresponding service.
Airlock IAM uses systemd to start and stop IAM and IAM services, see also Starting and stopping IAM. Because of this, file descriptor limits for services have to be configured for systemd and not by changing ulimit values.
The IAM CLI allows adjusting the file descriptor limit per service using the option -l or --limit-file-descriptors as in the following example:
iam systemd -i auth -l 8192
This creates a systemd service file with a file descriptor limit of 8192 for the auth service.
Additional considerations on expected throughput and scaling
- If the recommended configuration optimizations above are applied (such as disabling audit-log signing, creating the suggested database indices, and sizing the database connection pool appropriately) and if password hashing uses Argon2id or scrypt, one password check typically takes approximately 200 ms. This corresponds to an expected throughput of about 5 password checks per CPU core per second.
- Airlock IAM scales well with additional physical CPU cores because password-hashing operations can run in parallel. However, adding more logical CPU threads (hyper-threading) typically provides little or no throughput improvement, since password-hashing algorithms are compute- and sometimes memory-bound. Hyper-threading is mainly beneficial when CPU cores spend significant time waiting on I/O operations, such as database queries, disk access, network communication, or external services, which is not the case during password hashing.
- To achieve higher overall throughput, you can load-balance multiple Airlock IAM instances behind an Airlock Gateway. Note that this only improves performance if each load-balanced instance runs on its own dedicated hardware. If multiple IAM instances run on the same node and share CPU resources, no throughput gain will be achieved.
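The throughput figures above can be turned into a rough capacity estimate. The following sketch assumes the approximate 200 ms per password check stated above and counts physical cores only, since hyper-threading is not expected to help. The function names are illustrative, and real throughput should be confirmed with load tests.

```python
import math


def password_checks_per_second(physical_cores: int, ms_per_check: int = 200) -> float:
    """Estimated password checks per second across all physical cores."""
    return physical_cores * 1000 / ms_per_check


def physical_cores_needed(target_checks_per_second: int, ms_per_check: int = 200) -> int:
    """Estimated physical cores required to sustain the target rate."""
    return math.ceil(target_checks_per_second * ms_per_check / 1000)


print(password_checks_per_second(1))   # one core
print(password_checks_per_second(8))   # eight physical cores
print(physical_cores_needed(100))      # cores for 100 checks per second
```

If the required number of cores exceeds what one machine provides, the estimate applies to the sum of physical cores across load-balanced instances, provided each instance runs on its own dedicated hardware.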