Selection and parametrization of hash functions

Affects product

  • Airlock IAM

Configuration

Choosing the right hash algorithms and tuning them to fit the entropy of the data, the security needs, and the performance requirements is an important part of configuring an Airlock IAM system.

Secrets and types of protection requirements

Secrets that have to be checked by a server should always be stored as salted hash values and never as plaintext. There are other approaches like storing secrets or hashes in HSMs or distributed hash databases, but they are not the subject of this section.
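The effect of salting can be illustrated with a minimal Python sketch. The helper names `hash_secret` and `verify_secret` are hypothetical, and SHA-256 is used only to keep the example short; which hash function fits which kind of secret is discussed in the following sections.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple


def hash_secret(secret: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)  # 128-bit random salt, stored next to the digest
    digest = hashlib.sha256(salt + secret.encode("utf-8")).digest()
    return salt, digest


def verify_secret(candidate: str, salt: bytes, stored_digest: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_secret(candidate, salt)
    return hmac.compare_digest(digest, stored_digest)


# The same secret hashed twice yields two different stored values,
# because each call draws its own salt.
salt_a, digest_a = hash_secret("correct horse")
salt_b, digest_b = hash_secret("correct horse")
print(digest_a != digest_b)
print(verify_secret("correct horse", salt_a, digest_a))
```

Because every record has its own salt, an attacker cannot use one precomputed table against all stolen hashes, and identical secrets do not produce identical stored values.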

Hashing of secrets serves the following goals:

  1. In recent years, numerous databases containing hashed password values have been stolen. In some cases, the attackers inverted all hashes and published a database of plaintext secrets. A good hashing algorithm acts as a last line of defense to hinder access to the underlying secrets.
  2. Authorized humans who have to deal with the stored values, e.g., administrators who operate directly on DB records, should not see the passwords in plaintext so they cannot remember them easily.

The second goal is fulfilled by any hash function. The rest of this section therefore concentrates on the first goal.

Hash functions

A hash function maps strings of arbitrary size to strings of a fixed size. We are interested in so-called cryptographic hash functions, which are designed to be one-way. This means that they are infeasible to invert: for a secure function, no known method of finding an input that produces a given output is substantially faster than trying candidate inputs one by one. Ideally, the function behaves like a random function, which implies that collisions are also difficult to find.
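As a small illustration of the fixed-size property, the following Python sketch hashes inputs of very different lengths with SHA-256 from the standard library. The variable names are chosen for this example only.

```python
import hashlib

# Inputs of very different sizes ...
digest_short = hashlib.sha256(b"a").hexdigest()
digest_long = hashlib.sha256(b"a" * 1_000_000).hexdigest()

# ... and an input that differs from the first one in a single character.
digest_near = hashlib.sha256(b"b").hexdigest()

# Both digests have the same length: 64 hex characters, i.e. 256 bits.
print(len(digest_short), len(digest_long))

# A small change in the input gives a completely different digest.
print(digest_short)
print(digest_near)
```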

There are three classes of these functions:

  • Broken: An attacker can deliberately construct two different input values that lead to the very same hash value (hash collision). MD5 belongs to that class.
  • Fast: Functions that are constructed for efficiency in time and memory. They are not broken in the sense that it would be possible to deliberately construct a hash collision. SHA-256 is an example.
  • Costly: These functions are designed to require significant CPU time and memory for each computation. Argon2id and scrypt are representatives; both were specifically designed for password hashing.
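The difference between the fast and the costly class can be made tangible with a rough timing sketch in Python. The scrypt parameters below (`n=2**14`, `r=8`, `p=1`) are illustrative assumptions chosen so that the example runs quickly; they are not a parameter recommendation, and the measured times depend entirely on the machine.

```python
import hashlib
import time
from typing import Callable

password = b"example-password"
salt = b"fixed-salt-for-demo"  # demo only; real systems use a random salt per secret


def time_call(fn: Callable[[], bytes]) -> float:
    """Return the wall-clock seconds a single call to fn takes."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


# Fast class: one SHA-256 computation.
fast_seconds = time_call(lambda: hashlib.sha256(salt + password).digest())

# Costly class: one scrypt computation (needs roughly 16 MiB with these parameters).
costly_seconds = time_call(
    lambda: hashlib.scrypt(
        password, salt=salt, n=2**14, r=8, p=1,
        maxmem=64 * 1024 * 1024, dklen=32,
    )
)

print(f"SHA-256: {fast_seconds:.6f} s, scrypt: {costly_seconds:.6f} s")
```

An attacker who tries many candidate inputs pays this per-computation cost for every single guess, which is why the costly class slows down large-scale guessing while a single legitimate login remains acceptable.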

The entropy of the secret

If the secret to be protected has enough entropy that enumerating all candidate values is infeasible, the sheer number of possibilities protects against inversion attacks. This allows for efficient, automated handling of technical keys, such as those used for OAuth or OIDC. Fast hash algorithms are ideal for this scenario.

No hashing method will protect a secret with too little entropy against an inversion attack. An attacker can simply try every possible input. In other words, there is no protection for a poorly chosen password.
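The arithmetic behind this statement can be sketched in a few lines of Python. The guess rates used here (10 guesses per second against a costly hash, 10^12 guesses per second against a fast hash) are assumptions picked purely for illustration, not measurements.

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a secret drawn uniformly at random."""
    return length * math.log2(alphabet_size)


def seconds_to_exhaust(bits: float, guesses_per_second: float) -> float:
    """Time needed to try every possible value at the given guess rate."""
    return 2 ** bits / guesses_per_second


pin_bits = entropy_bits(10, 4)     # 4-digit PIN: 10^4 possible values
token_bits = entropy_bits(2, 256)  # random 256-bit technical key

# Low entropy: exhausted in minutes even if each guess is made expensive.
print(pin_bits, seconds_to_exhaust(pin_bits, 10))

# High entropy: out of reach even with a fast hash and a very high guess rate.
print(token_bits, seconds_to_exhaust(token_bits, 1e12))
```

Under these assumptions, the PIN falls in about 1000 seconds despite the costly hash, while the 256-bit key remains infeasible to enumerate even with a fast hash.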

Secrets with low to intermediate entropy typically include passwords and similar credentials. Inversion attacks, where an attacker inverts large numbers of hashes and publishes the results, are especially dangerous because users often reuse the same password across multiple accounts. To mitigate these attacks, a costly hash function should be used. It must require as much processing effort as is acceptable for both the user and the service provider, while making large-scale attacks as costly as possible. Airlock IAM supports Argon2id and scrypt and follows the parameter recommendations of the OWASP Password Storage Cheat Sheet.

Recommendations

Recommended hashing functions for specific scenarios

  • Argon2id or scrypt for passwords, IAK letters, and secret questions
  • SHA-256 for OAuth, OIDC, and matrix cards
Notice

Matrix cards typically have low entropy. Inverting a specific value is feasible regardless of the hashing algorithm. For this reason, a service provider has to lock all matrix cards if the hashed values are stolen. Since matrix cards are not reused for other services, there is no benefit in inverting and publishing the whole database of secrets.

Migrating to another hash function

The Legacy Hash Functions feature supports migrations from one password-hashing algorithm to another. When an end-user logs in, IAM first verifies the password using the currently configured hash function (in the Hash function plugin). If that check fails, IAM tries each algorithm listed under Legacy Hash Functions until one succeeds. The password is then rehashed using the current hash function, effectively migrating it to the stronger configuration.
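The login flow described above can be sketched in Python as follows. This is a simplified illustration, not Airlock IAM code: the function names, the choice of SHA-256 as the legacy function and scrypt as the current one, and the scrypt parameters are all assumptions made for this example.

```python
import hashlib
import hmac
from typing import Callable, List, Tuple

HashFn = Callable[[bytes, bytes], bytes]  # (password, salt) -> digest


def legacy_sha256(password: bytes, salt: bytes) -> bytes:
    """Hypothetical legacy hash function: salted SHA-256."""
    return hashlib.sha256(salt + password).digest()


def current_scrypt(password: bytes, salt: bytes) -> bytes:
    """Hypothetical current hash function: scrypt with illustrative parameters."""
    return hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1,
                          maxmem=64 * 1024 * 1024, dklen=32)


def login(password: bytes, salt: bytes, stored: bytes,
          current: HashFn, legacy: List[HashFn]) -> Tuple[bool, bytes]:
    """Return (authenticated, digest_to_store)."""
    # 1. Try the currently configured hash function first.
    if hmac.compare_digest(current(password, salt), stored):
        return True, stored
    # 2. Otherwise try each legacy function until one succeeds.
    for legacy_fn in legacy:
        if hmac.compare_digest(legacy_fn(password, salt), stored):
            # 3. Rehash with the current function. For brevity the salt is
            #    reused here; a real system would typically draw a new one.
            return True, current(password, salt)
    return False, stored


salt = b"per-user-salt"
old_record = legacy_sha256(b"s3cret", salt)
ok, new_record = login(b"s3cret", salt, old_record, current_scrypt, [legacy_sha256])
print(ok, new_record != old_record)
```

After the first successful login, the stored value is the scrypt digest, so later logins succeed in step 1 and the legacy functions are no longer consulted for that user.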

 
Functional limitation

Note that it is not possible to use the Legacy Hash Functions feature to migrate existing Argon2id password hashes to Argon2id hashes with other/stronger parameters (e.g., bigger memory size, more iterations, or more lanes). This is because Argon2id hashes are stored in PHC format, which embeds all parameters needed for verification. Verification with the currently configured Argon2id hash function therefore uses the parameters stored with each hash and succeeds regardless of their strength, so the fallback to the legacy functions and the subsequent rehash are never triggered.

However, end-users can still log in with their existing (weaker) Argon2id-hashed password.
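To make the role of the PHC format concrete, the following Python sketch parses the parameters embedded in an Argon2id PHC string. The example string uses placeholder salt and hash values, and the helpers `parse_argon2_phc` and `is_weaker` are hypothetical names for this illustration; they do not describe Airlock IAM internals.

```python
from typing import Dict


def parse_argon2_phc(phc: str) -> Dict[str, int]:
    """Extract version and cost parameters from an Argon2id PHC string.

    Layout: $argon2id$v=<version>$m=<KiB>,t=<iterations>,p=<lanes>$<salt>$<hash>
    """
    _, variant, version, params, _salt, _hash = phc.split("$")
    if variant != "argon2id":
        raise ValueError(f"unexpected variant: {variant}")
    result = {"v": int(version.split("=")[1])}
    for item in params.split(","):
        key, value = item.split("=")
        result[key] = int(value)
    return result


def is_weaker(stored: Dict[str, int], target: Dict[str, int]) -> bool:
    """True if any stored cost parameter is below the target configuration."""
    return any(stored[key] < target[key] for key in ("m", "t", "p"))


# Placeholder salt and hash; only the parameter section matters here.
example = "$argon2id$v=19$m=19456,t=2,p=1$c29tZXNhbHQ$AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
stored_params = parse_argon2_phc(example)
target_params = {"m": 65536, "t": 3, "p": 1}  # assumed stronger configuration

print(stored_params)
print(is_weaker(stored_params, target_params))
```

Since a verifier reads `m`, `t`, and `p` from the stored string rather than from the current configuration, the check succeeds with the old parameters. Detecting weaker parameters would require an explicit comparison such as the hypothetical `is_weaker` above.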